Fitting Statistical Models to Data with Python

What You'll Learn

Deepen your understanding of statistical inference techniques by mastering the art of fitting statistical models to data.
Connect research questions to data analysis methods, with attention to modeling objectives such as inference about relationships between variables and prediction.
Explore various statistical modeling techniques like linear regression, logistic regression, and Bayesian inference using real data sets.
Work through hands-on case studies in Python with libraries like Statsmodels, Pandas, and Seaborn in the Jupyter Notebook environment.

4 Modules

16 Hours

4 hrs per module (approx.)

About Fitting Statistical Models to Data with Python

In this course, we will expand our exploration of statistical inference techniques by focusing on the science and art of fitting statistical models to data. We will build on the concepts presented in the Statistical Inference course (Course 2) to emphasize the importance of connecting research questions to our data analysis methods. We will also focus on various modeling objectives, including making inference about relationships between variables and generating predictions for future observations.

This course will introduce and explore various statistical modeling techniques, including linear regression, logistic regression, generalized linear models, hierarchical and mixed effects (or multilevel) models, and Bayesian inference techniques. All techniques will be illustrated using a variety of real data sets, and the course will emphasize different modeling approaches for different types of data sets, depending on the study design underlying the data (referring back to Course 1, Understanding and Visualizing Data with Python).

During lab-based sessions, learners will work through tutorials focusing on specific case studies that help solidify the week’s statistical concepts. These sessions include further deep dives into Python libraries such as Statsmodels, Pandas, and Seaborn. This course uses the Jupyter Notebook environment within Coursera.

What You'll Earn

Certificate of Completion: Certificates of completion acknowledge knowledge acquired upon completion of a non-credit course or program.

Experience Type: 100% Online
Format: Self-Paced
Series: Statistics with Python
Subject: Data Science

Education
Platform: Coursera

Welcome Message

Fitting Statistical Models to Data with Python, part of the Statistics with Python series, builds on statistical inference foundations to help you connect research questions with appropriate modeling techniques. You will explore linear and logistic regression, multilevel models, and Bayesian approaches using real-world datasets. Hands-on labs and Python-based assessments strengthen both conceptual understanding and applied data analysis skills.

This abbreviated syllabus description was created with the help of AI tools and reviewed by staff. The full syllabus is available to those who enroll in the course.

Course Schedule

Module 1: Overview & Considerations for Statistical Modeling

Video: Welcome to the Course!
Reading: Course Syllabus
Video: Fitting Statistical Models to Data with Python Guidelines
Reading: Meet the Course Team!
Reading: Help Us Learn More About You!
Reading: About Our Datasets
Video: What Do We Mean by Fitting Models to Data?
Video: Types of Variables in Statistical Modeling
Video: Different Study Designs Generate Different Types of Data: Implications for Modeling
Video: Objectives of Model Fitting: Inference vs. Prediction
Reading: Mixed effects models: Is it time to go Bayesian by default?
Video: Plotting Predictions and Prediction Uncertainty
Reading: Python Statistics Landscape
Video: Python Statistics Landscape
Ungraded Lab: Python Libraries
Ungraded Lab: Getting Started with Modeling in Python

Module 2: Fitting Models to Independent Data

Video: Linear Regression Introduction
Video: Linear Regression Inference
Reading: Linear Regression Models: Notation, Parameters, Estimation Methods
Video: Interview: Causation vs Correlation
Reading: Try It Out: Continuous Data Scatterplot App
Reading: Importance of Data Visualization: The Datasaurus Dozen
Video: Logistic Regression Introduction
Video: Logistic Regression Inference
Reading: Logistic Regression Models: Notation, Parameters, Estimation Methods
Ungraded Lab: NHANES Case Study: Linear and Logistic Regression
Ungraded Lab: Practice notebook for regression analysis with NHANES
Ungraded Lab: Week 2 Python Assessment Notebook

Module 3: Fitting Models to Dependent Data

Video: What are Multilevel Models and Why Do We Fit Them?
Reading: Visualizing Multilevel Models
Video: Multilevel Linear Regression Models
Reading: Likelihood Ratio Tests for Fixed Effects and Variance Components
Video: Multilevel Logistic Regression Models
Reading: Link to the Cal Poly App
Video: Practice with Multilevel Modeling: The Cal Poly App
Video: What are Marginal Models and Why Do We Fit Them?
Video: Marginal Linear Regression Models
Video: Marginal Logistic Regression
Ungraded Lab: Fitting Multilevel and Marginal Models to Autism Data in Python
Ungraded Lab: NHANES Case Study: Marginal and Multilevel Regression
Ungraded Lab: Practice: Marginal and Multilevel Regression
Ungraded Lab: Week 3 Python Assessment

Module 4: Special Topics

Reading: Other Types of Dependent Variables
Discussion Prompt: Your Turn: Other Types of Dependent Variables
Video: Should We Use Survey Weights When Fitting Models?
Video: Introduction to Bayesian
Video: Bayesian Approaches to Statistics and Modeling
Video: Bayesian Approaches Case Study: Part I
Video: Bayesian Approaches Case Study: Part II
Video: Bayesian Approaches Case Study: Part III
Reading: Optional: A Visual Introduction to Machine Learning
Ungraded Lab: Bayesian in Python
Reading: Course Feedback
Reading: Keep Learning with Michigan Online

Grading Policy

Grades are based on conceptual quizzes and Python assessments. Weights range from 10% to 20% per assessment, with Python assessments comprising the largest portion of the final grade.

Brenda Gunderson: Lecturer IV and Research Fellow
Kerby Shedden: Professor
Brady West: Research Associate Professor

Course content developed by U-M faculty and managed by the university. Faculty titles and affiliations are updated periodically.

Level: Intermediate
Prerequisites: Completion of the first two courses in this specialization; high school-level algebra

Individuals

This experience is available to individual learners on the following platforms:

U-M Community

Free access is only available to current U-M students, alumni, faculty, and staff.

Organizations

Special pricing and tailored programming bundles available for organizational partners.

Rating: 4.4 (570 ratings from Coursera)

Most Recent Reviews

October 31, 2025

Simply excellent

April 18, 2025

I think the course could have been better. There was little Python here and a lot more videos about theory. I would have preferred more practice with one topic, or getting better at regression, to jumping around other topics that I am extremely unlikely to ever encounter.

September 15, 2024

The first half of the course is fantastic, but the second seems to go way too fast, not elaborating enough on the concepts and not offering enough practice.

May 23, 2024

Good course but lack of visuals and examples for illustrating complex concepts.

April 28, 2024

It was irrelevant and contained unnecessary content. Why are we drowning in theoretical statistical topics instead of focusing on Python? Thus far, the course has been more about statistics than actually working with Python! I am here to address my statistical needs using Python, not to become an expert in statistics. Unfortunately, this course seems to be doing just the opposite.

March 25, 2023

Week 3 and 4. Really painful. truly...truly...painful..

January 13, 2023

Like the other courses in this specialization, way too much theory is covered, and the easy quizzes and labs give the learner a false confidence that he/she's mastering statistics. Instead, you grasp some of the theoretical knowledge, but none of the underlying math and therefore none of the intuition. The same is true of Python: all that's required is to hit the run cell button, and no actual coding is required. The lecturers are super enthusiastic though, and the final week was fantastic. Mark Kurzeja should have his own course on probability and Bayesian statistics. Week 3 of every course has been super dense, and I think T Brady West should have his own course on sample design and weights, because right now his lectures drag down the overall quality of the course. It's all slides and text; math is brushed over and not enough of it is applied. Honestly, if you wanted to really get into Multilevel & Marginal Models you'd need 4 weeks. My advice: take the AP Statistics course on Khan Academy, watch some StatQuest on YouTube, and perhaps take the intro to statistics offered by Stanford University. You can also take this course/specialization and just skip week 3. You can probably pass the tests anyway. Here's my rating by week. Week 1: 4*, Week 2: 4*, Week 3: 1*, Week 4: 5*

December 19, 2022

I found the course to be good. I don't think it is excellent. Lectures can be a bit too long and take some time to get to the point. Instructors are "ok": a lot of talking from most of them, not enough math examples. Labs are pretty good but... I guess I can say that there are 5 star courses on this platform and this is not one of them. It's a solid 4. Still recommended.

March 21, 2022

Too much material, way too few practical examples in Python.
