Data Mining in Python

What You'll Learn

  • Understand basic concepts, tasks, and procedures of data mining.
  • Formulate real-world information using basic data representations: itemsets, vectors, matrices, sequences, time series, and networks.
  • Use data mining algorithms to extract patterns and similarities from real-world datasets.
  • Calculate the importance of patterns and prepare for downstream machine-learning tasks.
4 Modules
56 Hours
14 hrs per module (approx.)

About Data Mining in Python

In “Data Mining in Python,” you will learn how to extract useful knowledge from large-scale datasets. This course introduces basic concepts and general tasks for data mining. You will explore a wide range of real-world datasets, including grocery store data, restaurant reviews, business operations, social media posts, and more.

You will learn how to formally describe real-world information with general data representations (e.g., itemsets, vectors, matrices, sequences, and more). You will then learn how to formulate data in the wild with one or more of these representations.

This course will teach you how to characterize and explain your data by looking for patterns and similarities, which are basic building blocks for advanced analysis and machine learning models.

This is the first course in “More Applied Data Science with Python,” a four-course series focused on helping you apply advanced data science techniques using Python. It is recommended that all learners complete the Applied Data Science with Python specialization prior to beginning this course.

Skills You'll Gain

  • Data Mining
  • Data Presentation
  • Machine Learning
  • Python For Data Analysis
  • Python (Programming Language)

What You'll Earn

Certificate of Completion:
Certificates of completion acknowledge knowledge acquired upon completion of a non-credit course or program.
Experience Type: 100% Online
Format: Self-Paced
Subject: Data Science, Technology
Platform: Coursera

Welcome Message

Data Mining in Python, part of the More Applied Data Science with Python series, introduces core concepts and techniques for discovering patterns in complex datasets. Learners work with itemsets, vectors, matrices, and sequences while applying similarity measures and algorithms through quizzes and programming assignments focused on real-world data representations.

This abbreviated syllabus description was created with the help of AI tools and reviewed by staff. The full syllabus is available to those who enroll in the course.

Course Schedule

Module 1: Basic Concepts of Data Mining

  • Video: Welcome to Data Mining in Python
  • Reading: MADSwPY Certificate Roadmap
  • Reading: Course Syllabus
  • Discussion Prompt: Meet Your Fellow Learners
  • Reading: Help Us Learn About You
  • Video: What is Data Mining
  • Reading: Introduction to the Basic Functionalities of Data Mining
  • Video: Data Mining Functionalities (Part 1)
  • Video: Data Mining Functionalities (Part 2)
  • Video: Data Mining Functionalities (Part 3)
  • Graded Assignment: Knowledge Check: Basic Functionalities of Data Mining
  • Reading: Introduction to Basic Data Representations
  • Video: Representing Itemsets, Vectors, and Matrices
  • Video: Representing Sequences
  • Graded Assignment: Knowledge Check: Basic Data Representations (Part 1)
  • Video: Representing Time-Series and Spatial/Temporal Data
  • Video: Representing Graph Data
  • Video: Representing Stream Data
  • Reading: Case Study: Representations of Real-World Text Data
  • Graded Assignment: Knowledge Check: Basic Data Representations (Part 2)
  • Reading: Introduction to Patterns and Similarities
  • Video: Data Mining Based on Patterns
  • Video: Data Mining Based on Similarities
  • Reading: Introduction to Module 1 Programming Assignment: Visualizing Different Data
  • Role Play: Dialogue Reflection
  • Reading: Module 1 Optional Readings & Resources

Module 2: Mining Itemset Data

  • Reading: Introduction to Itemsets Representation
  • Video: Frequent Itemsets
  • Video: Counting Strategies
  • Video: The Apriori Algorithm
  • Graded Assignment: Knowledge Check: Mining Frequent Itemsets
  • Reading: Introduction to Module 2 Programming Assignment: Dealing with Itemset Real-World Data
  • Video: From Patterns to Association Rules
  • Video: Measuring Correlations Using Lift
  • Reading: Additional Interestingness Measures
  • Graded Assignment: Knowledge Check: Evaluating Frequent Itemsets (Part 1)
  • Video: Mutual Information
  • Video: Limitation of Correlation Measures
  • Graded Assignment: Knowledge Check: Evaluating Frequent Itemsets (Part 2)
  • Reading: Introduction to Itemset Similarity
  • Video: The Jaccard Similarity
  • Graded Assignment: Knowledge Check: Similarity of Itemsets
  • Reading: Module 2 Optional Readings & Resources

Module 3: Mining Vector and Matrix Data

  • Video: From Itemsets to Vectors
  • Video: Vectors and Matrices
  • Video: The “Vector Space”
  • Graded Assignment: Knowledge Check: Vector Representation of Data
  • Reading: Introduction to Module 3 Programming Assignment: Dealing with Vector and Matrix Real-World Data
  • Video: Vector Similarity Functions and Dot Product
  • Video: Manhattan Distance and Euclidean Distance
  • Graded Assignment: Knowledge Check: Similarity of Vectors (Part 1)
  • Video: Cosine Similarity
  • Video: Pearson Correlation Coefficient
  • Video: Applications of Vector Similarity
  • Graded Assignment: Knowledge Check: Similarity of Vectors (Part 2)
  • Video: Eigenvectors
  • Video: Eigendecomposition
  • Graded Assignment: Knowledge Check: Patterns in Matrix Data (Part 1)
  • Video: Transforming the Coordinate System
  • Reading: Dimensionality Reduction
  • Graded Assignment: Knowledge Check: Patterns in Matrix Data (Part 2)
  • Reading: Module 3 Optional Readings & Resources

Module 4: Mining Sequences

  • Video: Representing Data as Sequences
  • Video: Subsequences
  • Video: Functionalities of Sequence Data
  • Graded Assignment: Knowledge Check: Sequence Representation of Data
  • Video: Frequent Sequential Patterns
  • Graded Assignment: Knowledge Check: Sequential Patterns (Part 1)
  • Reading: Sequential Patterns in Text Data
  • Video: Ngrams and Skipgrams
  • Graded Assignment: Knowledge Check: Sequential Patterns (Part 2)
  • Reading: Introduction to Module 4 Programming Assignment: Dealing with Sequences Real-World Data
  • Video: Sequence Similarity Basics
  • Video: Edit Distance (Part 1)
  • Video: Edit Distance (Part 2)
  • Video: Shingling: Transform Sequences into Itemsets
  • Graded Assignment: Knowledge Check: Sequence Similarity
  • Role Play: Dialogue Reflection
  • Video: Course Summary
  • Reading: Module 4 Optional Readings & Resources
Grading Policy

The course grade is based on four quizzes and four programming assignments. Each quiz is worth 5%, for 20% in total. The first programming assignment is worth 5%, and the remaining three are worth 25% each, for 80% in total.
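
As a rough illustration of how these weights combine, the sketch below computes a weighted course grade in Python. The weights come from the grading policy; the item names and the example scores are hypothetical, and the platform's actual calculation may differ (for example, in rounding).

```python
# Weights in percentage points, taken from the grading policy.
# The item names are hypothetical labels chosen for this sketch.
WEIGHTS = {
    "quiz_1": 5,
    "quiz_2": 5,
    "quiz_3": 5,
    "quiz_4": 5,
    "assignment_1": 5,
    "assignment_2": 25,
    "assignment_3": 25,
    "assignment_4": 25,
}


def course_grade(scores):
    """Return the weighted course grade (0-100) from per-item scores (0-100)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical scores: 80 on every quiz, 100 on the first assignment,
# and 90 on each of the remaining three assignments.
example_scores = {
    "quiz_1": 80,
    "quiz_2": 80,
    "quiz_3": 80,
    "quiz_4": 80,
    "assignment_1": 100,
    "assignment_2": 90,
    "assignment_3": 90,
    "assignment_4": 90,
}

print(course_grade(example_scores))  # 88.5
```

Because the last three programming assignments carry 75% of the grade between them, they dominate the result in this example.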

Course content developed by U-M faculty and managed by the university. Faculty titles and affiliations are updated periodically.

Advanced Level

A basic understanding of linear algebra is recommended, as is completing the courses of the “More Applied Data Science with Python” series in order.

Enrollment Options

Individuals

This experience is available to individual learners on Coursera.

U-M Community

Students, faculty, staff, and alumni of the University of Michigan get free access.

Organizations

Special pricing and tailored programming bundles are available for organizational partners.

What are Coursera and edX?

Michigan Online learning experiences may be hosted on one or more learning platforms. Platform features may vary, including payment models, social communities, and learner support.

Coursera

  • Hosts online courses, series, and Teach-Outs from Michigan Online
  • Enroll and preview courses anytime
  • May earn a non-credit certificate from Coursera

edX

  • Hosts online courses and series from Michigan Online
  • Many offer a free (limited) audit option
  • May earn a non-credit certificate from edX

For more information, visit the “What are Coursera and edX?” FAQ section.

Reviews and Ratings

4.3

3 Ratings from Coursera
