Applied Information Extraction in Python

What You'll Learn

  • Develop skills to process and interpret information presented in free-text data.
  • Identify the major classes of named entity recognition (NER) in different domains such as business, politics, and healthcare.
  • Implement, with guidance, state-of-the-art machine learning techniques for NER.
  • Compare, contrast, and select between multiple machine learning and deep learning approaches for NER.
4 Modules
24 Hours
6 hrs per module (approx.)

About Applied Information Extraction in Python

In “Applied Information Extraction in Python,” you will learn how to extract useful information from free-text data: unstructured string data that people produce when they type. Information commonly found in free-text data includes names of people or organizations, location information such as cities and zip codes, and other elements like stock prices or clinical diagnoses. Free-text data is found everywhere, from magazine articles to social media posts, and can be complex to analyze.

In this course, you’ll use applied machine learning and text-mining techniques to analyze free-text data. You will learn how to identify named entities and tag them with the appropriate entity types, using real-world data from business, politics, and healthcare. You’ll develop multiple approaches to recognize and extract named entities and attributes of interest from free-text data, ranging from regular expressions to neural network models. Finally, you’ll explore Transformer-based large language models (LLMs), such as the models behind ChatGPT, to extract information from large datasets.
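To give a feel for the simplest end of that range, here is a minimal sketch of regular-expression extraction using Python's standard `re` module. The sample sentence, the two patterns, and the `extract_fields` helper are illustrative assumptions, not course materials, and the patterns only cover US-style ZIP codes and dollar amounts.

```python
import re

# Hypothetical sample text; not taken from the course materials.
text = "Acme Corp. (Ann Arbor, MI 48109) closed at $123.45, up from $120.10."

# US ZIP code: five digits, optionally followed by a hyphen and four more digits.
ZIP_PATTERN = re.compile(r"\b\d{5}(?:-\d{4})?\b")

# Dollar amount: "$", digits, optional thousands groups, optional cents.
PRICE_PATTERN = re.compile(r"\$\d+(?:,\d{3})*(?:\.\d{2})?")


def extract_fields(text):
    """Return every ZIP code and dollar amount found in the text."""
    return {
        "zip_codes": ZIP_PATTERN.findall(text),
        "prices": PRICE_PATTERN.findall(text),
    }


print(extract_fields(text))
# {'zip_codes': ['48109'], 'prices': ['$123.45', '$120.10']}
```

Patterns like these work well for fields with a fixed format, but they cannot tell a ZIP code from any other five-digit number, which is one reason to move on to learned models.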

This is the final course in “More Applied Data Science with Python,” a four-course series focused on helping you apply advanced data science techniques using Python. It is recommended that all learners complete the following courses from the Applied Data Science with Python Specialization: Introduction to Data Science in Python, Applied Machine Learning in Python, and Applied Text Mining in Python.

Skills You'll Gain

  • Data Manipulation
  • Information Extraction
  • Machine Learning
  • Python For Data Analysis
  • Python (Programming Language)
  • Text Extraction
  • Text Processing

What You'll Earn

Certificate of Completion:
Certificates of completion acknowledge knowledge acquired upon completion of a non-credit course or program.
Experience Type: 100% Online
Format: Self-Paced
Subject:
  • Data Science
  • Technology
Platform: Coursera

Welcome Message

Welcome to Applied Information Extraction in Python, part of the More Applied Data Science with Python specialization. This course explores techniques for extracting structured information from text using rule-based methods, machine learning, neural networks, and transformer models. You will gain hands-on experience building information extraction pipelines across diverse application domains.


This abbreviated syllabus description was created with the help of AI tools and reviewed by staff. The full syllabus is available to those who enroll in the course.

Course Schedule

Module 1: Information Extraction

  • Video: Welcome to Information Extraction
  • Reading: MADSwPy Certificate Roadmap
  • Reading: Course Syllabus
  • Reading: Introduction to Jupyter Notebook
  • Discussion Prompt: Meet Other Learners
  • Reading: Help Us Learn About You
  • Video: What is Information Extraction?
  • Video: Information Extraction in Different Domains
  • Graded Assignment: Knowledge Check: Introduction to Information Extraction
  • Video: Extracting Formatted Information
  • Reading: Regular Expressions in Detail
  • Video: Lookup Based Extraction
  • Graded Assignment: Knowledge Check: Rule-Based Approaches to Information Extraction
  • Video: Demo: Using Regular Expressions & Examining Output
  • Ungraded Lab: Jupyter Notebook Practice on Basic NLP and Rule-Based Extraction
  • Video: Assignment 1 Introduction: Formatting & Normalizing Data with Regular Expressions
  • Graded: Build an Information Extraction Pipeline for Template/List-Based Fields
  • Graded: Module 1 Assignment
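
As a rough illustration of the lookup-based extraction covered in this module, the sketch below matches text against a small list of known names (often called a gazetteer). The `GAZETTEER` entries, the labels, and the `lookup_extract` helper are hypothetical and chosen only for this example; the course's own notebooks may take a different approach.

```python
import re

# Hypothetical lookup list (gazetteer); a real one would be far larger.
GAZETTEER = {
    "University of Michigan": "ORG",
    "Coursera": "ORG",
    "Ann Arbor": "LOC",
}


def lookup_extract(text, gazetteer=GAZETTEER):
    """Return (start, end, surface, label) for each gazetteer term found in text."""
    # Try longer terms first so a multi-word name wins over a shorter term it contains.
    terms = sorted(gazetteer, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    # Map lowercased surface forms back to the gazetteer's canonical spelling.
    canonical = {t.lower(): t for t in gazetteer}
    matches = []
    for m in pattern.finditer(text):
        term = canonical[m.group(0).lower()]
        matches.append((m.start(), m.end(), m.group(0), gazetteer[term]))
    return matches


sample = "The University of Michigan in Ann Arbor offers courses on Coursera."
for start, end, surface, label in lookup_extract(sample):
    print(start, end, surface, label)
```

Lookup is precise for names already on the list but finds nothing it has not seen before, so it is usually paired with pattern-based or learned extraction.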

Module 2: Named Entity Recognition (NER)

  • Video: What is Named Entity Recognition (NER)?
  • Graded Assignment: Knowledge Check: Named Entities and Named Entity Recognition
  • Reading: BIO Encoding for Named Entity Labels
  • Reading: BILOU Encoding for Named Entity Labels
  • Reading: Machine Learning Fundamentals: How Machines Learn to Label Named Entities
  • Video: NER as a Sequence Classification Task
  • Graded Assignment: Knowledge Check: Setting up NER as a Machine Learning Task
  • Reading: Markov Chain and Hidden Markov Models
  • Video: Fundamentals of Markov Chain Models
  • Video: Hidden Markov Models (HMMs)
  • Reading: Training Hidden Markov Models: How HMMs Learn to Assign Labels
  • Reading: The Math Behind HMMs: How Probabilities Power Sequence Labeling
  • Video: Conditional Random Fields (CRFs)
  • Graded Assignment: Knowledge Check: Hidden Markov Models (HMMs)
  • Video: Demo: CRF Model Training
  • Ungraded Lab: Jupyter Notebook Practice on Training CRFs
  • Video: Assignment 2 Introduction: Implementing a CRF Model
  • Graded: Build an Information Extraction Pipeline for CRF Based Extraction
  • Graded: Module 2 Assignment
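
The BIO and BILOU readings in this module deal with how entity spans are turned into per-token labels for a sequence model. The sketch below shows one way to do that in plain Python; the function names, the example sentence, and the `(start, end, label)` span format with an exclusive end index are assumptions made for illustration.

```python
def bio_encode(tokens, spans):
    """BIO tags: B- begins an entity, I- continues it, O is outside any entity."""
    tags = ["O"] * len(tokens)
    for start, end, label in spans:  # end index is exclusive
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


def bilou_encode(tokens, spans):
    """BILOU tags: adds L- for the last token and U- for single-token entities."""
    tags = ["O"] * len(tokens)
    for start, end, label in spans:  # end index is exclusive
        if end - start == 1:
            tags[start] = f"U-{label}"
        else:
            tags[start] = f"B-{label}"
            for i in range(start + 1, end - 1):
                tags[i] = f"I-{label}"
            tags[end - 1] = f"L-{label}"
    return tags


# Hypothetical example sentence with two organization spans.
tokens = ["Students", "at", "the", "University", "of", "Michigan", "use", "Coursera", "."]
spans = [(3, 6, "ORG"), (7, 8, "ORG")]

print(bio_encode(tokens, spans))
# ['O', 'O', 'O', 'B-ORG', 'I-ORG', 'I-ORG', 'O', 'B-ORG', 'O']
print(bilou_encode(tokens, spans))
# ['O', 'O', 'O', 'B-ORG', 'I-ORG', 'L-ORG', 'O', 'U-ORG', 'O']
```

Once every token carries a tag like this, NER becomes a sequence labeling problem that models such as HMMs and CRFs can be trained on.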

Module 3: Neural Network Models

  • Video: Introduction to Deep Learning
  • Reading: Understanding Deep Learning: A Shift From Rules to Representation
  • Graded Assignment: Knowledge Check: What Is Deep Learning?
  • Video: Neural Network Models
  • Reading: Activation Functions: How Deep Learning Models Make Decisions
  • Graded Assignment: Knowledge Check: What Are Neural Network Models?
  • Video: Deep Neural Network Models
  • Reading: Understanding Deep Neural Network Models: How Depth Enables Learning at Scale
  • Graded Assignment: Knowledge Check: Deeper Neural Networks
  • Reading: Building an Information Extraction Pipeline with BiLSTMs and CRFs
  • Ungraded Lab: Jupyter Notebook Practice on Training LSTMs
  • Video: Demo: Configuring the Bi-Directional LSTM
  • Video: Assignment 3 Introduction: Building an Information Extraction Pipeline with BiLSTMs and CRFs
  • Graded: Build an Information Extraction Pipeline using Deep Neural Networks
  • Graded: Module 3 Assignment

Module 4: Transformers Transform Information Extraction

  • Video: Language Models (LMs)
  • Video: Large Language Models (LLMs)
  • Graded Assignment: Knowledge Check: Language Models
  • Video: Transformers
  • Reading: Recent Advances in GPTs
  • Graded Assignment: Knowledge Check: What Are Transformers?
  • Reading: Building an Information Extraction Pipeline with Transformers and LLMs
  • Role Play: Design an Information Extraction System for Sports News Reports
  • Video: Assignment 4 Introduction: Build an Information Extraction Pipeline using Transformers
  • Video: Course Wrap-Up
  • Reading: Continue Your Journey and Earn a Master of Applied Data Science Degree Online
  • Reading: Course Post-Survey
  • Graded: Build an Information Extraction Pipeline using Transformers
  • Graded: Module 4 Assignment

Grading Policy

Learners must complete all graded assignments. Each module includes an assessment worth 5% of your final grade and a programming assignment worth 20% of your final grade.

Course content developed by U-M faculty and managed by the university. Faculty titles and affiliations are updated periodically.

Advanced Level

Learners will benefit from prior exposure to machine learning in Python and from completing the courses in this series in order.

Enrollment Options

Individuals

This experience is available to individual learners on the following platforms:

U-M Community

Students, faculty, staff, and alumni of the University of Michigan get free access.

Organizations

Special pricing and tailored programming bundles available for organizational partners.

What are Coursera and edX?

Michigan Online learning experiences may be hosted on one or more learning platforms. Platform features may vary, including payment models, social communities, and learner support.

Coursera

  • Hosts online courses, series, and Teach-Outs from Michigan Online
  • Enroll and preview courses anytime
  • May earn a non-credit certificate from Coursera

edX

  • Hosts online courses and series from Michigan Online
  • Many offer a free (limited) audit option
  • May earn a non-credit certificate from edX
