Associate Professor
This course introduces learners to the basics of text mining and text manipulation. It begins with how Python handles text, how text is structured for both machines and humans, and an overview of the NLTK framework for manipulating text. The second week focuses on common manipulation needs, including regular expressions (searching for text), cleaning text, and preparing text for use by machine learning processes. The third week applies basic natural language processing methods to text and demonstrates how text classification is accomplished. The final week explores more advanced methods for detecting the topics in documents and grouping them by similarity (topic modeling).
This course should be taken after Introduction to Data Science in Python; Applied Plotting, Charting & Data Representation in Python; and Applied Machine Learning in Python.
Applied Text Mining in Python introduces learners to techniques for extracting meaning from text data. You will work with text representations, regular expressions, natural language processing methods, and text classification. The course emphasizes practical skills for cleaning, analyzing, and modeling text using Python libraries, preparing you to apply text mining techniques to real-world problems.
This abbreviated syllabus description was created with the help of AI tools and reviewed by staff. The full syllabus is available to those who enroll in the course.
Module 1: Working with Text in Python
Module 2: Basic Natural Language Processing
Module 3: Classification of Text
Module 4: Topic Modeling
The assignments in each module account for 25% of your final grade: one quiz (5%) and one programming assignment (20%).
Intermediate Level
Some related experience required