
Jeffrey Humpherys
Kummer Professor of Data Science
Department of Mathematics and Statistics
Current Course
Fall 2025: Theoretical Foundations of Machine Learning
After a brief introduction to Python computing (with PyTorch) and optimization methods, we develop the theory of probabilistic reasoning and an information-theoretic universal theory of logical inference that underpins nearly every method in data science. We show how this framework can be used to formulate methods in classification, clustering, and dimensionality reduction (embedding methods). We then explore ensemble theory and show why popular methods such as random forests and extreme gradient boosting are so effective. This is a mathematically rigorous and computationally intensive course.
Prerequisites:
- Math 2222: Calculus III (equivalently Multivariable Calculus)
- Math 3108: Linear Algebra I
- Math 4211: Advanced Calculus II
- Any programming course (e.g., CS 1570), but students will be expected to learn the essentials of the Python programming language before starting the class (see * below).
What’s Next After This?
In the Spring, Foundations of Machine Learning II will cover the mathematical and computational aspects of deep learning. In particular, we will explore the following topics: dense neural networks, convolutional neural networks, recurrent neural networks, attention models, large language models, and reinforcement learning.
Office Hours
TBD
Resources
TBD