Mathematics for Machine Learning
PART I — Mathematical Foundations
Module 1 — Introduction
Class 1 – Course overview and motivation
- Machine learning as function approximation and data fitting
- Training, prediction, and model error
- Why geometry, probability, and optimization matter
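
A minimal sketch of the data-fitting view of learning from Class 1, assuming NumPy; the data and the linear model here are invented purely for illustration:

    import numpy as np

    rng = np.random.default_rng(0)
    x = np.linspace(0.0, 1.0, 50)
    y = 2.0 * x + 1.0 + 0.1 * rng.standard_normal(50)   # noisy samples of f(x) = 2x + 1

    X = np.column_stack([x, np.ones_like(x)])            # design matrix [x, 1]
    w, *_ = np.linalg.lstsq(X, y, rcond=None)            # fit by least squares
    train_mse = np.mean((X @ w - y) ** 2)                # model error on the training data
    print("fitted slope/intercept:", w, "training MSE:", train_mse)
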
Module 2 — Linear Algebra for Data
Class 2 – Vectors and data representation in high dimensions
Class 3 – Linear maps as feature transformations
Class 4 – Rank, conditioning, and stability of models
Class 5 – Subspaces as hypothesis spaces
Class 6 – Basis change and coordinate systems
Class 7 – Null spaces and information loss
Class 8 – Geometry of linear transformations
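
A sketch for Class 4 (rank, conditioning, and stability), assuming NumPy; the matrix and the perturbation are invented for illustration:

    import numpy as np

    A = np.array([[1.0, 1.0],
                  [1.0, 1.0001]])                  # nearly rank-deficient
    b = np.array([2.0, 2.0001])

    print("rank:", np.linalg.matrix_rank(A))       # 2, but only barely
    print("condition number:", np.linalg.cond(A))  # large => unstable solutions

    x = np.linalg.solve(A, b)
    x_perturbed = np.linalg.solve(A, b + np.array([0.0, 1e-4]))
    print("solution shift from a 1e-4 data change:", x_perturbed - x)
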
Module 3 — Analytic Geometry of Data
Class 9 – Norms, distances, and similarity measures
Class 10 – Inner products and cosine similarity
Class 11 – Orthogonality and decorrelation
Class 12 – Projections and least-squares fitting
Class 13 – Geometry of data clouds
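
A sketch for Class 12 (projections and least-squares fitting), assuming NumPy; the subspace and target vector are invented for illustration:

    import numpy as np

    A = np.array([[1.0, 0.0],
                  [1.0, 1.0],
                  [1.0, 2.0]])                     # columns span a 2-D subspace of R^3
    b = np.array([1.0, 2.0, 2.0])

    P = A @ np.linalg.inv(A.T @ A) @ A.T           # orthogonal projection onto col(A)
    b_hat = P @ b                                  # closest point to b in the subspace
    print("projection:", b_hat)
    print("residual orthogonal to col(A):", A.T @ (b - b_hat))  # ~ zeros
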
Module 4 — Matrix Decompositions
Class 14 – Linear operators and invariant directions
Class 15 – Eigenvalues and eigenvectors
Class 16 – Symmetric matrices and quadratic forms
Class 17 – Singular Value Decomposition (SVD)
Class 18 – SVD for optimal low-rank data representation
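
A sketch for Class 18 (SVD for optimal low-rank representation), assuming NumPy; the data matrix and the rank k are invented for illustration:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.standard_normal((100, 20))

    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    k = 5
    X_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]    # best rank-k approximation

    # Eckart-Young: the spectral-norm error equals the first discarded singular value.
    print(np.linalg.norm(X - X_k, 2), "=", s[k])
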
Module 5 — Vector Calculus for Learning
Class 19 – Multivariate functions and loss surfaces
Class 20 – Gradients as directions of steepest descent
Class 21 – Hessians, curvature, and conditioning
Class 22 – Gradient-based optimization
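
A sketch for Classes 20–22 (gradients, curvature, and gradient descent), assuming NumPy; the quadratic loss, step rule, and iteration count are invented for illustration:

    import numpy as np

    H = np.array([[3.0, 0.0],
                  [0.0, 1.0]])                     # Hessian of f(w) = 0.5 * w.T @ H @ w
    grad = lambda w: H @ w                         # gradient of the quadratic

    w = np.array([4.0, 4.0])
    step = 1.0 / np.linalg.eigvalsh(H).max()       # step size bounded by the curvature
    for _ in range(100):
        w = w - step * grad(w)                     # steepest-descent update
    print("iterate near the minimizer at the origin:", w)
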
Module 6 — Probability for Data
Class 23 – Random variables as data generators
Class 24 – Mean, variance, and covariance
Class 25 – Multivariate Gaussian distributions
Class 26 – Probability as a model of uncertainty
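
A sketch for Classes 24–25 (moments and the multivariate Gaussian), assuming NumPy; the mean, covariance, and sample size are invented for illustration:

    import numpy as np

    rng = np.random.default_rng(0)
    mu = np.array([1.0, -1.0])
    Sigma = np.array([[2.0, 0.8],
                      [0.8, 1.0]])                 # symmetric positive definite

    X = rng.multivariate_normal(mu, Sigma, size=10_000)
    print("empirical mean:", X.mean(axis=0))                   # close to mu
    print("empirical covariance:\n", np.cov(X, rowvar=False))  # close to Sigma
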
PART II — Core Machine Learning
Module 7 — When Models Meet Data
Class 27 – Training, testing, and generalization
Class 28 – Loss functions and risk minimization
Class 29 – Overfitting, the bias–variance trade-off, and regularization
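
A sketch for Classes 27 and 29 (generalization and overfitting), assuming NumPy; the synthetic data, the split, and the polynomial degrees are invented for illustration. A high-degree fit typically drives training error down while test error grows:

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.uniform(-1, 1, 30)
    y = np.sin(np.pi * x) + 0.2 * rng.standard_normal(30)
    x_tr, y_tr, x_te, y_te = x[:20], y[:20], x[20:], y[20:]

    for degree in (1, 3, 12):
        c = np.polyfit(x_tr, y_tr, degree)         # fit on training data only
        mse = lambda xs, ys: np.mean((np.polyval(c, xs) - ys) ** 2)
        print(degree, "train MSE:", mse(x_tr, y_tr), "test MSE:", mse(x_te, y_te))
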
Module 8 — Linear Regression
Class 30 – Linear regression as projection
Class 31 – Least-squares and normal equations
Class 32 – Ridge regression and stability
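
A sketch for Classes 31–32 (normal equations and ridge regression), assuming NumPy; the data, the near-collinearity, and the regularization strength are invented for illustration:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.standard_normal((50, 3))
    X[:, 2] = X[:, 1] + 1e-6 * rng.standard_normal(50)   # nearly collinear columns
    y = X @ np.array([1.0, 2.0, 0.0]) + 0.1 * rng.standard_normal(50)

    lam = 0.1
    w_ols = np.linalg.lstsq(X, y, rcond=None)[0]          # plain least squares
    w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)
    print("OLS weights:", w_ols)
    print("ridge weights (shrunk, more stable):", w_ridge)
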
Module 9 — Principal Component Analysis
Class 33 – Variance, covariance, and principal directions
Class 34 – Eigenvectors as principal components
Class 35 – PCA via SVD
Class 36 – Data compression and visualization
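
A sketch for Classes 33–35 (PCA via the SVD of the centered data matrix), assuming NumPy; the covariance and sample size are invented for illustration:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.multivariate_normal([0.0, 0.0], [[3.0, 1.5],
                                             [1.5, 1.0]], size=500)

    Xc = X - X.mean(axis=0)                        # center the data
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt                                # rows are the principal directions
    explained_var = s ** 2 / (len(X) - 1)          # eigenvalues of the covariance
    scores = Xc @ Vt[0]                            # 1-D compressed representation
    print("explained variance per component:", explained_var)
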
Module 10 — Gaussian Mixture Models
Class 37 – Probabilistic clustering and mixture models
Class 38 – Gaussian mixtures
Class 39 – Expectation–Maximization (EM) algorithm
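
A sketch for Classes 38–39 (EM for a two-component 1-D Gaussian mixture), assuming NumPy; the data, the initialization, and the iteration count are invented for illustration:

    import numpy as np

    rng = np.random.default_rng(0)
    x = np.concatenate([rng.normal(-2.0, 1.0, 200), rng.normal(3.0, 0.5, 100)])

    pi = np.array([0.5, 0.5])                      # mixture weights
    mu = np.array([-1.0, 1.0])                     # component means
    var = np.array([1.0, 1.0])                     # component variances
    for _ in range(50):
        # E-step: responsibilities r[n, k] = p(component k | x_n)
        dens = pi * np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances from r
        Nk = r.sum(axis=0)
        pi = Nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / Nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / Nk
    print("weights:", pi, "means:", mu, "variances:", var)
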
Module 11 — Support Vector Machines
Class 40 – Maximum margin classification and separating hyperplanes
Class 41 – Dual formulation and support vectors
Class 42 – Kernel methods and nonlinear decision boundaries
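
A sketch for Classes 40 and 42 (the geometric margin and the RBF kernel), assuming NumPy; the points, the hyperplane, and the kernel width are invented for illustration, so this is not a trained SVM:

    import numpy as np

    X = np.array([[2.0, 2.0], [1.0, 3.0], [-1.0, -1.0], [-2.0, 0.0]])
    y = np.array([1, 1, -1, -1])
    w, b = np.array([1.0, 1.0]), -1.0              # one separating hyperplane

    margins = y * (X @ w + b) / np.linalg.norm(w)  # signed distances to the plane
    print("geometric margin:", margins.min())      # the quantity max-margin SVMs maximize

    gamma = 0.5
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    K = np.exp(-gamma * sq)                        # RBF kernel matrix for nonlinear boundaries
    print("kernel matrix:\n", K)
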
Textbook
Deisenroth, M. P., Faisal, A. A., & Ong, C. S. (2020). Mathematics for Machine Learning. Cambridge University Press.