Probabilistic Machine Learning
CS772A/CS698X
Winter 2016

Instructor: Piyush Rai (office: KD-319, email: piyush AT cse DOT iitk DOT ac DOT in)
Office Hours: Thur 10-11am (or by appointment)
Q/A and announcements: Piazza (please register)
Class Schedule: Mon/Wed 5:00-6:30pm
Location: KD-101
TAs: Milan Someswar (milansom AT cse), Priya Saraf (priyas AT cse), Vinit Tiwari (vinitt AT iitk)
TA Office Hours: Milan (Thur 3-4pm, RM 403D), Priya (Wed 4-5pm, RM 302), Vinit (Mon 3-4pm, RM 505)


Background and Course Description

This course will look at machine learning from the viewpoint of modeling data as samples from an underlying (unknown) probability distribution. Machine learning problems then boil down to inferring the parameters and other latent variables that define the probability model, and using these to make predictions/decisions from the data. The probabilistic view is particularly useful for (1) realistically modeling and capturing diverse data types, characteristics, and peculiarities via appropriately chosen probability distributions, and (2) encoding prior assumptions about the model via prior distributions over the parameters/latent variables (also see this recent Nature article, which discusses these and many other benefits of the probabilistic viewpoint). This course will introduce basic (and some advanced) topics in probabilistic machine learning, covering (1) common parameter estimation methods for probabilistic models; (2) probabilistic formulations of popular machine learning problems such as regression, classification, clustering, dimensionality reduction, matrix factorization, and learning from sequential data (e.g., time series); (3) Bayesian modeling and approximate Bayesian inference; (4) deep learning; and (5) assorted other topics. At various points during the course, we will also look at how the probabilistic modeling paradigm connects naturally to the other dominant paradigm, which turns machine learning problems into optimization problems, and examine the strengths and weaknesses of both paradigms as well as the many ways they complement each other.
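As a small illustration of the modeling recipe described above (this is a hypothetical sketch, not part of the course materials): suppose we model coin flips as draws from a Bernoulli distribution with unknown parameter theta. Parameter estimation then amounts to maximizing the likelihood, and the Bayesian treatment places a prior on theta; both can be written in a few lines. The prior pseudo-counts below are arbitrary choices for illustration.

```python
import random

# Hypothetical example: model coin flips as draws from Bernoulli(theta),
# where theta (the probability of heads) is the unknown model parameter.
random.seed(0)
true_theta = 0.7
data = [1 if random.random() < true_theta else 0 for _ in range(1000)]

# Maximum-likelihood estimate: the sample mean maximizes the Bernoulli likelihood.
theta_mle = sum(data) / len(data)

# Bayesian treatment: place a conjugate Beta(a, b) prior on theta; the
# posterior is Beta(a + #heads, b + #tails), whose mean shrinks the MLE
# toward the prior mean.
a, b = 2.0, 2.0  # prior pseudo-counts (an assumed choice, for illustration)
heads, tails = sum(data), len(data) - sum(data)
theta_post_mean = (a + heads) / (a + b + heads + tails)

print(theta_mle, theta_post_mean)
```

With 1000 observations the likelihood dominates, so the posterior mean and the MLE nearly coincide; with little data the prior would pull the estimate toward 0.5, which is exactly the kind of prior-knowledge encoding the paragraph above refers to.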

Syllabus

Refer to the tentative class schedule for the list of topics.

Books

This course will take a probabilistic view of machine learning, and the following book may be used as a reference: Pattern Recognition and Machine Learning (PRML) by Chris Bishop. In addition, we will have slides/notes based on the lectures, and other material available online. For reference, some other recommended books are:

- Machine Learning: A Probabilistic Perspective (MLPP) by Kevin Murphy.
- Bayesian Reasoning and Machine Learning by David Barber. Also freely available online as PDF.
- Computer Vision: Models, Learning, and Inference by Simon J.D. Prince. Also freely available online as PDF.

If you don't have any prior exposure to machine learning, the following book is highly recommended (mostly non-probabilistic view): A Course in Machine Learning by Hal Daumé III.

Grading

There will be 3 homework assignments (total 30%), a mid-term (20%), a final exam (20%), and a course project (30%).

Schedule (Tentative)

Date | Topic | Readings/References | Deadlines | Slides/Notes

Probabilistic Machine Learning

Dec 30 | Introduction to machine learning and probabilistic modeling | Review on prob/stats and linear algebra, [JM15], [Z15] | | slides (4-up print)
Jan 4 | Probability refresher, properties of the Gaussian distribution | PRML: Chap. 1, Section 1.2 (up to 1.2.2); Chap. 2 up to Section 2.3.3; Appendix B; Review on prob/stats and linear algebra | | slides (4-up print)
Jan 11 | Basics of parameter estimation in probabilistic models | Parameter estimation for text analysis (only up to Section 3), [PP08] (Matrix Cookbook) | | slides (4-up print)
Jan 13 | Regression: Probabilistic Linear Regression | MLPP (Murphy): Sections 7.1-7.3, 7.6 (7.6.1, 7.6.2) | | slides (4-up print)
Jan 18 | Classification: Probabilistic Linear Classification (Logistic Regression) | MLPP (Murphy): Sections 8.1-8.3.4, 8.3.6 | | slides (4-up print)
Jan 20 | Exponential Family and Generalized Linear Models | [J03] | | slides (4-up print)
Jan 25 | Clustering and Density Estimation: K-means and Gaussian Mixture Models | PRML: Chapter 9 (up to Section 9.3.2) | Project proposals due | slides (4-up print)
Jan 27 | Expectation Maximization | PRML: Chapter 9 (Sections 9.3 and 9.4; may skip 9.3.3 and 9.3.4); optional reading: [NH99] | | slides (4-up print)
Feb 1 | Expectation Maximization (contd.) | PRML: Chapter 12 (Sections 12.1 and 12.2) | | slides (4-up print)
Feb 3 | Probabilistic PCA and Factor Analysis; Mixtures of PPCA/Mixtures of FA | PRML: Chapter 12 (Sections 12.1 and 12.2); optional readings: [TB99], [GH97], [CG15], [B09], [IR10] | | slides (4-up print)
Feb 8 | Probabilistic Matrix Factorization | [SM07], [K09] | | slides (4-up print)
Feb 10 | Gaussian Processes for Nonlinear Regression and Nonlinear Dimensionality Reduction | MLPP (Murphy): Sections 15.1-15.2, 15.5 | | slides (4-up print)

Approximate Bayesian Inference

Feb 22 | Sampling-based Inference: Monte Carlo, Rejection Sampling, Importance Sampling | PRML: Chapter 11 (up to Section 11.1); optional reading: [ADDJ03] | | slides (4-up print)
Feb 24 | Sampling-based Inference: Markov Chain Monte Carlo, Gibbs Sampling | PRML: Chapter 11 (Sections 11.2 and 11.3); optional reading: [ADDJ03] | | slides (4-up print)
Feb 29 | Sampling-based Inference: Some Examples - GMM, Matrix Factorization, and LDA (Topic Models) | MLPP (Murphy): Sections 24.2.3 and 24.2.3.1, [SM08], [GS04] | | slides (4-up print)
Mar 2 | Variational Bayesian (VB) Inference: Introduction and Mean-Field Approximations | PRML: Chapter 10 (up to Section 10.1); also recommended: [BKM16] | | slides (4-up print)
Mar 7 | Properties of VB, More Examples, and Expectation Propagation | PRML: Chapter 10 (Sections 10.2-10.4, 10.6, 10.7); also recommended: [BKM16] | | slides (4-up print)

Assorted Topics in PML

Mar 9 | Sparse Linear Models | MLPP (Murphy): Sections 13.1-13.2, 13.3 (only up to 13.3.1), 13.4.4; optional reading: [T01] | | slides (4-up print)
Mar 13 | State Space Models and Linear Dynamical Systems | PRML: Chapter 13 | | slides (4-up print)
Mar 14 | Structured Prediction: Conditional Random Fields | MLPP (Murphy): Section 19.6 | Mid-sem project report due | slides (4-up print)
Mar 16 | Latent Dirichlet Allocation and Topic Models | Recommended: the LDA paper | | slides (4-up print)
Mar 30 | Deep Probabilistic Models (1) | Optional reading: Representation Learning: A Review and New Perspectives | | slides (4-up print)
Apr 4 | Deep Probabilistic Models (2) | Optional reading: Representation Learning: A Review and New Perspectives | | slides (4-up print)
Apr 6 | Nonparametric Bayesian Models for Latent Class and Latent Feature Learning | Recommended: Indian Buffet Process: An Introduction and Review; optional: Dirichlet Process | | slides (4-up print)
Apr 11 | Inference and Optimization via Message Passing | Recommended: A tutorial paper; also see Factor Graphs and the Sum-Product Algorithm | | slides (4-up print)
Apr 13 | Overview of other recent advances; Course Summary and Perspectives | | | slides (4-up print)

Suggested/Further Readings

- [PP08] The Matrix Cookbook (a very handy reference for matrix algebra and calculus)
- [JM15] Machine learning: Trends, perspectives, and prospects: Michael Jordan and Tom Mitchell (Science article)
- [Z15] Probabilistic machine learning and artificial intelligence: Zoubin Ghahramani (Nature article)
- [GR13] A nice roadmap to *learning* about Bayesian Learning
- [J03] The Exponential Family and Generalized Linear Models: Michael I. Jordan (chapter from an unpublished book)
- [NH99] A View of the EM Algorithm that Justifies Incremental, Sparse, and Other Variants: Radford Neal and Geoff Hinton
- [CG15] Linear Dimensionality Reduction: Survey, Insights, and Generalizations: John Cunningham and Zoubin Ghahramani
- [B09] Dimension Reduction: A Guided Tour: Christopher J. C. Burges
- [IR10] Practical Approaches to Principal Component Analysis in the Presence of Missing Values: Alexander Ilin and Tapani Raiko
- [TB99] Mixtures of Probabilistic Principal Component Analysers: Michael E. Tipping and Christopher M. Bishop
- [GH97] The EM Algorithm for Mixtures of Factor Analyzers: Zoubin Ghahramani and Geoff Hinton
- [GJ95] Supervised learning from incomplete data via an EM approach: Zoubin Ghahramani and Michael I. Jordan
- [SM07] Probabilistic Matrix Factorization: Ruslan Salakhutdinov and Andriy Mnih
- [K09] Matrix Factorization Techniques for Recommender Systems: Yehuda Koren, Robert Bell and Chris Volinsky
- [ADDJ03] An Introduction to MCMC for Machine Learning: Christophe Andrieu, Nando De Freitas, Arnaud Doucet, and Michael I. Jordan
- [SM08] Bayesian Probabilistic Matrix Factorization using Markov Chain Monte Carlo: Ruslan Salakhutdinov and Andriy Mnih
- [GS04] Finding Scientific Topics: Thomas L. Griffiths and Mark Steyvers
- [BKM16] Variational Inference: A Review for Statisticians: David Blei, Alp Kucukelbir, and Jon McAuliffe
- [AJA16] Patterns of Scalable Bayesian Inference: Elaine Angelino, Matthew James Johnson, and Ryan P. Adams
- [T01] Sparse Bayesian Learning and the Relevance Vector Machine: Michael E. Tipping

Useful Links and Software

- Scikit-Learn: Machine Learning in Python
- Weka: Machine Learning and Data Mining in Java
- Stan (Probabilistic Programming)

Course Policies

Anti-cheating policy