Machine Learning
CS771A
Autumn 2016

Instructor: Piyush Rai (office: KD-319, email: piyush AT cse DOT iitk DOT ac DOT in)
Office Hours: Tuesday 12-1pm (or by appointment)
Q/A Forum: Piazza (please register)
Class Location: L-16 (lecture hall complex)
Timings: WF 6:00-7:30pm

Background and Course Description

Machine Learning is the discipline of designing algorithms that allow machines (e.g., computers) to learn patterns and concepts from data without being explicitly programmed. This course is an introduction to the design (and some analysis) of Machine Learning algorithms, with a modern outlook that focuses on recent advances and on examples of real-world applications. It is intended as a first ("intro") course in Machine Learning, and no prior exposure to the subject is assumed. At the same time, please be aware that this is NOT a course about the toolkits/software/APIs used in applications of Machine Learning. It is about the principles and foundations of Machine Learning algorithms: what goes on "under the hood", and how Machine Learning problems are formulated and solved.

Pre-requisites

MSO201A/equivalent, CS210/ESO211/ESO207A; ability to program in MATLAB/Octave. In some cases, pre-requisites may be waived with the instructor's consent.

Grading

Grading is based on 4 homework assignments (40% in total; these may include a programming component), a mid-term exam (20%), a final exam (20%), and a course project (20%).

Reference materials

There will not be any dedicated textbook for this course. In lieu of that, we will have lecture slides/notes, monographs, tutorials, and papers for the topics that will be covered in this course. Some recommended, although not required, reference books are listed below (in no particular order):

Trevor Hastie, Robert Tibshirani, Jerome Friedman, The Elements of Statistical Learning, Springer, 2009 (freely available online)
Hal Daumé III, A Course in Machine Learning, 2015 (in preparation; most chapters freely available online)
Kevin Murphy, Machine Learning: A Probabilistic Perspective, MIT Press, 2012
Christopher Bishop, Pattern Recognition and Machine Learning, Springer, 2007
Shai Shalev-Shwartz and Shai Ben-David, Understanding Machine Learning: From Theory to Algorithms, Cambridge University Press, 2014

Schedule (Tentative)

Supervised Learning
Date	Topics	Readings/References	Deadlines	Slides/Notes
July 28	Course Logistics and Introduction to Machine Learning	Linear Algebra review, Probability review, Matrix Cookbook, MATLAB review, [JM15], [LBH15]		slides
Aug 3	Learning by Computing Distances: Distance from Means and Nearest Neighbors	Distance from Means, CIML Chapter 2		slides
Aug 5	Learning by Asking Questions: Decision Tree based Classification and Regression	Book Chapter, Info Theory notes, DT - visual illustration		slides
Aug 10	Learning as Optimization, Linear Regression	Optional: Some notes, Some useful resources on optimization for ML		slides
Aug 12	Learning via Probabilistic Modeling, Probabilistic Linear Regression	Murphy (MLAPP): Chapter 7 (sections 7.1-7.5)		slides
Aug 17	Learning via Probabilistic Modeling: Logistic and Softmax Regression	Murphy (MLAPP): Chapter 8 (sections 8.1-8.3)		slides
Aug 19	Online Learning via Stochastic Optimization, Perceptron	Murphy (MLAPP): Chapter 8 (section 8.5)		slides
Aug 24	Learning Maximum-Margin Hyperplanes: Support Vector Machines	Intro to SVM, Wikipedia Intro to SVM, Optional: Advanced Intro to SVM, SVM Solvers		slides
Aug 26	Nonlinear Learning with Kernels	CIML Chapter 9 (sections 9.1 and 9.4), Murphy (MLAPP): Chapter 14 (up to section 14.4.3)		slides
Unsupervised Learning
Aug 31	Data Clustering, K-means and Kernel K-means	Bishop (PRML): Section 9.1. Optional reading: Data clustering: 50 years beyond k-means	HW 1 Due	slides
Sept 2	Linear Dimensionality Reduction: Principal Component Analysis	Bishop (PRML): Section 12.1. Optional reading: PCA tutorial paper		slides
Sept 7	PCA (Wrap-up) and Nonlinear Dimensionality Reduction via Kernel PCA	Optional reading: Kernel PCA		slides
Sept 21	Matrix Factorization and Matrix Completion	Optional Reading: Matrix Factorization for Recommender Systems, Scalable MF		slides
Sept 23	Introduction to Generative Models			slides
Sept 26	Generative Models for Clustering: GMM and Intro to EM	Bishop (PRML): Sections 9.2 and 9.3 (up to 9.3.2)		slides (notes)
Sept 28	Expectation Maximization and Generative Models for Dim. Reduction	Bishop (PRML): Sections 9.3 (up to 9.3.2) and 9.4		slides
Oct 5	Generative Models for Dim. Reduction: Probabilistic PCA and Factor Analysis	Bishop (PRML): Section 12.2 (up to 12.2.2). Optional reading: Mixtures of PPCA	HW 2 Due	slides
Assorted Topics
Oct 19	Practical Issues: Model/Feature Selection, Evaluating and Debugging ML Algorithms	On Evaluation and Model Selection		slides
Oct 24	Introduction to Learning Theory	Optional (but recommended): Mitchell ML Chapter 7 (sections 7.1-7.3.1 and 7.4 (up to 7.4.2))		slides
Oct 26	Ensemble Methods: Bagging and Boosting	CIML Chapter 11, Optional: Brief Intro to Boosting, Explaining AdaBoost		slides
Oct 28	Semi-supervised Learning	Reading: Brief SSL Intro, Optional: A (somewhat old but recommended) survey on SSL		slides
Nov 2	Deep Learning (1): Feedforward Neural Nets and CNN	Optional Readings: Feedforward Neural Networks, Convolutional Neural Nets	HW 3 Due	slides
Nov 4	Deep Learning (2): Models for Sequence Data (RNN and LSTM) and Autoencoders	Optional Readings: RNN and LSTM, Understanding LSTMs, RNN and LSTM Review		slides
Nov 5	Learning from Imbalanced Data			slides
Nov 9	Online Learning (Adversarial Model and Experts)	Optional Reading: Foundations of ML (Chapter 7)		slides
Nov 11	Survey of Other Topics and Conclusions			slides

Useful Links

- Machine Learning Summer Schools
- Scikit-Learn: Machine Learning in Python
- Awesome Machine Learning (a comprehensive list of Machine Learning libraries and software)

Course Policies

Anti-cheating policy
