Introduction to Machine Learning
CS771A
Autumn 2018

Instructor: Piyush Rai (office: RM-502, email: piyush AT cse DOT iitk DOT ac DOT in)
Instructor's Office Hours: Wed 6:00-7:30pm (by appointment)
TAs: Shivam Bansal, Dhanajit Brahma, Sunabha Chatterjee, Prerit Garg, Gopichand Kotana, Neeraj Kumar, Pawan Kumar, Kranti Parida, Kawal Preet, Prem Raj, Utsav Singh, Samik Some, Vinay Verma
TA Office Hours and Contact Details: Please refer to Piazza
Q/A Forum: Piazza (please register)
Class Location: L-19 (lecture hall complex)
Timings: Tue/Thur 6:00-7:30pm

Background and Course Description

Machine Learning is the discipline of designing algorithms that allow machines (e.g., a computer) to learn patterns and concepts from data without being explicitly programmed. This course will be an introduction to the design (and some analysis) of Machine Learning algorithms, with a modern outlook, focusing on recent advances and examples of real-world applications. It is intended as a first ("intro") course in Machine Learning; no prior exposure to Machine Learning will be assumed. At the same time, please be aware that this is NOT a course about the toolkits/software/APIs used in applications of Machine Learning. Rather, it is about the principles and foundations of Machine Learning algorithms: understanding what goes on "under the hood" and how Machine Learning problems are formulated and solved.

Grading

There will be 4-5 homework assignments (total 30%), which may include a programming component, a mid-term exam (20%), a final exam (30%), and a course project (20%).

Reference materials

There will not be any dedicated textbook for this course. In lieu of that, we will have lecture slides/notes, monographs, tutorials, and papers for the topics that will be covered in this course. Some recommended, although not required, reference books are listed below (in no particular order):

Hal Daumé III, A Course in Machine Learning (CIML), 2017 (freely available online)
Kevin Murphy, Machine Learning: A Probabilistic Perspective (MLAPP), MIT Press, 2012
Christopher Bishop, Pattern Recognition and Machine Learning (PRML), Springer, 2007.
Richard O. Duda, Peter E. Hart, and David G. Stork, Pattern Classification (PC), Wiley-Blackwell, 2000
Ian Goodfellow, Yoshua Bengio, and Aaron Courville, Deep Learning (DL), MIT Press, 2016 (individual chapters freely available online)
Trevor Hastie, Robert Tibshirani, Jerome Friedman, The Elements of Statistical Learning (ESL), Springer, 2009 (freely available online)
Shai Shalev-Shwartz and Shai Ben-David. Understanding Machine Learning: From Theory to Algorithms (UML), Cambridge University Press, 2014
Mehryar Mohri, Afshin Rostamizadeh and Ameet Talwalkar. Foundations of Machine Learning (FOML), MIT Press, 2012

Other useful references

Here is a book on essential Maths for Machine Learning (here is the PDF copy)

Here is another useful, interactive (Python notebooks) book on deep learning (it also covers many of the basic topics in machine learning): Dive into Deep Learning (authors: Aston Zhang, Zack C. Lipton, Mu Li, Alex J. Smola)

Schedule

Getting Started with ML
Date	Topics	Readings/References	Deadlines	Slides/Notes
July 31	Course Logistics and Introduction to Machine Learning	ML article in Science, Some history of ML/Deep Learning/AI: [1], [2], [3], [4], Some essential maths for ML (this book is more detailed), Matrix Cookbook, Maths refresher slides		slides (print version)
August 2	Warming-up to ML, and Some Simple Supervised Learners (Distance based methods)	Prototype based classification, CIML Ch 2, CIML Ch 3		slides (print version)
August 7	Decision Trees for Classification and Regression	Intro to DT, Optional: Sec 8.2-8.4 of PC, A nice visual illustration of DTs		slides (print version)
August 9	Linear Models and Learning via Optimization	some notes, on equivalence of system of linear equations and linear regression (up to Section 5) (used slightly different notation)		slides (print version)
Basic Probabilistic Modeling
August 14	Learning via Probabilistic Modeling	additional slides, Parameter Estimation (only up to Section 3.1), Section 5 of this tutorial, Probability section of these slides, Chapter 2 of MLAPP		slides (print version)
August 16	Probabilistic Models for Supervised Learning: Discriminative Approaches	MLAPP Ch. 7.1-7.6, Ch. 8.1-8.4 (may skip details of optimization for now, and also details of Bayesian inference), additional slides on computing the posterior for probabilistic linear regression		slides (print version)
August 21	Probabilistic Models for Supervised Learning: Generative Approaches	Additional slides (MLE for Gaussians), Optional Readings: PRML Section 4.2, MLAPP Section 4.1-4.2.5		slides (print version)
More on Optimization Techniques, Hyperplane Classifiers (Perceptron, SVMs)
August 23	Basics of Convexity, Gradient Descent, Stochastic GD	Optional Readings: Chapter 2 and 3 of this book, An overview of gradient based methods		slides (print version)
August 28	Subgradients, Constrained Optimization, Co-ordinate and Alternating Optimization, Second-Order Methods	UML Sec. 14.1-14.4 (may skip the advanced portions)		slides (print version)
August 30	Optimization (Wrap-up), and Hyperplane based Classifiers (Perceptron and SVM)	CIML Ch. 4, Sec. 7.7, Optional: FOML Sec 4.1-4.3, Basic Intro to SVM, Advanced Intro to SVM (for now, may skip parts on kernels, theoretical analysis, etc)		slides (print version)
Sept 4	SVM (Contd), Multiclass and One-Class SVM	CIML Ch. 4, Sec. 7.7, Optional: FOML Sec 4.1-4.3, Basic Intro to SVM, Advanced Intro to SVM (for now, may skip parts on kernels, theoretical analysis, etc)		slides (print version)
Nonlinear Learning via Kernel Methods
Sept 6	Making Linear Models Nonlinear via Kernel Methods	CIML Ch. 11, MLAPP Sec 14.1-14.2		slides (print version)
Sept 11	Speeding Up Kernel Methods, Intro to Unsupervised Learning	CIML 15.1, PRML Sec 9.1, Visual Intro to K-means, Optional reading: Data clustering: 50 years beyond k-means		slides (print version)
Unsupervised Learning and Latent Variable Models
Sept 13	K-means Clustering and Extensions	CIML 15.1, PRML Sec 9.1		slides (print version)
Sept 25	Parameter Estimation in Latent Variable Models	PRML 9.2 - 9.3.2		slides (print version)
Sept 27	Expectation Maximization	PRML 9.4		slides (print version)
Oct 4	Latent Variable Models for Dimensionality Reduction	PRML Sec 12.2 (up to 12.2.2). Also recommended Sec 12.0, 12.1 (for classical non-probabilistic PCA)		slides (print version)
Oct 9	Dimensionality Reduction (Contd.)	PRML Sec 12.2 (up to 12.2.2). Also recommended Sec 12.0, 12.1 (for classical non-probabilistic PCA)		slides (print version)
Oct 11	Dimensionality Reduction (Wrap-up)	Sec. 12.0, 12.1 (for classical PCA), Recommended: Sec. 12.3 (kernel PCA), A tutorial paper		slides (print version)
Assorted Topics
Oct 23	Introduction to Deep Neural Networks (1)	Recommended Readings: Feedforward Nets (chapter from Deep Learning book; detailed), A shorter intro, Some nice demos		slides (print version)
Oct 25	Introduction to Deep Neural Networks (2)	Recommended Readings: Convolutional neural networks, RNN and LSTM, Some nice demos, Some additional slides on autoencoders		slides (print version)
Oct 30	Learning to Recommend via Matrix Factorization/Completion	Optional Readings: Matrix Factorization for Recommender Systems, Wikipedia Article on Collaborative Filtering, Deep Learning for Recommender Systems (if interested in deep learning approaches)		slides (print version)
Nov 1	Model Selection, Evaluation Metrics, Learning from Imbalanced Data	To be posted soon.		slides (print version)
Nov 6	Reinforcement Learning	Recommended Readings: Intro to RL (chapter from a book), Some notes on RL		slides (print version)
Nov 8	Ensemble Methods	Recommended Readings: CIML Chap 13, Intro to AdaBoost, Gradient Boosting		slides (print version)
Nov 13	Bias/Variance Trade-off, Some Practical Issues, Semi-supervised and Active Learning	Recommended Readings: CIML Sec 8.1 and 8.2 (domain adaptation and covariate-shift), Brief Intro to SSL		slides (print version)
Nov 15	Multitask Learning, Overview of Some Other Topics, Conclusion and Take-aways	Recommended Readings: Brief Overview of Multitask Learning, Detailed Survey on Multitask Learning		slides (print version)

Some Recent Offerings of CS771

Autumn 2016 (Piyush Rai), Autumn 2017 (Purushottam Kar). Note: Autumn 2017 website is accessible only from within IITK.

Useful Links

Reference texts (locally accessible)
scikit-learn: Machine Learning in Python: A Python based library implementing many ML algorithms.
Machine Learning with Matlab/Octave: A MATLAB/Octave based collection of many ML algorithms (a supplement to the book "Machine Learning: A Probabilistic Perspective").
TensorFlow and PyTorch: Both are Python based libraries implementing many ML and deep learning algorithms (and can be used to develop new ones), and both support GPU acceleration (especially needed for deep learning algorithms).
A quick Python tutorial (a nice quick reference sheet for Python), Another quick Python/NumPy Tutorial, More detailed NumPy/SciPy intro, A short (and > 10 years old) MATLAB-for-ML tutorial
LaTeX tutorial. Note: This one is fairly detailed but pretty good; there are many shorter tutorials (e.g., this one) available as well if you just want to have a basic working knowledge of LaTeX. There are also web-based LaTeX editors (that don't require you to install LaTeX on your machine) with some cool features, such as Overleaf (newer version is "v2")

Course Policies

Anti-cheating policy
