## Introduction to Machine Learning

Instructor: Piyush Rai (office: RM-502, email: piyush AT cse DOT iitk DOT ac DOT in)

Instructor's Office Hours: Wed 6:00-7:30pm (by appointment)

TAs: Shivam Bansal, Dhanajit Brahma, Sunabha Chatterjee, Prerit Garg, Gopichand Kotana, Neeraj Kumar, Pawan Kumar, Kranti Parida, Kawal Preet, Prem Raj, Utsav Singh, Samik Some, Vinay Verma

TA Office Hours and Contact Details: Please refer to Piazza

Class Location: L-19 (lecture hall complex)

Timings: Tue/Thur 6:00-7:30pm

- Hal Daumé III, A Course in Machine Learning (CIML), 2017 (freely available online)
- Kevin Murphy, Machine Learning: A Probabilistic Perspective (MLAPP), MIT Press, 2012
- Christopher Bishop, Pattern Recognition and Machine Learning (PRML), Springer, 2007
- Richard O. Duda, Peter E. Hart, and David G. Stork, Pattern Classification (PC), Wiley-Blackwell, 2000
- Ian Goodfellow, Yoshua Bengio, and Aaron Courville, Deep Learning (DL), MIT Press, 2016 (individual chapters freely available online)
- Trevor Hastie, Robert Tibshirani, Jerome Friedman, The Elements of Statistical Learning (ESL), Springer, 2009 (freely available online)
- Shai Shalev-Shwartz and Shai Ben-David. Understanding Machine Learning: From Theory to Algorithms (UML), Cambridge University Press, 2014
- Mehryar Mohri, Afshin Rostamizadeh and Ameet Talwalkar. Foundations of Machine Learning (FOML), MIT Press, 2012

Date | Topics | Readings/References | Deadlines | Slides/Notes |
---|---|---|---|---|
July 31 | Course Logistics and Introduction to Machine Learning | ML article in Science, Some history of ML/Deep Learning/AI: [1], [2], [3], [4], Some essential maths for ML, Matrix Cookbook, Maths refresher slides | slides (print version) | |

Getting Started with ML | ||||

August 2 | Warming-up to ML, and Some Simple Supervised Learners (Distance based methods) | Prototype based classification, CIML Ch 2, CIML Ch 3 | slides (print version) | |

August 7 | Decision Trees for Classification and Regression | Intro to DT, Optional: Sec 8.2-8.4 of PC, A nice visual illustration of DTs | slides (print version) | |

August 9 | Linear Models and Learning via Optimization | Some notes, On the equivalence of systems of linear equations and linear regression (up to Section 5; uses slightly different notation) | slides (print version) | |

Basic Probabilistic Modeling | ||||

August 14 | Learning via Probabilistic Modeling | additional slides, Parameter Estimation (only up to Section 3.1), Section 5 of this tutorial, Probability section of these slides, Chapter 2 of MLAPP | slides (print version) | |

August 16 | Probabilistic Models for Supervised Learning: Discriminative Approaches | MLAPP Ch. 7.1-7.6, Ch. 8.1-8.4 (may skip details of optimization for now, and also details of Bayesian inference), additional slides on computing the posterior for probabilistic linear regression | slides (print version) | |

August 21 | Probabilistic Models for Supervised Learning: Generative Approaches | Additional slides (MLE for Gaussians), Optional Readings: PRML Section 4.2, MLAPP Section 4.1-4.2.5 | slides (print version) | |

Optimization Techniques for ML | ||||

August 23 | Basics of Convexity, Gradient Descent, Stochastic GD | Optional Readings: Chapter 2 and 3 of this book, An overview of gradient based methods | slides (print version) | |

August 28 | Subgradients, Constrained Optimization, Co-ordinate and Alternating Optimization, Second-Order Methods | UML Sec. 14.1-14.4 (may skip the advanced portions) | slides (print version) | |

Learning (Max-Margin) Hyperplanes | ||||

August 30 | Optimization (Wrap-up), and Hyperplane based Classifiers (Perceptron and SVM) | CIML Ch. 4, Sec. 7.7, Optional: FOML Sec 4.1-4.3, Basic Intro to SVM, Advanced Intro to SVM (for now, may skip parts on kernels, theoretical analysis, etc) | slides (print version) | |

Sept 4 | SVM (Contd), Multiclass and One-Class SVM | CIML Ch. 4, Sec. 7.7, Optional: FOML Sec 4.1-4.3, Basic Intro to SVM, Advanced Intro to SVM (for now, may skip parts on kernels, theoretical analysis, etc) | slides (print version) | |

Kernel Methods | ||||

Sept 6 | Making Linear Models Nonlinear via Kernel Methods | CIML Ch. 11, MLAPP Sec 14.1-14.2 | slides (print version) | |

Sept 11 | Speeding Up Kernel Methods, Intro to Unsupervised Learning | CIML 15.1, PRML Sec 9.1, Visual Intro to K-means, Optional reading: Data clustering: 50 years beyond k-means | slides (print version) | |

Unsupervised Learning and Latent Variable Models | ||||

Sept 13 | K-means Clustering and Extensions | CIML 15.1, PRML Sec 9.1 | slides (print version) | |

Sept 25 | Parameter Estimation in Latent Variable Models | PRML 9.2 - 9.3.2 | slides (print version) | |

Sept 27 | Expectation Maximization | PRML 9.4 | slides (print version) | |

Oct 4 | Latent Variable Models for Dimensionality Reduction | PRML Sec 12.2 (up to 12.2.2). Also recommended Sec 12.0, 12.1 (for classical non-probabilistic PCA) | slides (print version) | |

Oct 9 | Dimensionality Reduction (Contd.) | PRML Sec 12.2 (up to 12.2.2). Also recommended Sec 12.0, 12.1 (for classical non-probabilistic PCA) | slides (print version) | |

Oct 11 | Dimensionality Reduction (Wrap-up) | Sec. 12.0, 12.1 (for classical PCA), Recommended: Sec. 12.3 (kernel PCA), A tutorial paper | slides (print version) | |

Assorted Topics | ||||

Oct 23 | Introduction to Deep Neural Networks (1) | Recommended Readings: Feedforward Nets (chapter from Deep Learning book; detailed), A shorter intro, Some nice demos | slides (print version) | |

Oct 25 | Introduction to Deep Neural Networks (2) | Recommended Readings: Convolutional neural networks, RNN and LSTM, Some nice demos, Some additional slides on autoencoders | slides (print version) | |

Oct 30 | Learning to Recommend via Matrix Factorization/Completion | Optional Readings: Matrix Factorization for Recommender Systems, Wikipedia Article on Collaborative Filtering, Deep Learning for Recommender Systems (if interested in deep learning approaches) | slides (print version) | |

Nov 1 | Model Selection, Evaluation Metrics, Learning from Imbalanced Data | To be posted soon | slides (print version) | |

Nov 6 | Reinforcement Learning | Recommended Readings: Intro to RL (chapter from a book), Some notes on RL | slides (print version) | |

Nov 8 | Ensemble Methods | Recommended Readings: CIML Chap 13, Intro to AdaBoost, Gradient Boosting | slides (print version) | |

Nov 13 | Bias/Variance Trade-off, Some Practical Issues, Semi-supervised and Active Learning | Recommended Readings: CIML Sec 8.1 and 8.2 (domain adaptation and covariate-shift), Brief Intro to SSL | slides (print version) | |

Nov 15 | Multitask Learning, Overview of Some Other Topics, Conclusion and Take-aways | Recommended Readings: Brief Overview of Multitask Learning, Detailed Survey on Multitask Learning | slides (print version) |

- Reference texts (locally accessible)
- scikit-learn: Machine Learning in Python: A Python based library implementing many ML algorithms.
- Machine Learning with Matlab/Octave: A MATLAB/Octave based collection of many ML algorithms (a supplement to the book "Machine Learning: A Probabilistic Perspective").
- TensorFlow and PyTorch: Both are Python based libraries implementing many ML and deep learning algorithms (and can be used to develop new ones), and both can use GPU acceleration (especially needed for deep learning algorithms).
- A quick Python tutorial (a nice quick reference sheet for Python), Another quick Python/NumPy Tutorial, More detailed NumPy/SciPy intro, A short (and > 10 years old) MATLAB-for-ML tutorial
- LaTeX tutorial. Note: This one is fairly detailed but pretty good; there are many shorter tutorials (e.g., this one) available as well if you just want to have a basic working knowledge of LaTeX. There are also web-based LaTeX editors (that don't require you to install LaTeX on your machine) with some cool features, such as Overleaf (newer version is "v2")