CS 777: Topics in Learning Theory

Pre-requisites

Instructor’s consent. Fluency in basic results in probability and statistics would be essential. Prior exposure to machine learning or signal processing techniques would be desirable.

Short Description

This is intended to be a course on advanced techniques used in the design and analysis of machine learning and statistical estimation algorithms. The course is divided into two broad parts, one that primarily looks at the statistical analysis of learning and estimation algorithms, and one that explores a variety of algorithm design techniques currently popular in machine learning and statistics communities. The course will involve an intense application of probabilistic models and techniques and will benefit from prior familiarity with machine learning/signal processing as a source of basic learning theoretic concepts, as well as motivation for large-scale learning and optimization.

Topics

  1. Statistical Learning Theory
    • Notion of risk, l-risk, Bayes risk, regret, plug-in predictors, regret-transfer bounds
    • Empirical risk minimization, PAC learning, uniform convergence, Glivenko-Cantelli classes
    • Capacity notions - covering numbers, VC/fat-shattering dimension, Rademacher complexity
    • Algorithmic stability, regularized risk minimization
    • Calibrated loss functions and consistency of learning methods
  2. Algorithmic Learning Theory
    • Ensemble learning - boosting, random forests
    • Non-convex optimization techniques - gradient descent, alternating minimization, EM
    • Online optimization techniques - learning with experts, online mirrored descent, explore-exploit
    • Fast sampling techniques - hit-and-run, MCMC
  3. Additional Topics (depending on interest and available time)
    • Learning with fast rates
    • PAC-Bayes bounds
    • Complex prediction tasks - ranking, structured prediction
    • Negative results - the no-free-lunch and ugly-duckling theorems, hardness of learning
    • Derivative-free optimization

References

There will be no textbook for this course. Material available from recent books, monographs, and publications will form the majority of reference material for the course.

  1. M. Anthony and P.L. Bartlett, Neural Network Learning: Theoretical Foundations, Cambridge University Press, 1999.
  2. L. Devroye, L. Gyorfi, and G. Lugosi, A Probabilistic Theory of Pattern Recognition, Springer, 1996.
  3. M. Mohri, A. Rostamizadeh, and A. Talwalkar. Foundations of Machine Learning, The MIT Press, 2012.
  4. S. Shalev-Shwartz and S. Ben-David. Understanding Machine Learning: From Theory to Algorithms, Cambridge University Press, 2014.
  5. V.N. Vapnik, Statistical Learning Theory, Wiley-Interscience, 1998
  6. Publications at major machine learning venues such as NIPS, ICML, COLT, ALT, AISTATS, and journals such as J. Machine Learning Res., Machine Learning J., and IEEE Trans. Inf. Theory.