This course covers several topics in machine learning theory. We will try to answer questions like:

- What is the convergence rate of a particular learning algorithm?
- How much data do you need to get good prediction results?
- What is the performance of your algorithm on test data?
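As a taste of the kind of question the course addresses, here is a minimal sketch (assuming only NumPy, with a hypothetical choice of mean and sample sizes) of the week-1 warm-up problem: estimating the mean of a Gaussian and watching the error shrink as the sample size grows.

```python
import numpy as np

# Minimal sketch: sample-mean estimation of a Gaussian mean.
# The true mean (2.0) and the sample sizes are illustrative choices.
rng = np.random.default_rng(0)
true_mean = 2.0

for n in [10, 100, 1000, 10000]:
    samples = rng.normal(loc=true_mean, scale=1.0, size=n)
    est = samples.mean()  # the sample-mean estimator
    print(f"n={n:6d}  |error|={abs(est - true_mean):.4f}")
# Classical theory predicts the error decays at rate O(1/sqrt(n)),
# which is the sort of statement the course makes precise.
```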

Topics may include: Asymptotic statistics, Uniform Convergence, Generalization, Kernel Methods, Online Learning, Sampling. More details can be found in the syllabus.

This class requires a solid working knowledge of probability theory, linear algebra, and real analysis (at least at the Master's level). Homework 0 is a good way to check your background.

- Final project reports are due on 04/09 11:59pm.
- Lectures on 03/19, 03/26, 04/02 will be held online. Instructions will be sent via Quercus.
- Several project directions can be found here. Proposals are due Jan 30 in class.

- Email: csc2532prof@cs.toronto.edu
- Office hours: Th 16:15-17:15 at Pratt 286B

- Email: csc2532ta@cs.toronto.edu

Section | Room | Lecture time |
---|---|---|
L0101 | SS 2108 | Th 14-16 |

There are no required textbooks. Suggested reading will be posted after each lecture (see the lecture schedule below).

- (ESL) Hastie, Tibshirani, Friedman (2009) The Elements of Statistical Learning
- (ITIL) MacKay (2003) Information Theory, Inference, and Learning Algorithms
- (UML) Shalev-Shwartz, Ben-David (2014) Understanding Machine Learning: From Theory to Algorithms
- (HDP) Vershynin (2018) High-Dimensional Probability

Week | Day | Lectures | Timeline |
---|---|---|---|
1 | 1/09 | Introduction & Warm-up: Gaussian Mean Estimation | syllabus |
2 | 1/16 | Exponential Families and Information Inequality | - |
3 | 1/23 | Asymptotic statistics | hw1 out |
4 | 1/30 | Uniform convergence & Generalization | project proposal due |
5 | 2/06 | Covering with epsilon-nets | hw1 due & hw2 out |
6 | 2/13 | Rademacher complexity: Definition | - |
7 | 2/20 | Rademacher complexity: Properties & Applications | hw2 due & hw3 out |
8 | 2/27 | Combinatorial Measures of Complexity | project progress report due |
9 | 3/05 | Chaining and Dudley’s theorem | hw3 due |
10 | 3/12 | Midterm (in class) | midterm |
11 | 3/19 | PAC-Bayes bounds & Stability | - |
12 | 3/26 | Kernel Methods: Basics | - |
13 | 4/02 | Kernel Methods: Properties & Applications | - |

Homework # | Out | Due | TA Office Hours |
---|---|---|---|
Homework 0 - V0 | 1/9 | - | - |
Homework 1 - V0 | 1/24 | 2/6 in class | Tue 10-11am, Wed 1-2pm @BA5256 |
Homework 2 - V0 | 2/10 | 2/23 via email | Tue 10-11am, Wed 3-4pm @Pratt 286B |
Homework 3 - V0 | 2/26 | 3/05 via email | Tue 9:30-10:30am, Wed 3-4pm @BA5256 |

A LaTeX template can be found here.

The goal of your project is to make a significant contribution to understanding a machine-learning-related problem. An ideal project begins with an interesting observation, explains it through theory, and ends with a thorough empirical analysis. Several research directions can be found below, but the list is by no means comprehensive, and your project topic need not be drawn from it. You will review relevant literature, identify interesting research directions, and either develop novel methodology or explain an observed behavior of a learning algorithm.

- An example project from last year: Stochastic Runge-Kutta Accelerates Langevin Monte Carlo and Beyond, by Li, Wu, Mackey, Erdogdu

**Project Inspiration:**
You can look through recent papers from COLT, NeurIPS, ICML, ICLR, and JMLR to get project ideas. Several research directions can be found here, but the list is by no means comprehensive. If you have suggestions, let me know.

A LaTeX template for reports can be found here.

For the homework assignments, we will use Python, and libraries such as NumPy, SciPy, and scikit-learn.

- If you don’t already have Python, install it. We recommend using Anaconda. You can also install Python directly if you know how.
- Use pip to install the required packages (note that scikit-learn’s PyPI package name is `scikit-learn`, not `sklearn`):
`pip install scipy numpy matplotlib jupyter scikit-learn`
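After installing, a quick sanity check like the following sketch can confirm the packages import correctly (note that scikit-learn installs under the import name `sklearn`, and the loop below simply reports anything missing rather than failing):

```python
import importlib

# Import names for the course packages (pip name scikit-learn -> import name sklearn).
for name in ["numpy", "scipy", "matplotlib", "sklearn"]:
    try:
        mod = importlib.import_module(name)
        print(f"{name:12s} {mod.__version__}")
    except ImportError:
        print(f"{name:12s} MISSING -- rerun the pip install above")
```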