CSC2547 / STA4273: Topics in Statistical Learning Theory
Objectives
The goal of your project is to make a significant contribution to understanding
a machine-learning-related problem. An ideal project begins with
an interesting observation, explains it through theory,
and ends with a thorough empirical analysis.
Several research directions can be found below, but the list is by no means comprehensive,
and your project topic need not be drawn from it. You will review relevant literature,
find interesting research directions, and either develop novel methodology
or explain an observed behavior of a learning algorithm.
Collaboration policy
You may work on the project alone or in a group of three;
the standards for a group project will be higher.
We strongly encourage you to come to office hours to discuss your project ideas,
progress, and difficulties with the course staff there.
Evaluation
Evaluation will be based on two reports:
Progress report (15%): 1 page, to be submitted on Feb 28, stating your preliminary results and findings.
Final report (25%): 2 pages, to be submitted on Mar 28, stating your final results.
You must use this LaTeX template for your project reports.
Project Inspiration
You can go through recent papers from COLT, NeurIPS, ICML, and JMLR to get project ideas. Several research directions are listed below, but the list is by no means comprehensive.
If you have suggestions, let me know.
Sampling and optimization
Go to this link and check the list there; a short illustrative code sketch follows the papers below.
Non-convex learning via Stochastic Gradient Langevin Dynamics: a nonasymptotic analysis by Raginsky, Rakhlin, and Telgarsky
Quantitative Central Limit Theorems for Discrete Stochastic Processes by Cheng, Bartlett, and Jordan
Underdamped Langevin MCMC: A non-asymptotic analysis by Cheng, Chatterji, Bartlett, and Jordan
Global Non-convex Optimization with Discretized Diffusions by Erdogdu, Mackey, and Shamir
Further and stronger analogy between sampling and optimization: Langevin Monte Carlo and gradient descent by Dalalyan
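To make the Langevin-based methods above concrete, here is a minimal sketch of the unadjusted Langevin algorithm on a toy double-well potential. The potential, step size, and iteration count are illustrative choices only and are not taken from any of the papers listed above.

import numpy as np

def grad_potential(x):
    # Gradient of an illustrative non-convex potential U(x) = 0.25*||x||^4 - 0.5*||x||^2
    # (a simple double-well; chosen for illustration, not from any listed paper).
    return (np.dot(x, x) - 1.0) * x

def langevin_monte_carlo(x0, step_size=1e-3, n_steps=10_000, seed=0):
    # Unadjusted Langevin algorithm: x_{k+1} = x_k - h * grad U(x_k) + sqrt(2h) * noise.
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    samples = np.empty((n_steps, x.size))
    for k in range(n_steps):
        noise = rng.standard_normal(x.size)
        x = x - step_size * grad_potential(x) + np.sqrt(2.0 * step_size) * noise
        samples[k] = x
    return samples

samples = langevin_monte_carlo(x0=[2.0, 2.0])
print(samples[-5:])  # late iterates concentrate near the low-potential region ||x|| ~ 1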
Theory of deep learning
Go to this link and look at the list there; a short illustrative code sketch follows the papers below.
A Convergence Theory for Deep Learning via Over-Parameterization by Allen-Zhu, Li, and Song
Analysis of a Two-Layer Neural Network via Displacement Convexity by Javanmard, Mondelli, and Montanari
A Mean Field View of the Landscape of Two-Layer Neural Networks by Mei, Montanari, and Nguyen
Are ResNets Provably Better than Linear Predictors? by Ohad Shamir
Entropy-SGD optimizes the prior of a PAC-Bayes bound by Dziugaite and Roy
On the Margin Theory of Feedforward Neural Networks by Wei, Lee, Liu, and Ma
Neural Tangent Kernel: Convergence and Generalization in Neural Networks by Jacot, Gabriel, and Hongler
SGD Learns Over-parameterized Networks that Provably Generalize on Linearly Separable Data by Brutzkus, Globerson, Malach, and Shalev-Shwartz
Stronger generalization bounds for deep nets via a compression approach by Arora, Ge, Neyshabur, and Zhang
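To ground the over-parameterization and NTK line of work above, here is a minimal sketch that computes the empirical neural tangent kernel of a small two-layer ReLU network at random initialization. The architecture, the 1/sqrt(m) scaling, and the sizes are illustrative assumptions rather than the exact setup of any listed paper.

import numpy as np

def init_two_layer(d, m, seed=0):
    # Two-layer ReLU net f(x) = (1/sqrt(m)) * a^T relu(W x) with standard Gaussian init
    # (an NTK-style parameterization chosen for illustration).
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((m, d))
    a = rng.standard_normal(m)
    return W, a

def param_gradient(x, W, a):
    # Gradient of f(x) with respect to (W, a), flattened into one vector.
    m = a.size
    pre = W @ x                       # pre-activations, shape (m,)
    act = np.maximum(pre, 0.0)        # relu(W x)
    mask = (pre > 0).astype(float)    # relu derivative
    grad_W = np.outer(a * mask, x) / np.sqrt(m)   # df/dW, shape (m, d)
    grad_a = act / np.sqrt(m)                     # df/da, shape (m,)
    return np.concatenate([grad_W.ravel(), grad_a])

def empirical_ntk(X, W, a):
    # Empirical NTK matrix K[i, j] = <grad_theta f(x_i), grad_theta f(x_j)>.
    grads = np.stack([param_gradient(x, W, a) for x in X])
    return grads @ grads.T

d, m, n = 5, 2048, 8
rng = np.random.default_rng(1)
X = rng.standard_normal((n, d))
W, a = init_two_layer(d, m)
K = empirical_ntk(X, W, a)
print(K.shape)  # (8, 8) Gram matrix

For large widths m, the Gram matrix computed this way concentrates at initialization around the limiting kernel studied by Jacot, Gabriel, and Hongler.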