This course introduces commonly used machine learning algorithms, such as linear and logistic regression, random forests, decision trees, neural networks, support vector machines, and boosting. It also offers a broad view of model-building and optimization techniques based on probabilistic building blocks, which will serve as a foundation for more advanced machine learning courses.
The first half of the course focuses on supervised learning. We begin with nearest neighbours, decision trees, and ensembles. Then we introduce parametric models, including linear regression, logistic and softmax regression, and neural networks. We then move on to unsupervised learning, focusing in particular on probabilistic models, but also principal components analysis and K-means. We will later consider matrix factorization, reinforcement learning, and conclude with algorithmic fairness. More details can be found in syllabus and piazza.
The final exam will be held on 4/20 at 9am EDT.
Students taking the exam will be on the same Zoom call during the exam (link to be shared on Quercus). You will submit your work using Crowdmark.
The exam will be 150 minutes long, which includes the time you need to scan and upload your work to Crowdmark. If you run into technical difficulties with Crowdmark, you may submit your solutions to sta414-2021-tas@cs.toronto.edu before the exam is officially over. Late submissions will incur a penalty of 2 points per minute late (no exceptions).
The exam covers all lectures except week 13, and it is closed book and closed internet. You may use two double-sided A4 aid sheets (optional). You are not responsible for concepts introduced only in the suggested readings. However, practicing those and solving the practice midterm questions would give you a significant advantage in the exam.
A representative practice final exam is here. Solutions will be posted.
Prof | Murat A. Erdogdu |
---|---|
Email | sta414-2021-prof@cs.toronto.edu |
Office hours | W 10-12 online |
Yuehuan He, Mufan Li, Harsh Panchal, Lu Yu
Section | Room | Lecture time |
---|---|---|
414 L0101 & 2104 L9101 | online | M 14-17 |
414 L5101 & 2104 L6101 | online | Tu 18-21 |
Zoom links for each lecture will be sent through quercus every week.
There are no required textbooks. Suggested readings will be posted after each lecture (see the lecture schedule below).
Week | Topics | Lectures | Suggested reading | Timeline |
---|---|---|---|---|
1 | Introduction to ML & Least Squares | slides | PRML 1.1-3 & preliminaries | |
2 | Probabilistic Models | slides | PRML 2, 3.1 | |
3 | Regularization and Bayesian Methods | slides | PRML 3.1, 3.3 | hw1 out |
4 | Linear Methods for Classification | slides | PRML 4.1-3 | |
5 | Optimization in ML & Decision Theory | slides | PRML 1.5, 3.2 | hw1 due & hw2 out |
6 | Reading week (no class) | | | |
7 | Neural Networks & Backpropagation | slides | notes on NNs & article | hw2 due |
8 | Midterm (in class) | midterm | | |
9 | Decision Trees, Ensembles, Support Vector Machines | slides | PRML 7.1 & 14.4 | hw3 out |
10 | Unsupervised Learning, Latent Variable Models, k-Means, EM Algorithm | slides | PRML 9 | |
11 | PCA, Autoencoders, Recommender Systems | slides | PRML 12.1,2 | hw3 due & hw4 out |
12 | Reinforcement Learning | slides | RL 3, 4.1, 4.4, 6.1-6.5 | |
13 | Algorithmic Fairness & Final Exam Review | slides | Zemel et al & Hardt et al | hw4 due |
Homework # | Out | Due | Materials | TA Office Hours |
---|---|---|---|---|
Homework 1 - V0 | Jan 25, 00:30 | Feb 08, 13:59 | data | Th 12pm & F 1pm |
Homework 2 - V1 | Feb 6, 21:00 | Feb 22, 13:59 | code | Th 2pm & F 4pm |
Homework 3 - V1 | Mar 7, 21:00 | Mar 22, 13:59 | data | Th 12pm & F 9am |
Homework 4 - V0 | Mar 21, 23:30 | Apr 5, 13:59 | code | Th 1pm & F 1pm |
For the homework assignments, we will use Python, and libraries such as NumPy, SciPy, and scikit-learn. You have two options:
The easiest option is probably to install everything yourself on your own machine.
If you don’t already have Python, install it. We recommend using Anaconda. You can also install Python directly if you know how.
Use pip to install the required packages:
pip install scipy numpy autograd matplotlib jupyter scikit-learn
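After installing, you may want to confirm that the packages import correctly. The following is a minimal sketch, not part of the official course materials: the helper name `check_packages` and the list `PACKAGES` are hypothetical and only for illustration. Note that scikit-learn is imported under the name `sklearn`.

```python
import importlib

# Import names for the packages installed with pip above.
# Assumption: scikit-learn is imported as "sklearn"; jupyter is left out
# because it is run as a command-line tool rather than imported.
PACKAGES = ["scipy", "numpy", "autograd", "matplotlib", "sklearn"]


def check_packages(names):
    """Map each import name to its version string, or None if the import fails."""
    status = {}
    for name in names:
        try:
            module = importlib.import_module(name)
            # Not every module defines __version__, so fall back to "unknown".
            status[name] = getattr(module, "__version__", "unknown")
        except ImportError:
            status[name] = None
    return status


if __name__ == "__main__":
    for name, version in check_packages(PACKAGES).items():
        label = version if version is not None else "NOT INSTALLED"
        print(f"{name}: {label}")
```

If any package is reported as not installed, rerun the pip command in the same environment (for example, the same Anaconda environment) that you use to run Python.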