This course introduces commonly used machine learning algorithms such as linear and logistic regression, random forests, decision trees, neural networks, support vector machines, and boosting. It also offers a broad view of model-building and optimization techniques based on probabilistic building blocks, which will serve as a foundation for more advanced machine learning courses.
The first half of the course focuses on supervised learning. We begin with nearest neighbours, decision trees, and ensembles. Then we introduce parametric models, including linear regression, logistic and softmax regression, and neural networks. We then move on to unsupervised learning, focusing in particular on probabilistic models, but also covering principal components analysis and K-means. We will later consider matrix factorization and reinforcement learning, and conclude with algorithmic fairness. More details can be found in the syllabus and on Piazza.
The final exam will be held on 4/20 at 9am EST.
Students taking the exam will be on the same Zoom call during the exam (link to be shared on Quercus). You will submit your work using Crowdmark.
The exam will be 150 minutes long, which includes the time you need to scan and upload your work to Crowdmark. If you run into technical difficulties with Crowdmark, you may submit your solutions to sta4142021tas@cs.toronto.edu before the exam is officially over. Late submissions will receive a penalty of 2 points per late minute (no exceptions).
The exam covers all lectures (except week 13) and is closed book/internet. You may use two optional double-sided A4 aid sheets. You are not responsible for concepts introduced only in the suggested readings; however, practicing those and solving the practice midterm questions would give you a significant advantage on the exam.
A representative practice final exam is here. Solutions will be posted.
Prof  Murat A. Erdogdu
Email  sta4142021prof@cs.toronto.edu
Office hours  W 10–12, online
TAs  Yuehuan He, Mufan Li, Harsh Panchal, Lu Yu
Section  Room  Lecture time

414 L0101 & 2104 L9101  online  M 14–17
414 L5101 & 2104 L6101  online  Tu 18–21
Zoom links for each lecture will be sent through Quercus every week.
No required textbooks. Suggested readings will be posted after each lecture (see lectures below).
Week  Topics  Lectures  Suggested reading  Timeline

1  Introduction to ML & Least Squares  slides  PRML 1.1–1.3  preliminaries
2  Probabilistic Models  slides  PRML 2, 3.1
3  Regularization and Bayesian Methods  slides  PRML 3.1, 3.3  hw1 out
4  Linear Methods for Classification  slides  PRML 4.1–4.3
5  Optimization in ML & Decision Theory  slides  PRML 1.5, 3.2  hw1 due & hw2 out
6  Reading week (no class)
7  Neural Networks & Backpropagation  slides  notes on NNs & article  hw2 due
8  Midterm (in class)  midterm
9  Decision Trees, Ensembles, Support Vector Machines  slides  PRML 7.1 & 14.4  hw3 out
10  Unsupervised Learning, Latent Variable Models, K-means, EM Algorithm  slides  PRML 9
11  PCA, Autoencoders, Recommender Systems  slides  PRML 12.1–12.2  hw3 due & hw4 out
12  Reinforcement Learning  slides  RL 3, 4.1, 4.4, 6.1–6.5
13  Algorithmic Fairness & Final Exam Review  slides  Zemel et al & Hardt et al  hw4 due
Homework #  Out  Due  Materials  TA Office Hours 

Homework 1  V0  Jan 25, 00:30  Feb 08, 13:59  data  Th 12pm & F 1pm 
Homework 2  V1  Feb 6, 21:00  Feb 22, 13:59  code  Th 2pm & F 4pm 
Homework 3  V1  Mar 7, 21:00  Mar 22, 13:59  data  Th 12pm & F 9am 
Homework 4  V0  Mar 21, 23:30  Apr 5, 13:59  code  Th 1pm, F 1pm 
For the homework assignments, we will use Python and libraries such as NumPy, SciPy, and scikit-learn. You have two options:
The easiest option is probably to install everything yourself on your own machine.
If you don’t already have Python, install it. We recommend using Anaconda. You can also install Python directly if you know how.
Use pip to install the required packages:

pip install scipy numpy autograd matplotlib jupyter scikit-learn

(Note that the pip package is named scikit-learn, even though it is imported in Python as sklearn.)
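Once the packages are installed, a quick sanity check like the following (a minimal sketch, not part of the course materials) confirms that the environment is set up correctly before starting the homework:

```python
# Environment check: import the main course libraries and print
# their versions. Note that scikit-learn imports as `sklearn`.
import numpy
import scipy
import matplotlib
import sklearn

for mod in (numpy, scipy, matplotlib, sklearn):
    print(f"{mod.__name__} {mod.__version__}")
```

If any of these imports fails, re-run the pip command above inside the same Python environment (e.g. the Anaconda environment) that you use to launch Jupyter.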