Methods for Statistical Learning
“To fix ideas…”
Text: An Introduction to Statistical Learning by James, G., Witten, D., Hastie, T. and Tibshirani, R.
Prof: Dr. Matías Salibián-Barrera
Matías is a very earnest professor who often works through helpful examples in class. He also provides supplementary class notes that are useful to review in your own time. His slides can be a bit confusing from time to time but are generally understandable.
Difficulty
This was my first statistics course. I found the material quite challenging at first because I was unfamiliar with much of the vocabulary and notation. The course also drew on a lot of results from advanced linear algebra and probability that were new to me. However, I soon got used to the notation and realized that to succeed in the course you do not need to fully understand the mathematics operating in the background; a little understanding is enough. The first midterm was short, free response, and had a low average. The subsequent midterms and the final were largely multiple choice and were easier, the only difficulty often being the wording of some of the options. A lot of the material was also review from CPSC 340. One annoying aspect of the course was the WeBWorK assignments. They had a time limit, could be started by accident, and would sometimes give you an incorrect mark on your first attempt. The labs were pretty reasonable, though you need to work quickly and compare answers with your friends.
Key Concepts
Model Selection, especially Cross-Validation
Elastic Net Methods
Tree-based methods
Splines
Bagging and Boosting
Clustering Techniques
PCA
Hard Concepts
AIC Derivation: Can be pretty confusing as each formula makes different assumptions
Cross-Validation as Expectation: Make sure you understand the expectation notation that is used
Linear Predictor: Some models are linear predictors even though they do not look linear at all
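To make the linear predictor point a bit more concrete, here is a minimal sketch in Python with NumPy (the course itself uses R). It assumes that "linear predictor" means the fitted values are a linear function of the responses, i.e. they can be written as H times y for a matrix H that depends only on the inputs; that is my reading, not a definition quoted from the course. The data, the cubic degree, and the names `hat_matrix` and `fit` are made up for illustration.

```python
import numpy as np

def hat_matrix(x, degree):
    """Matrix H with fitted values H @ y for a least-squares polynomial fit."""
    # Columns of X are 1, x, x^2, ..., x^degree.
    X = np.vander(x, degree + 1, increasing=True)
    # X @ pinv(X) is the orthogonal projection onto the column space of X.
    return X @ np.linalg.pinv(X)

def fit(x, y, degree):
    """Fitted values from an ordinary polynomial least-squares fit."""
    return np.polyval(np.polyfit(x, y, degree), x)

# Hypothetical data: a curved signal plus noise, and a second response vector.
rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 30)
y1 = np.sin(3 * x) + rng.normal(scale=0.1, size=x.size)
y2 = rng.normal(size=x.size)

degree = 3
H = hat_matrix(x, degree)

# The fitted curve is a cubic in x, but the fit is linear in the responses:
# fitting 2*y1 + y2 gives the same result as 2*fit(y1) + fit(y2).
linear_in_y = np.allclose(fit(x, 2 * y1 + y2, degree),
                          2 * fit(x, y1, degree) + fit(x, y2, degree))

# The same fitted values come from multiplying y1 by H, which never looks at y.
matches_hat_matrix = np.allclose(H @ y1, fit(x, y1, degree))

print(linear_in_y, matches_hat_matrix)
```

The fitted curve bends, yet the map from responses to fitted values is just a matrix multiplication, which is the sense in which a model that does not look linear can still be a linear predictor under the assumption above.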
Conclusion
A handy review of CPSC 340, except in R. I would have liked to delve more into the statistics behind the methods and into how one chooses an appropriate model for a given data set.