**Methods for Statistical Learning**

*“To fix ideas…”*

**Text:** An Introduction to Statistical Learning by James, G., Witten, D., Hastie, T. and Tibshirani, R.

**Prof:** Dr. Matías Salibián-Barrera

Matías is a very earnest professor who often works through helpful examples in class. He also provides supplementary class notes that are useful to review in your own time. His slides can be a bit confusing from time to time but are generally understandable.

**Difficulty**

This was my first statistics course. I found the material quite challenging at first because I was unfamiliar with a lot of the vocabulary and notation. We also used a lot of results from advanced linear algebra and probability that were new to me. However, I soon got used to the notation and realized that to succeed in the course one does not need to fully understand the mathematics operating in the background (a little understanding is enough). The first midterm was short, free response and had a low average. The subsequent midterms and the final were largely multiple choice and easier; the only difficulty was often the wording of some of the options. A lot of the material was also review from CPSC 340. One annoying aspect of the course was the WeBWorK assignments. They had a time limit and would often start by accident, or give you an incorrect mark on your first attempt. The labs were pretty reasonable, though you need to work quickly and compare answers with your friends.

**Key Concepts**

Model Selection, especially Cross-Validation

Elastic Net Methods

Tree-based methods

Splines

Bagging and Boosting

Clustering Techniques

PCA

**Hard Concepts**

AIC Derivation: Can be pretty confusing as each formula makes different assumptions

Cross-Validation as Expectation: Make sure you understand the expectation notation being used

Linear Predictor: Some models may be linear predictors though they do not appear all that linear
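
To make the linear predictor point concrete, here is a minimal sketch of my own (not from the course materials, and in Python rather than the R used in class). It assumes the usual definition that a predictor is linear when the fitted values can be written as a matrix S times the response vector y, where S depends only on the inputs. A cubic polynomial fit is clearly not a straight line in x, yet it is linear in y. The data, the variable names and the helper `cubic_fit` are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 30)

# Cubic polynomial design matrix with columns 1, x, x^2, x^3.
X = np.vander(x, N=4, increasing=True)

# Hat (smoother) matrix S = X (X^T X)^{-1} X^T. It depends on x only.
S = X @ np.linalg.solve(X.T @ X, X.T)

def cubic_fit(y):
    """Fitted values of a least-squares cubic fit (hypothetical helper)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return X @ coef

# Two arbitrary response vectors on the same inputs.
y1 = np.sin(3 * x) + rng.normal(scale=0.1, size=x.size)
y2 = rng.normal(size=x.size)

# Linear in y: fitting the sum gives the sum of the fits.
linear_in_y = np.allclose(cubic_fit(y1 + y2), cubic_fit(y1) + cubic_fit(y2))

# The fitted values agree with S @ y.
matches_hat = np.allclose(cubic_fit(y1), S @ y1)

# The trace of S equals the number of fitted coefficients (4 here).
effective_df = np.trace(S)

print(linear_in_y, matches_hat, round(effective_df, 6))
```

The same check can be tried on other methods from the Key Concepts list. As far as I understand it, a method whose S is chosen by looking at y (a regression tree, for example) would not pass it in general.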

**Conclusion**

A handy review of CPSC 340, except in R. I would have liked to delve more into the statistics behind the methods and into how one chooses an appropriate model for a given data set.