This is a course in data analysis. Topics include simple and multiple linear regression, causation, global and case diagnostics, robust regression, logistic regression, and generalized linear models; model selection (prediction risk, the bias-variance tradeoff, risk estimation, model search, ridge regression and the lasso, stepwise regression, and possibly boosting); smoothing and nonparametric regression (linear smoothers, kernels, local regression, penalized regression, splines, wavelets, variance estimation, confidence bands, local likelihood, and additive models); and classification, including LDA, QDA, and trees. Students will practice real-world data analysis through several course projects.
This course is primarily for first-year PhD students in Statistics & Data Science. Students in other programs should check the syllabus for full prerequisite and waitlist information.
For course policies, consult the syllabus.