This is a course in data analysis. Topics covered include simple and multiple linear regression, causation, global and case diagnostics, robust regression, logistic regression, and generalized linear models; model selection (prediction risk, bias-variance tradeoff, risk estimation, model search, ridge regression and lasso, stepwise regression, maybe boosting); smoothing and nonparametric regression (linear smoothers, kernels, local regression, penalized regression, splines, wavelets, variance estimation, confidence bands, local likelihood, additive models); and classification, including LDA, QDA, and trees. Students will practice real-world data analysis through several course projects.
This course is primarily for first-year PhD students in Statistics & Data Science. Students in other programs should check the syllabus for full prerequisite and waitlist information.
For course policies, consult the syllabus.
at the beginning of the subject line when you email me, so I can prioritize your email. Also, I prefer answering questions in office hours if possible.)