Introductory statistics


Stat-UB.103 Regression and forecasting

Course description

This course examines modern statistical methods as a basis for decision making in the face of uncertainty. Topics include probability theory, discrete and continuous distributions, hypothesis testing, estimation, and statistical quality control. These methods are applied to data analysis with the aid of computers. The course also introduces statistical models and their application to decision making, covering the simple linear regression model, inference in regression analysis, sensitivity analysis, and multiple regression analysis.

Syllabus

Full syllabus here: pdf

Assignments

  • Reading: OIS Chapter 1, up to and including Section 1.5, but you may skip 1.4.2.
  • Reading: HBR article 1 on A/B testing.
  • Reading: HBR article 2 beginning with the section on low quality data.
  • RStudio introduction video (skip to 1:45, or 5:45)
  • Homework 1 (solution)
  • Reading: OIS Chapter 2, sections 1 and 2.
  • Homework 2 (solution)
  • Reading: OIS Chapter 2, Section 4, up to 2.4.3, and 2.4.4 by Friday.

  • Practice midterm (solution)

  • Reading: econ blog on correlation, causation, and confounding

  • Reading: Wikipedia on sampling bias, the sections Types (especially self-selection and survivorship) and Problems

  • Reading: HBR on some biases common in big data (video)

Lecture notes

Reading/textbook references

These references are generally good, and some parts of them closely match the material we are covering.

Specific chapter or section references for various topics are as follows.

  • Controlled and observational studies: LSR 2, especially 2.5 onward; OIS 1.1-1.5; FPP 1-2.
  • Summaries and plots: LSR 5; OIS 1.6; MD 3; FPP 3-4,7.
  • Probability: LSR 9; OIS 2, 3.4; IDS 26, 28.1-4,6; FPP 13-15, 17.
  • Estimation: LSR 10; IDS 32-33; OIS 4.1-2; FPP 21, 23-24.
  • Intervals and hypothesis tests: MD Appendix B; LSR 10.5, 11, 13; OIS 4.2-5, 5.1-4, 6.1-2,4-6; IDS 34, 38; FPP 26-29.
  • Covariance and simple regression: MD 6; LSR 5.7, 15.1-2,4,6,8,9; OIS 7; FPP 8-12.
  • Multiple regression: MD 7; LSR 15.3,10; OIS 8.1-3.

Getting started with R