Outline

Interpreting p-values

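library(fivethirtyeight)  # assumption: source of the bechdel data frame
movies <- bechdel[complete.cases(bechdel), ]  # drop rows with missing values
# Return on investment: international gross over budget, both in 2013 dollars
movies$return <- movies$intgross_2013 / movies$budget_2013
wilcox.test(return ~ binary, data = movies)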
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  return by binary
## W = 299610, p-value = 0.0412
## alternative hypothesis: true location shift is not equal to 0
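
The printed p-value can also be read off the test object directly, which is handy when comparing it to a threshold in code. The following is a minimal sketch on simulated data, not the Bechdel data: the data frame `sim`, its columns `group` and `value`, and the 0.05 threshold are all assumptions made for illustration.

```r
set.seed(1)  # arbitrary seed, for reproducibility only

# Simulated two-group data (hypothetical, for illustration)
sim <- data.frame(
  group = rep(c("A", "B"), each = 50),
  value = c(rexp(50, rate = 1), rexp(50, rate = 0.8))
)

res <- wilcox.test(value ~ group, data = sim)

res$p.value          # the p-value as a plain number
res$p.value < 0.05   # TRUE if the result would be called significant at 0.05
```

A p-value just under 0.05, like the 0.0412 above, clears that threshold only narrowly, so the yes/no answer on the last line hides how borderline the evidence is.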
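library(MASS)  # assumption: survey is the student survey data frame from MASS
counts <- table(survey$Smoke, survey$Sex)  # smoking status by sex
counts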
##        
##         Female Male
##   Heavy      5    6
##   Never     99   89
##   Occas      9   10
##   Regul      5   12
chisq.test(counts)
## 
##  Pearson's Chi-squared test
## 
## data:  counts
## X-squared = 3.5536, df = 3, p-value = 0.3139
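
As a self-contained check, the printed counts can be typed in by hand and the test rerun without the original data. This is a minimal sketch; the object name `counts_manual` is introduced here for illustration only.

```r
# Rebuild the 4 x 2 table from the printed counts
# (rows: smoking status, columns: sex)
counts_manual <- matrix(
  c( 5,  6,
    99, 89,
     9, 10,
     5, 12),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("Heavy", "Never", "Occas", "Regul"),
                  c("Female", "Male"))
)

res <- chisq.test(counts_manual)

res$expected    # expected counts if smoking status and sex were independent
res$statistic   # the X-squared value
res$p.value
```

Every expected count here is above 5, the usual rule of thumb for trusting the chi-squared approximation. With a p-value of about 0.31, these counts give little evidence of an association between smoking status and sex.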

P-hacking

Multiple testing

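\[ P(\text{at least one false positive out of } m \text{ tests}) = 1 - P(\text{no false positives out of } m) \]

We can do this using independence! If every null hypothesis is true and each test is run at the 0.05 level, a single test avoids a false positive with probability 0.95. For independent tests these probabilities multiply:

\[ P(\text{no false positives out of } m) = (0.95)^m \]

so the probability of at least one false positive is

\[ P(\text{at least one false positive out of } m \text{ tests}) = 1 - (0.95)^m \]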

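library(ggplot2)
library(ggthemes)  # provides theme_tufte()
m <- 1:90  # number of tests
# Family-wise error rate for m independent tests, each at level 0.05
FWER <- data.frame(m = m, FWER = 1 - 0.95^m)
ggplot(FWER, aes(x = m, y = FWER)) + geom_point() + theme_tufte()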
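
The formula can be checked by simulation. The sketch below assumes every null hypothesis is true and the tests are independent, so each p-value is drawn as a uniform random number. The names `m_sim`, `reps`, and `any_fp`, and the choice of 10 tests, are illustrative assumptions.

```r
set.seed(42)    # arbitrary seed, for reproducibility only
alpha <- 0.05
m_sim <- 10     # number of tests per simulated study
reps  <- 10000  # number of simulated studies

# Under a true null, a p-value from a continuous test is uniform on (0, 1)
pvals <- matrix(runif(reps * m_sim), nrow = reps, ncol = m_sim)

# A study has a false positive if any of its p-values falls below alpha
any_fp <- apply(pvals, 1, function(p) any(p < alpha))

simulated   <- mean(any_fp)
theoretical <- 1 - (1 - alpha)^m_sim
c(simulated = simulated, theoretical = theoretical)
```

With 10 tests the theoretical rate is about 0.40, and the simulated proportion should land close to it. Raising `m_sim` pushes both towards 1, which is the pattern the plot shows.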

Confidence intervals are better