- Statistical vs practical significance
- How large a sample is enough?

- Suppose you read this headline: Diet X is associated with lower risk of cancer
- You check out the study: the null hypothesis is no association, and the \(p\)-value is \(<0.00001\)
- Very significant result!
- But what if the risk only dropped from 2.5% to 2.47%?
The result is highly statistically significant, but not very practically significant
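To see how such a small change could still produce a tiny \(p\)-value, here is a rough sketch in R. It assumes (hypothetically, since the study's design is not given here) two equally sized groups whose observed risks are exactly 2.5% and 2.47%, and computes the two-sided \(p\)-value of an unpooled two-proportion z-test for several group sizes. The names `p_control`, `p_diet`, `z_for_n`, and `p_for_n` are made up for this illustration.

```r
# Hypothetical setup: risks taken from the headline example, equal group sizes assumed
p_control <- 0.025   # risk without the diet
p_diet    <- 0.0247  # risk with the diet

# z statistic we would see if the observed proportions equaled the true risks
z_for_n <- function(n) {
  se <- sqrt(p_control * (1 - p_control) / n + p_diet * (1 - p_diet) / n)
  (p_control - p_diet) / se
}

# Two-sided p-value from the normal approximation
p_for_n <- function(n) 2 * pnorm(-abs(z_for_n(n)))

n_per_group <- 10^(4:8)
data.frame(n_per_group,
           z = z_for_n(n_per_group),
           p_value = p_for_n(n_per_group))
```

Under these assumptions the difference is nowhere near significant with ten thousand people per group, and it takes roughly ten million per group before the \(p\)-value gets close to \(0.00001\). The effect itself (0.03 percentage points) is the same in every row; only the sample size changes.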

- Note: to keep the formulas simple, we assume the variances are known and equal to 1, so we use the normal distribution instead of the \(t\)-distribution
In practice, the ratio \(\mu/\sigma\) is what matters. We’ll come back to this at the end

- Let’s be more mathematical: consider an example for differences between groups
- Suppose the true difference is \(\mu_1 - \mu_2 = 0.01\)
- If the sample size is large enough, the test will almost certainly reject the null hypothesis
But is that really useful information? It depends on whether \(0.01\) is large enough to be of any practical importance

```
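# One million draws per group, sd = 1; the true difference in means is 0.01
group1 <- rnorm(1000000, mean = 0.01)
group2 <- rnorm(1000000, mean = 0)
# Welch two-sample t-test of equal means
t.test(group1, group2)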
```

```
##
## Welch Two Sample t-test
##
## data: group1 and group2
## t = 7.1219, df = 2e+06, p-value = 1.065e-12
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 0.007300852 0.012845106
## sample estimates:
## mean of x mean of y
## 0.011239436 0.001166458
```
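The reported t statistic is about what the setup predicts. A quick back-of-the-envelope check, using only the simulation's settings (true difference 0.01, standard deviation 1, one million draws per group); the variable names are introduced here for illustration:

```r
delta <- 0.01   # true difference in means
n     <- 1e6    # draws per group

# Standard error of the difference of two sample means when sigma = 1
se <- sqrt(1 / n + 1 / n)

expected_t    <- delta / se          # where the t statistic is centered
ci_half_width <- qnorm(0.975) * se   # half-width of a 95% interval

c(se = se, expected_t = expected_t, ci_half_width = ci_half_width)
```

The statistic is centered near 7.07, close to the observed 7.12, and the half-width of about 0.0028 matches the confidence interval in the output. With a standard error this small, a difference of 0.01 is many standard errors away from zero, which is why the \(p\)-value is tiny even though the two distributions are almost identical.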

```
library(ggplot2)
library(ggthemes)  # provides theme_tufte()

# Densities of N(0.01, 1) and N(0, 1) overlaid on [-2, 2]
range <- data.frame(x = c(-2, 2))
ggplot(range, aes(x)) +
  stat_function(fun = dnorm, args = list(mean = 0.01, sd = 1)) +
  stat_function(fun = dnorm, args = list(mean = 0, sd = 1)) +
  theme_tufte() +
  ggtitle("Two significantly different distributions?")
```

- Below we consider an even smaller effect, \(\mu = 0.001\)
- We plot the probability of rejecting the null at the 5% significance level (the power) as a function of sample size \(n\)
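The power formula in the code that follows can be derived in two lines. This is a sketch that assumes, as the `mu*sqrt(n)` term in the code suggests, a one-sample, one-sided z-test of \(H_0: \mu = 0\) against \(\mu > 0\) with \(\sigma = 1\). The symbols \(Z\), \(c\), and \(\Phi\) (the standard normal CDF) are notation introduced here.

\[
Z = \sqrt{n}\,\bar{X} \sim N\!\left(\mu\sqrt{n},\, 1\right), \qquad c = \Phi^{-1}(0.95)
\]

\[
\text{power}(n) = P(Z > c) = P\!\left(Z - \mu\sqrt{n} > c - \mu\sqrt{n}\right) = 1 - \Phi\!\left(c - \mu\sqrt{n}\right)
\]

With a general \(\sigma\) the shift becomes \(\mu\sqrt{n}/\sigma\), which is why only the ratio \(\mu/\sigma\) matters.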

```
# Critical value for a one-sided test at the 5% level
c_null <- qnorm(.95)
mu <- 0.001
# Power of the one-sided z-test (sigma = 1) as a function of sample size n
powern <- function(n) {
  1 - pnorm(c_null - mu * sqrt(n))
}
range <- data.frame(n = 10^c(1:7))
ggplot(range, aes(n)) +
  stat_function(fun = powern) + theme_tufte() +
  ylab("Power") +
  ggtitle("Power as a function of sample size, mu = 0.001")
```
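The same formula can be inverted to answer "how large a sample is enough?" for a given target power. The sketch below assumes the one-sided z-test with \(\sigma = 1\) used in the plot; the 80% target and the names `crit`, `power_at`, and `n_needed` are illustrative choices, not part of the original notes.

```r
crit <- qnorm(0.95)  # one-sided critical value at the 5% level

# Power of the one-sided z-test for true mean mu and sample size n
power_at <- function(n, mu) 1 - pnorm(crit - mu * sqrt(n))

# Smallest n reaching the target power: solve crit - mu * sqrt(n) = -qnorm(target)
n_needed <- function(mu, target = 0.8) {
  ceiling(((crit + qnorm(target)) / mu)^2)
}

n_star <- n_needed(mu = 0.001)
n_star
power_at(n_star, mu = 0.001)
```

For \(\mu = 0.001\) this gives a little over six million observations. The required \(n\) scales like \(1/\mu^2\), so halving the effect size quadruples the sample needed.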

- Hypothesis tests are still useful if you must make a decision, e.g. A/B testing, summarizing the conclusion of a scientific study, etc.
But beware: with a very large sample, almost any test can come out significant, because even a tiny true effect will be detected

Just for fun, let’s also look at the “power function” for \(n = 100\) and increasing true mean \(\mu\):

```
# sqrt(n) = 10 for n = 100; c_null is still the one-sided critical value qnorm(.95)
powermu <- function(mu) 1 - pnorm(c_null - mu * 10)
range <- data.frame(mu = seq(from = 0, to = 1, length.out = 100))
ggplot(range, aes(mu)) +
stat_function(fun = powermu) + theme_tufte() +
ylab("Power") +
ggtitle("Power as a function of true mean, n = 100")
```
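As a sanity check on the curve, the formula can be compared with a simulation at a single point. The sketch below assumes the one-sample, one-sided z-test with \(\sigma = 1\) and \(n = 100\); the true mean 0.2, the number of replications, and the seed are arbitrary choices made for illustration.

```r
set.seed(1)          # arbitrary seed, only for reproducibility
n       <- 100
mu_true <- 0.2
crit    <- qnorm(0.95)
reps    <- 10000

# Each column is one simulated sample of size n from N(mu_true, 1)
samples <- matrix(rnorm(n * reps, mean = mu_true), nrow = n)

# z statistic for each simulated sample, then the share of rejections
z <- sqrt(n) * colMeans(samples)
simulated_power <- mean(z > crit)

formula_power <- 1 - pnorm(crit - mu_true * sqrt(n))
c(simulated = simulated_power, formula = formula_power)
```

The formula gives a power of about 0.64 at \(\mu = 0.2\), and the simulated rejection rate should land close to it, up to Monte Carlo error.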

- Here’s the power function for the two-sided alternative

```
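# Two-sided test at the 5% level: reject when |Z| > qnorm(.975)
c_null <- qnorm(.975)
# Power = P(Z < -c_null) + P(Z > c_null), where Z ~ N(10 * mu, 1) for n = 100
powermu <- function(mu) pnorm(-c_null - mu * 10) + pnorm(c_null - mu * 10, lower.tail = FALSE)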
range <- data.frame(mu = seq(from = -.5, to = .5, length.out = 100))
ggplot(range, aes(mu)) +
stat_function(fun = powermu) + theme_tufte() +
ylab("Power") +
ggtitle("Power as a function of true mean (two-sided), n = 100")
```
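Two features of the two-sided curve are easy to confirm numerically: it is symmetric around \(\mu = 0\), and at \(\mu = 0\) it equals the significance level. A minimal sketch under the same assumptions (\(\sigma = 1\), \(n = 100\)); the names `crit2` and `power_two_sided` are introduced here so they do not overwrite the objects above.

```r
crit2 <- qnorm(0.975)  # two-sided critical value at the 5% level

# Two-sided power: probability that |Z| exceeds crit2 when Z ~ N(mu * sqrt(n), 1)
power_two_sided <- function(mu, n = 100) {
  pnorm(-crit2 - mu * sqrt(n)) +
    pnorm(crit2 - mu * sqrt(n), lower.tail = FALSE)
}

power_two_sided(c(-0.3, 0, 0.3))
```

The middle value is 0.05, the probability of a false rejection when the null is true, and the outer two values are equal (about 0.85).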