#### Review

• Bayes’ theorem
• Switching between $$P(A | B)$$ and $$P(B | A)$$
• $$P(B | A) = P(A | B)P(B)/P(A)$$

• Today: probabilities related to sums, Binomial random variables, expectation(?)

##### Probabilities for sums
• Suppose we roll two 6-sided die, adding the two results, and are interested in calculating probabilities for different values of this sum
• Let’s call this value $$X$$
• Each dice is between 1 and 6, so the $$X$$ is between 2 and 12
• $$P(X = 2) = P(D_1 = 1 \text{ and } D_2 = 1)$$what’s the easy way to calculate this?
• Multiplication rule – the die are independent! $$P(D_1 = 1)P(D_2 = 2)$$
• $$P(X = 2) = (1/6)^2$$ = 1/36
• What is $$P(X = 12)$$? (same)
• How about $$P(X = 3)$$? This can happen if $$D_1 = 1$$ and $$D_2 = 2$$ or if $$D_1 = 2$$ and $$D_2 = 1$$
• Two ways it can happen, each of those has probability 1/36… what’s the easy way to calculate this?
• Addition rule – disjoint events! $$P(D_1 = 1)P(D_2 = 2) + P(D_1 = 2)P(D_2 = 1)$$
• $$P(X = 3) = (1/6)^2 + (1/6)^2$$
• In general, to find $$P(X = x)$$ we count the number of ways two sides of the die can add up to $$x$$ and multiply that number by $$(1/6)^2$$
• Which outcome(s) are the most likely ones? The middle: 7 (halfway between 2 and 12)
• e.g. for 7 (1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)
• $$P(X = 7) = 6(1/6)^2 = 1/6$$

• Suppose we toss 4 coins and add up the number of heads, call this $$X$$
• $$P(X = 4) = P(X = 0) = (1/2)^4$$
• For $$X = 2$$, could be (0,0,1,1), (0,1,0,1), (0,1,1,0), (1,0,0,1), (1,0,1,0), (1,1,0,0)
• $$P(X = 2) = (1/2)^4 + (1/2)^4 + (1/2)^4 + (1/2)^4 + (1/2)^4 + (1/2)^4 = 6(1/2)^4 = 3/8$$
• What if we toss a coin 60 times and count the number of heads?
• e.g. for 0, every one must be tails, probability $$(1/2)^{60}$$. Similar for 60
• e.g. for 1, pick 1 out of 60 tosses to come up heads–this can happen in 60 ways. So $$60(1/2)^{60}$$
• What about 30?… math to the rescue?
• Yes! There is an easy formula. Answer: over $$10^{17}$$ ways!
• About 10.2% for 30, 9.9% for 31 and 29, 8.9% for 32 and 28.
• These are really large compared to $$1/2^{60} \approx 0.$$(16 zeros)87%

• Many calculations like this are made possible by studying random outcomes that are numbers
• i.e. random variables

##### Random variables, continued
• Probability density function $$p_X(x) = P(X = x)$$
• Cumulative distribution function $$F_X(x) = P(X \leq x)$$
• Bernoulli: Ber($$p$$), $$D = \{ 0, 1 \}$$ $$P(X = 1) = p$$. Standard: $$p = 1/2$$.
• Binomial: Bin($$n, p$$) number of “successes” in $$n$$ independent “trials” with each having success probability $$p$$
• If $$p = 1/2$$, like tossing a coin $$n$$ times and counting the tails
• Otherwise need a “biased” coin–one with probability $$p \neq 1/2$$ of landing on heads

• Independent random variables $$X, Y$$ are indepedent if $$P(X = x \text{ and } Y = y) = P(X = x)P(Y = y) = p_X(x)p_Y(y)$$
• $$P(X = x \text{ and } Y = y) = p_{X,Y}(x,y)$$ is called the joint distribution of $$X$$ and $$Y$$
• If they are independent, the joint distribution factors into the product of their individual distributions $$p_{X,Y}(x,y) = p_X(x)p_Y(y)$$

• Suppose we have $$n$$ independent Ber($$p$$) r.v.s and we add them
• What is the distribution of $$X_1 + X_2 + \cdots + X_n$$?
• Each one is 0 or 1, there are $$n$$ of them, they are independent…
• The sum is just the count of how many of them equal 1
• This is Bin($$n, p$$)!

##### Expected values of random variables
• A random variable $$X$$ can potentially equal many possible values, just like how a variable in a dataset might take many different values
• Can we summarize a random variable in a similar way to how we summarize data?
• What about an average value, like the mean?
• For random variables we call this the expected value
• $$E[X] = \sum_{x \in D} x p_X(x)$$
• Weighted sum of all the possible values, with weight given by probability of that value
• e.g. for $$X \sim Ber($$p$$)$$, what is $$E[X]$$?
• $$0\cdot P(X = 0) + 1 \cdot P(X = 1) = 0(1-p) + 1(p) = p$$
##### Generating random variables in R

Uniform integers between 0 and 10

# One sample:
sample(0:10, 1)
## [1] 7
# 10 samples:
sample(0:10, 10, replace=T)
##  [1]  7  7  3  3  2  4 10  1  7  2
# Plotting a probability histogram of 1000 samples
df <- data.frame(x = sample(0:10, 1000, replace=T))
ggplot(df, aes(x)) + stat_count() + theme_tufte()

Uniform continuous numbers between 0 and 1

# One sample:
runif(1)
## [1] 0.5788398
# 10 samples:
runif(10)
##  [1] 0.2869595 0.8481076 0.2533678 0.6708428 0.5025239 0.9634737 0.1871624
##  [8] 0.9848846 0.5392730 0.1547897
# Plotting 1000 samples
df <- data.frame(x = runif(1000))
ggplot(df, aes(x)) + geom_histogram(bins = 40) + theme_tufte()

Binomial with 10 trials, success prob 1/2

# One sample:
rbinom(1, 10, .5)
## [1] 8
# 10 samples:
rbinom(10, 10, .5)
##  [1] 7 5 6 4 6 4 6 7 7 5
# Plotting 1000 samples
df <- data.frame(x = rbinom(1000, 10, .5))
ggplot(df, aes(x)) + stat_count() + theme_tufte()