Here are my original hand-written notes on this topic.
Probability function
- Let $U$ (the universe) be some set
- A probability function $\Pr\colon 2^U\to [0,1]$ has the following properties:
- $\Pr\{A\} \ge 0$ for all $A\subseteq U$
- $\Pr\{U\} = 1$
- $\Pr\{A \cup B\} = \Pr\{A\} + \Pr\{B\}$ whenever $A$ and $B$ are disjoint
In general, $\Pr\{A\cup B\} = \Pr\{A\} + \Pr\{B\} - \Pr\{A\cap B\}$, because the first two terms count $\Pr\{A\cap B\}$ twice; this is called the inclusion-exclusion formula. We immediately get Boole's Inequality: \[ \Pr\{A \cup B\} \le \Pr\{A\} + \Pr\{B\} \enspace. \] Boole's Inequality is also called the union bound.
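If you want to see these identities verified mechanically, here is a small Python sketch over a uniform finite universe (the particular universe, events, and helper function are just illustrative choices):

```python
from fractions import Fraction

# A small finite universe with equally likely outcomes: one roll of a die.
U = {1, 2, 3, 4, 5, 6}

def prob(A):
    # Pr{A} for a subset A of U, assuming every outcome has probability 1/|U|.
    return Fraction(len(A & U), len(U))

A = {2, 4, 6}   # "the roll is even"
B = {4, 5, 6}   # "the roll is at least 4"

# Inclusion-exclusion: Pr{A or B} = Pr{A} + Pr{B} - Pr{A and B}.
assert prob(A | B) == prob(A) + prob(B) - prob(A & B)

# Boole's Inequality (the union bound): Pr{A or B} <= Pr{A} + Pr{B}.
assert prob(A | B) <= prob(A) + prob(B)
print(prob(A | B))   # 2/3
```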
Some terminology
- $A$ and $B$ are called events
- If $|A|=1$, then $A$ is called an elementary event
- $\Pr\{A\cup B\}$ is also called $\Pr\{\text{$A$ or $B$}\}$
- $\Pr\{A\cap B\}$ is also called $\Pr\{\text{$A$ and $B$}\}$
Examples
- Coin toss: $U=\{H, T\}$ and $\Pr\{H\} = \Pr\{T\} = 1/2$
- Die toss (6-sided): $U=\{1,2,3,4,5,6\}$ and $\Pr\{i\} = 1/6$ for each $i\in\{1,\ldots,6\}$.
- Tossing 2 coins (one silver and one gold): $U=\{TT, TH, HT, HH\}$ and $\Pr\{TT\}=\Pr\{TH\}=\Pr\{HT\}=\Pr\{HH\}=1/4$
- Rolling two dice (one blue and one red): $U=\{(i,j):i,j\in\{1,\ldots,6\}\}$ and $\Pr\{(i,j)\} = 1/36$ for all $i,j\in\{1,\ldots,6\}$
- Rolling two dice and taking their sum:
\[ U = \{2,3,4,5,6,7,8,9,10,11,12\} \]
- $\Pr\{2\} = \Pr\{(1,1)\} = 1/36$.
- $\Pr\{3\} = \Pr\{(1,2)\} + \Pr\{(2,1)\} = 2/36$
- $\Pr\{4\} = \Pr\{(1,3)\} + \Pr\{(3,1)\} + \Pr\{(2,2)\} = 3/36$
- $\Pr\{5\} = \Pr\{(1,4)\} + \Pr\{(4,1)\} + \Pr\{(2,3)\} + \Pr\{(3,2)\} = 4/36$
- $\Pr\{6\} = \Pr\{(1,5)\} + \Pr\{(5,1)\} + \Pr\{(2,4)\} + \Pr\{(4,2)\} + \Pr\{(3,3)\} = 5/36$
- $\Pr\{7\} = \Pr\{(1,6)\} + \Pr\{(6,1)\} + \cdots + \Pr\{(3,4)\} + \Pr\{(4,3)\} = 6/36 = 1/6$
- $\Pr\{8\}=5/36$
- $\Pr\{9\}=4/36$
- $\Pr\{10\}=3/36$
- $\Pr\{11\}=2/36$
- $\Pr\{12\}=1/36$
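A quick way to double-check this table is to enumerate all 36 equally likely outcomes; here is a small Python sketch that does so:

```python
from fractions import Fraction
from collections import Counter

# Tally the sum i + j over all 36 equally likely outcomes (i, j) of two dice.
counts = Counter(i + j for i in range(1, 7) for j in range(1, 7))

for s in range(2, 13):
    # Prints 1/36, 1/18 (= 2/36), 1/12 (= 3/36), ..., 1/6 (= 6/36), ..., 1/36.
    print(s, Fraction(counts[s], 36))
```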
Calculating some probabilities
- Roll a die, what is the probability of getting an even number? \[\Pr\{2,4,6\} = \Pr\{2\} + \Pr\{4\} + \Pr\{6\} = 3/6 = 1/2\] This uses Property 3 of probability functions. For countable $U$, we can always use $\Pr\{A\}=\sum_{x\in A} \Pr\{x\}$
- Pick a number, $x$, uniformly at random from $\{1,\ldots,1000\}$ (so $\Pr\{i\}=1/1000$ for all $i\in\{1,\ldots,1000\}$). What is the probability that $x$ is divisible by 2 or 3? Using the notation "$a\mid x$" to mean "$x$ is divisible by $a$", we have: \begin{align*} \Pr\{\text{$2\mid x$ or $3\mid x$}\} & = \Pr\{2\mid x\} + \Pr\{3\mid x\} - \Pr\{6\mid x\} \\ & = \frac{500}{1000}+\frac{333}{1000}-\frac{166}{1000} \\ & =\frac{667}{1000} \end{align*} Here we used the inclusion-exclusion formula.
- Roll two dice, what is the probability that their sum is even?
\begin{align*}
\Pr\{\text{$i+j$ is even}\}
& = \Pr\{\text{$i$ and $j$ are both even}\} + \Pr\{\text{$i$ and $j$ are both odd}\} \\ & = \frac{3\times 3}{36} + \frac{3\times 3}{36} \\ & = \frac{1}{2}
\end{align*} This used Property 3 again, since a number can't be both even and odd.
- Roll two dice, what is the probability that their sum is odd? \[\Pr\{2\not\mid (i+j)\} = 1-\Pr\{2\mid (i+j)\} = 1-1/2 = 1/2 \] This is a very common trick for computing the probabilities of complementary events. If we know $\Pr\{A\}$, then we know that $\Pr\{\overline{A}\} = \Pr\{U\setminus A\} = 1-\Pr\{A\}$.
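These counting arguments are also easy to confirm by brute force; here is a short Python check of the divisible-by-2-or-3 calculation and of the even-sum calculation:

```python
from fractions import Fraction

# Divisible by 2 or 3: count the qualifying x in {1, ..., 1000}.
hits = sum(1 for x in range(1, 1001) if x % 2 == 0 or x % 3 == 0)
print(Fraction(hits, 1000))   # 667/1000

# Even sum of two dice: count the qualifying pairs (i, j) among the 36 outcomes.
even = sum(1 for i in range(1, 7) for j in range(1, 7) if (i + j) % 2 == 0)
print(Fraction(even, 36))     # 1/2
```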
Expectation and Random Variables
- Let $X\colon U\to \R$ map each element of $U$ to a real number; such a function is called a random variable
- The expected value of $X$ is \[ \E[X] = \sum_{a\in U}\Pr\{a\}\times X(a) \]
- Example: Roll a die. Define $X(i)=i$ and we get
\[ \E[X] = \frac{1}{6}\times 1
+ \frac{1}{6}\times 2
+ \frac{1}{6}\times 3 + \frac{1}{6}\times 4 + \frac{1}{6}\times 5 + \frac{1}{6}\times 6 = 3.5 \] The expected value is $3.5$.
- Example: Sum of two dice. Define $X(i,j)=i+j$. Then \begin{align*} \E[X] & = 2\times\frac{1}{36} +3\times\frac{2}{36} +4\times\frac{3}{36} +5\times\frac{4}{36} +6\times\frac{5}{36} +7\times\frac{6}{36} \\ &\quad{}+8\times\frac{5}{36} +9\times\frac{4}{36} +10\times\frac{3}{36} +11\times\frac{2}{36} +12\times\frac{1}{36} \\ & = \cdots \\ & = 7 \end{align*}
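Both of these expectations can be evaluated mechanically from the definition $\E[X]=\sum_{a\in U}\Pr\{a\}\times X(a)$; a small Python sketch:

```python
from fractions import Fraction

# E[X] computed directly from the definition: sum over outcomes a of Pr{a} * X(a).

# One die, X(i) = i, Pr{i} = 1/6.
one_die = sum(Fraction(1, 6) * i for i in range(1, 7))
print(one_die)    # 7/2, i.e. 3.5

# Two dice, X(i, j) = i + j, Pr{(i, j)} = 1/36.
two_dice = sum(Fraction(1, 36) * (i + j) for i in range(1, 7) for j in range(1, 7))
print(two_dice)   # 7
```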
Linearity of Expectation
- For any random variables $X$ and $Y$, \[ \E[X+Y] = \E[X] + \E[Y] \]
- This is so useful, it has a name: linearity of expectation
- Example: Sum of two dice, $D_1$ and $D_2$, (again) \[ \E[D_1+D_2] = \E[D_1] + \E[D_2] = 3.5 + 3.5 = 7 \]
- Example: Toss $n$ coins, what is the expected number of heads?
Define
\[ X_i = \begin{cases}
1 & \text{if $i$th coin comes up heads}\\
0 & \text{otherwise}
\end{cases} \] Then \begin{align*} \E[X_1+\cdots+X_n] & = \E[X_1] + \cdots + \E[X_n] \\ & = \Pr\{\text{first coin is heads}\}\times 1 + \Pr\{\text{first coin is tails}\}\times 0 +{} \\ & \quad \cdots + \Pr\{\text{$n$th coin is heads}\}\times 1 + \Pr\{\text{$n$th coin is tails}\}\times 0 \\ & = \frac{1}{2}\times 1 + \frac{1}{2}\times 0 + \cdots + \frac{1}{2}\times 1 + \frac{1}{2}\times 0 \\ & = n/2 \end{align*} The 0/1-valued $X_i$ are called Bernoulli or indicator random variables.
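For an empirical sanity check, the following Monte Carlo sketch tosses $n$ fair coins per trial and averages the number of heads over many trials; the values of $n$ and the trial count are arbitrary choices:

```python
import random

# Average the number of heads in n fair coin tosses over many trials.
# By linearity of expectation the average should be close to n/2 (here 10/2 = 5).
n, trials = 10, 100_000
total = sum(sum(random.random() < 0.5 for _ in range(n)) for _ in range(trials))
print(total / trials)   # roughly 5.0
```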
If you're interested in a proof of linearity of expectation, here it is: \begin{align*} \E[X+Y] & = \sum_{x}\sum_y (x+y)\Pr\{\text{$X=x$ and $Y=y$}\} \\ & = \sum_{x}\sum_y x\Pr\{\text{$X=x$ and $Y=y$}\} + \sum_{x}\sum_y y\Pr\{\text{$X=x$ and $Y=y$}\} \\ & = \sum_{x}\sum_y x\Pr\{\text{$X=x$ and $Y=y$}\} + \sum_y\sum_{x} y\Pr\{\text{$X=x$ and $Y=y$}\} \\ & = \sum_{x}x\sum_y \Pr\{\text{$X=x$ and $Y=y$}\} + \sum_y y\sum_{x} \Pr\{\text{$X=x$ and $Y=y$}\} \\ & = \sum_{x} x\Pr\{\text{$X=x$}\} + \sum_y y\Pr\{\text{$Y=y$}\} \\ & = \E[X] + \E[Y] \end{align*} Not very enlightening. The key step is where the number of summation symbols goes from 4 down to 2. This is possible because, when we sum $\Pr\{\text{$X=x$ and $Y=y$}\}$ over all $y$, we get $\Pr\{X=x\}$. The Venn diagram below shows an example of this. Here the event $X=x$ is shown as a pink circle. The different possible values of $Y$ are shown as blobs and they have to cover the pink circle because they cover all of $U$.
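That collapsing step is easy to verify numerically. Taking $X$ to be the value of one die and $Y$ the value of a second die, summing the joint probabilities over all $y$ recovers the marginal $\Pr\{X=x\}$:

```python
from fractions import Fraction

# Joint distribution of two dice: Pr{X = x and Y = y} = 1/36 for every pair.
joint = {(x, y): Fraction(1, 36) for x in range(1, 7) for y in range(1, 7)}

# Summing over all y collapses the joint probability to the marginal Pr{X = x} = 1/6.
for x in range(1, 7):
    assert sum(joint[(x, y)] for y in range(1, 7)) == Fraction(1, 6)
print("marginals check out")
```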
More Indicator Variable Examples
Indicator variables, along with linearity of expectation, are an extremely efficient way of computing expected values.
Balls in Urns
Throw $n$ balls into $m$ urns in such a way that each ball is equally likely to land in any urn. (Exercise: Check, using the definition of a probability function, that this implies $\Pr\{\text{ball $i$ lands in urn 1}\}=1/m$). What is the expected number of balls in the first urn?
Define \[ I_i = \begin{cases} 1 & \text{if ball $i$ lands in urn 1} \\ 0 & \text{otherwise} \end{cases} \] Then \[ \E[I_i] = \Pr\{\text{ball $i$ lands in urn 1}\}\times 1 = 1/m \] So \[ \E[I_1+\cdots+I_n] = \E[I_1] + \cdots + \E[I_n] = n/m \] Done.
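A simulation makes a nice sanity check of the $n/m$ answer; the particular values of $n$, $m$, and the trial count below are arbitrary:

```python
import random

# Throw n balls into m urns uniformly at random, count the balls landing in urn 0,
# and average over many trials; the average should be close to n/m.
n, m, trials = 20, 7, 100_000
total = sum(sum(1 for _ in range(n) if random.randrange(m) == 0) for _ in range(trials))
print(total / trials, n / m)   # both roughly 2.857
```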
Records
Take a random permutation $x_1,\ldots,x_n$ of $1,\ldots, n$. Call $x_i$ a record if $x_i=\max\{x_1,\ldots,x_i\}$. What is the expected number of records?
Define \[ I_i = \begin{cases} 1 & \text{if $x_i=\max\{x_1,\ldots,x_i\}$} \\ 0 & \text{otherwise} \end{cases} \] The largest value among $x_1,\ldots,x_i$ is equally likely to be any of $x_1,\ldots,x_i$, so $\Pr\{x_i=\max\{x_1,\ldots,x_i\}\} = 1/i$. So, \[ \E[I_i] = 1/i \] So, the expected number of records is \[ \E[I_1+\cdots+I_n] = \E[I_1] + \cdots + \E[I_n] = 1 + 1/2 + 1/3 + \cdots + 1/n \] The sum $1+1/2+\cdots+1/n$ is called the $n$th harmonic number, $H_n$. Using the fact that $\int_1^n 1/x\,dx = \ln n$, we can easily show that $\ln n \le H_n \le \ln n+1$:
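Since $1/x$ is decreasing, each term of the sum can be sandwiched between integrals: $\frac{1}{i} \le \int_{i-1}^{i}\frac{dx}{x}$ for $i\ge 2$ and $\frac{1}{i} \ge \int_{i}^{i+1}\frac{dx}{x}$ for $i\ge 1$. Summing these gives \begin{align*} H_n & = 1 + \sum_{i=2}^n \frac{1}{i} \le 1 + \int_1^n\frac{dx}{x} = \ln n + 1 \enspace, \\ H_n & = \sum_{i=1}^n \frac{1}{i} \ge \int_1^{n+1}\frac{dx}{x} = \ln(n+1) \ge \ln n \enspace. \end{align*}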