Below is a brief discussion of the weak law of large numbers, a very standard result in probability. I like the proof because of its brevity. The statement of the theorem is as follows.

Let $X$ be a random variable and let $X_1,X_2,\ldots$ be a sequence of random variables. Define

$p_n(\epsilon) := \mathrm{P}\big(|X_n - X| < \epsilon\big) ,$

the probability of the event that $X_n$ deviates from the random variable $X$ in magnitude by less than $\epsilon$.

Let $\delta \in (0,1)$ and let $\epsilon > 0$ be given.

*Convergence in probability* means that for every such $\epsilon$ and $\delta$ there exists an $N$ such that

$p_n(\epsilon) \geq 1 - \delta \quad \text{for all } n > N .$

$X_n$ is then said to converge to $X$ *in probability*. This is denoted as

$X_n \overset{p}{\to} X .$

In words, if one wants to permit $X_n$ to deviate from $X$ by less than an $\epsilon$-margin with at least $[(1-\delta)\cdot 100]\%$ certainty, there will always exist an $N$ which achieves this *for all* $n > N$.
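As a toy illustration of the definition (my construction, not part of the theorem): take $X \sim \mathrm{Uniform}(0,1)$ and $X_n := X + U_n$ with $U_n \sim \mathrm{Uniform}(-1/\sqrt{n},\, 1/\sqrt{n})$, so that $p_n(\epsilon) \to 1$ as $n$ grows. A minimal Monte Carlo sketch:

```python
import random

random.seed(0)

def p_n(n, eps, trials=10_000):
    """Monte Carlo estimate of p_n(eps) = P(|X_n - X| < eps) for the toy
    sequence X_n := X + U_n with U_n ~ Uniform(-1/sqrt(n), 1/sqrt(n))."""
    hits = 0
    for _ in range(trials):
        x = random.random()                         # X ~ Uniform(0, 1)
        x_n = x + random.uniform(-1, 1) / n ** 0.5  # X_n deviates by at most 1/sqrt(n)
        hits += abs(x_n - x) < eps
    return hits / trials

# p_n(eps) climbs toward 1: past some N it exceeds any given 1 - delta.
for n in (1, 10, 100, 1000):
    print(f"n = {n:>4}: p_n(0.1) ~ {p_n(n, 0.1):.3f}")
```

For instance, with $\epsilon = 0.1$ any $N \geq 100$ works here, since the noise is then confined to $(-0.1, 0.1)$.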

Let $X$ be a random variable with finite variance $\sigma^2 := \mathrm{Var}\,X$, and let $X_1,X_2,\ldots$ be an infinite sequence of i.i.d. copies of $X$. Define

$\overline{X_n} := \frac{\sum_{i=1}^n X_i}{n} .$

Then,

$\overline{X_n} \overset{p}{\to} \mu := \mathrm{E}[X].$

The proof hinges on the well-known tail bound (Markov's inequality),

$\mathrm{P}\big(h(X)\geq a\big) \leq \frac{\mathrm{E}\,h(X)}{a} ,$

where $h \geq 0$ and $a > 0$.
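Markov's inequality is easy to sanity-check numerically. A minimal sketch (the Exponential(1) distribution, $h(x) = x^2$, and $a = 4$ are my choices for illustration):

```python
import random

random.seed(0)

# Draw h(X) = X^2 for X ~ Exponential(1); h is nonnegative, as required.
samples = [random.expovariate(1.0) ** 2 for _ in range(100_000)]
a = 4.0

empirical = sum(s >= a for s in samples) / len(samples)  # P(h(X) >= a)
bound = sum(samples) / len(samples) / a                  # E[h(X)] / a

print(f"P(h(X) >= a) ~ {empirical:.4f}")
print(f"E[h(X)] / a  ~ {bound:.4f}")
```

Here $\mathrm{E}[X^2] = 2$, so the bound is roughly $0.5$, while the true tail probability is $\mathrm{P}(X \geq 2) = e^{-2} \approx 0.135$: the inequality holds, though it can be quite loose.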

Apply the bound to $\overline{X_n}$ with $h(\overline{X_n}) = (\overline{X_n} - \mu)^2$ and $a = \epsilon^2$.

Then,

$\begin{aligned} \mathrm{P}\big(|\overline{X_n} - \mu| < \epsilon\big) &= \mathrm{P}\big((\overline{X_n} -\mu)^2 < \epsilon^2\big) \\ &= 1 - \mathrm{P}\big((\overline{X_n} -\mu)^2 \geq \epsilon^2\big)\\ &\geq 1 - \frac{\mathrm{E}(\overline{X_n} - \mu)^2}{\epsilon^2} \\ &= 1 - \frac{1}{n}\cdot \frac{\sigma^2}{\epsilon^2} .\end{aligned}$

Note the use of the i.i.d. assumption in the final step, where $\mathrm{E}(\overline{X_n} - \mu)^2 = \mathrm{Var}\,\overline{X_n} = \frac{\sigma^2}{n}$.
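The resulting lower bound $1 - \frac{1}{n}\cdot\frac{\sigma^2}{\epsilon^2}$ can itself be checked by simulation. A minimal sketch (Uniform(0, 1) summands, $n = 100$, and $\epsilon = 0.1$ are my choices):

```python
import random

random.seed(0)

n, eps = 100, 0.1
mu, sigma2 = 0.5, 1 / 12  # mean and variance of Uniform(0, 1)

trials = 20_000
hits = 0
for _ in range(trials):
    xbar = sum(random.random() for _ in range(n)) / n  # sample mean of n draws
    hits += abs(xbar - mu) < eps

empirical = hits / trials
lower_bound = 1 - sigma2 / (n * eps ** 2)  # 1 - (1/n) * sigma^2 / eps^2

print(f"empirical P(|Xbar - mu| < eps) ~ {empirical:.4f}")
print(f"derived lower bound            = {lower_bound:.4f}")
```

The empirical probability (close to 1 here) comfortably exceeds the derived lower bound of $1 - \tfrac{1}{12}\cdot\tfrac{1}{100 \cdot 0.01} \approx 0.917$.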

The term

$- \frac{1}{n}\cdot \frac{\sigma^2}{\epsilon^2}$

must be bounded below by $-\delta$ in order to obtain the desired inequality

$1 - \frac{1}{n}\cdot \frac{\sigma^2}{\epsilon^2} \geq 1- \delta .$

The only factor free to be altered is $n$, and

$-\frac{1}{n}\cdot \frac{\sigma^2}{\epsilon^2} \geq - \delta \iff n \geq \frac{\sigma^2}{\epsilon^2 \delta} .$

Operationally then, given $\epsilon$, $\delta$, and $\sigma^2$, the weak law of large numbers tells you how large $n$ needs to be in order for $\overline{X_n}$ to fall within $\epsilon$ of $\mu$ with probability at least $1-\delta$.
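That last inequality turns directly into a tiny sample-size calculator. A minimal sketch (the function name and example numbers are mine):

```python
import math

def wlln_sample_size(sigma2, eps, delta):
    """Smallest n with n >= sigma^2 / (eps^2 * delta), i.e. large enough that
    the sample mean falls within eps of mu with probability at least 1 - delta."""
    return math.ceil(sigma2 / (eps ** 2 * delta))

# e.g. variance 1, tolerance eps = 0.1, at least 95% certainty (delta = 0.05):
print(wlln_sample_size(1.0, 0.1, 0.05))  # -> 2000
```

Note that this bound is often conservative: it uses only the variance, so the $n$ actually needed for a specific distribution can be much smaller.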