```python
import numpy as np
from numpy.random import default_rng
from scipy.stats import norm
import matplotlib as mpl
from matplotlib import pyplot as plt
import plotly.graph_objects as go

# Shared setup for the numerical checks below
rng = default_rng(42)
mpl.rcParams['font.size'] = 18
```
Let $X_1, \dots, X_n$ be IID with finite mean $\mu = E(X_1)$ and finite variance $\sigma^2 = V(X_1)$. Let $\bar{X}_n$ be the sample mean and let $S_n^2$ be the sample variance.
(a) Show that $E(S^2_n) = \sigma^2$.
(b) Show that $S^2_n \xrightarrow{P} \sigma ^ 2$. Hint: Show that $S^2_n = c_n n ^ {-1} \sum_{i = 1}^n X_i^2 - d_n\bar{X}_n^2$ where $c_n \rightarrow 1$ and $d_n \rightarrow 1$. Apply the law of large numbers to $n ^ {-1} \sum_{i=1}^n X_i^2$ and to $\bar{X}_n$. Then use part (e) of Theorem 5.5.
Solution:
(a) In the following, we denote $\sum_{i=1}^n$ by $\sum_i$. Recall $E[X_i^2] = \mu^2 + \sigma^2$, and $E[\bar{X}_n^2] = \mu^2 + \sigma^2/n$. Observe:
\begin{align*} E \left[ \sum_i \left( X_i - \bar{X}_n \right) ^ 2 \right] &= E \left[ \sum_i X_i^2 - 2 X_i \bar{X}_n + \bar{X}_n^2 \right] \tag{expanding summand} \\ &= \sum_i E \left[X_i^2 \right] - 2 E \left[\sum_i X_i \bar{X}_n \right] + E \left[\sum_i \bar{X}_n^2 \right] \tag{linearity of $E$} \\ &= \sum_i (\mu^2 + \sigma^2) - 2 E \left[ \bar{X}_n \sum_i X_i \right] + E \left[n \bar{X}_n^2 \right] \\ &= n(\mu^2 + \sigma^2) - 2n E\left[ \bar{X}_n^2 \right] + n E \left[\bar{X}_n^2 \right] \\ &= n(\mu^2 + \sigma^2) - n E\left[ \bar{X}_n^2 \right] \\ &= n(\mu^2 + \sigma^2) - n(\mu^2 + \sigma^2 / n) = (n-1)\sigma^2 \end{align*}Hence
$$E \left[ S_n^2 \right] = E \left[ \frac{1}{n-1} \sum_i \left( X_i - \bar{X}_n \right) ^ 2 \right] = \sigma^2$$(b)
\begin{align*} S_n^2 &= \frac{1}{n-1} \sum_i (X_i - \bar{X}_n)^2 \\ &= \frac{1}{n-1} \sum_i \left[ X_i^2 - 2X_i \bar{X}_n + \bar{X}_n^2 \right]\\ &= \frac{1}{n-1} \left[ \sum_i X_i^2 - 2 \bar{X}_n \sum_i X_i + n \bar{X}_n^2 \right] \\ &= \frac{1}{n-1} \left[ \sum_i X_i^2 - n \bar{X}_n^2 \right] \\ &= \frac{n}{n-1} \cdot \frac{1}{n} \sum_i X_i^2 - \frac{n}{n-1} \bar{X}_n^2 \\ &= c_n \frac{1}{n} \sum_i X_i^2 - d_n \bar{X}_n^2 \end{align*}where $c_n = d_n = \frac{n}{n-1} \rightarrow 1$. Note the following:
By the law of large numbers, $\frac1n \sum_i X_i^2 \xrightarrow{P} E(X_1^2) = \mu^2 + \sigma^2$ and $\bar{X}_n \xrightarrow{P} \mu$, so $\bar{X}_n^2 \xrightarrow{P} \mu^2$. The deterministic sequences $c_n$ and $d_n$ converge to 1, hence converge in distribution to a point mass at 1. Since convergence in distribution to a point mass implies convergence in probability, $c_n \xrightarrow{P} 1$ and $d_n \xrightarrow{P} 1$. Part (e) of Theorem 5.5 then gives
$$S_n^2 = c_n \frac{1}{n} \sum_i X_i^2 - d_n \bar{X}_n^2 \xrightarrow{P} (\mu^2 + \sigma^2) - \mu^2 = \sigma^2.$$
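As a quick numerical sanity check (not part of the proof), the sketch below simulates $S_n^2$ for Exponential(1) draws, whose true variance is $\sigma^2 = 1$; the distribution and sample sizes are arbitrary choices.

```python
# Sanity check: S_n^2 for Exponential(1) data (true variance 1) approaches 1.
import numpy as np

rng = np.random.default_rng(42)
for n in [10, 100, 10_000, 1_000_000]:
    x = rng.exponential(scale=1.0, size=n)
    s2 = x.var(ddof=1)  # unbiased sample variance S_n^2
    print(f"n = {n:>9,}  S_n^2 = {s2:.4f}")
```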
Let $X_1, X_2, \dots$ be a sequence of random variables. Show that $X_n \xrightarrow{qm} b$ if and only if
$$\lim_{n\to\infty} E(X_n) = b \text{ and } \lim_{n\to\infty} V(X_n) = 0.$$Solution:
$\Rightarrow$: Suppose $E\left[(X_n - b)^2\right] \to 0$. By Jensen's inequality, $\left(E(X_n) - b\right)^2 = \left(E[X_n - b]\right)^2 \le E\left[ (X_n - b)^2 \right] \to 0$, so $\lim_{n\to\infty} E(X_n) = b$.
Now,
\begin{align*} \lim_{n\to\infty} V(X_n) &= \lim_{n\to\infty}\left[ E\left[(X_n - b)^2\right] - \left(E(X_n) - b\right)^2 \right] \\ &= \lim_{n\to\infty} E\left[(X_n - b)^2\right] - \lim_{n\to\infty} \left(E(X_n) - b\right)^2 \\ &= 0 - 0 = 0 \end{align*}where both limits exist: the first is 0 by hypothesis and the second is 0 by the previous paragraph. $\Leftarrow$:
\begin{align*} \lim_{n\to\infty} E \left[ (X_n - b)^2 \right] &= \lim_{n\to\infty} \left[ E(X_n^2) - 2bE(X_n) + b^2 \right] \\ &= \lim_{n\to\infty} \left[ [E(X_n)^2 + V(X_n)] - 2bE(X_n) + b^2 \right] \\ &= (b^2 + 0) - 2b^2 + b^2 = 0 \end{align*}Let $X_1, \dots, X_n$ be IID and let $\mu = E(X_1)$. Suppose that the variance is finite. Show that $\bar{X}_n \xrightarrow{qm} \mu$.
Solution:
We have
\begin{align*} E \left[ (\bar{X}_n - \mu)^2 \right] &= E(\bar{X}_n^2) - 2 \mu E(\bar{X}_n) + \mu^2 \\ &= E(\bar{X}_n)^2 + \frac{\sigma^2}{n} - 2 \mu E(\bar{X}_n) + \mu^2 \\ &\rightarrow \mu^2 + 0 - 2\mu^2 + \mu^2 = 0 \end{align*}Let $X_1, X_2, \dots$ be a sequence of random variables such that
$$P \left( X_n = \frac1n \right) = 1 - \frac{1}{n^2} \text{ and } P(X_n = n) = \frac{1}{n^2}$$Does $X_n$ converge in probability? Does $X_n$ converge in quadratic mean?
Solution:
$X_n \xrightarrow{P} 0$. To see this, let $\epsilon > 0$. For $n$ large enough that $\frac1n \le \epsilon$, the event $|X_n - 0| > \epsilon$ can occur only when $X_n = n$, so $P(|X_n - 0| > \epsilon) \le P(X_n = n) = \frac{1}{n^2} \rightarrow 0$ as $n \rightarrow \infty$.
However, $X_n \not\xrightarrow{qm} 0$, since \begin{align*} E \left[ (X_n - 0)^2 \right] &= \frac{1}{n^2}P(X_n = \frac{1}{n}) + n^2 P(X_n = n) \\ &= \frac{1}{n^2} \cdot \left(1 - \frac{1}{n^2}\right) + n^2 \cdot \frac{1}{n^2} \\ &= \frac{1}{n^2} - \frac{1}{n^4} + 1 \\ &\to 1 \ne 0 \end{align*}
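Since the distribution of $X_n$ is an explicit two-point mass, both quantities above can be tabulated exactly; the snippet below does so for an arbitrary $\epsilon = 0.5$.

```python
# Exact computation from the two-point pmf: the tail probability vanishes
# like 1/n^2, while the second moment E[X_n^2] tends to 1.
eps = 0.5
for n in [2, 10, 100, 1000]:
    p_tail = 1 / n**2                            # P(X_n = n)
    p_exceed = p_tail if 1 / n <= eps else 1.0   # P(|X_n| > eps)
    second_moment = (1 / n**2) * (1 - p_tail) + n**2 * p_tail
    print(f"n = {n:>4}  P(|X_n| > {eps}) = {p_exceed:.6f}  E[X_n^2] = {second_moment:.6f}")
```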
Let $X_1, \dots, X_n \sim \text{Bernoulli}(p)$. Prove that
$$\frac1n \sum_{i=1}^n X_i^2 \xrightarrow{P} p \quad \text{and} \quad \frac1n \sum_{i=1}^n X_i^2 \xrightarrow{qm} p$$Solution:
Note that as convergence in probability is implied by convergence in quadratic mean, it suffices to show the latter. Note that $X_i^k = X_i$ for all $k \in \{1, 2, \dots\}$.
\begin{align*} E \left[ \left( \frac{1}{n} \sum_i X_i^2 - p \right) ^2 \right] &= E \left[ \left( \frac{1}{n} \sum_i X_i^2 \right) ^ 2 \right] - 2 E \left[ \frac1n \sum_i X_i^2 \right] p + p^2 \\ &= \frac{1}{n^2} \sum_i E[X_i^4] + \frac{1}{n^2} \sum_i \sum_{j \ne i} E[X_i^2]E[X_j^2] - 2 E \left[ \frac1n \sum_i X_i^2 \right] p + p^2 \\ &= \frac{1}{n} E[X_1] + \frac{n(n-1)}{n^2}E[X_1]E[X_2] - 2pE[X_1] + p^2\\ &= \frac{p}{n} + \frac{n(n-1)}{n^2}p^2 - 2p^2 + p^2 \rightarrow p^2 - 2p^2 + p^2 = 0 \end{align*}Suppose that the height of men has mean 68 inches and standard deviation 2.6 inches. We draw 100 men at random. Find (approximately) the probability that the average height of men in our sample will be at least 68 inches.
Solution:
\begin{align*} P(\bar{X} \ge 68) &= P\left(\frac{\sqrt{100}(\bar{X} - 68)}{2.6} \ge 0\right) \\ &\approx 1 - \Phi(0) = \frac12 \end{align*}where $\Phi$ is the standard normal CDF. Let $\lambda = \frac{1}{n}$ for $n = 1, 2, \dots$. Let $X_n \sim \text{Poisson}(\lambda)$.
(a) Show that $X_n \xrightarrow{P} 0$.
(b) Let $Y_n = nX_n$. Show that $Y_n \xrightarrow{P} 0$.
Solution:
(a) Let $\epsilon > 0$. Then
\begin{align*} P(|X_n - 0| > \epsilon) &= 1 - P(|X_n - 0| \le \epsilon)\\ &\le 1 - P(X_n = 0)\\ &= 1 - e^{-1/n} \frac{(1/n)^0}{0!} \\ &= 1 - e^{-1/n} \rightarrow 0 \text{ as } n\to\infty \end{align*}(b) Since $Y_n = nX_n = 0$ exactly when $X_n = 0$, \begin{align*} P(|Y_n - 0| > \epsilon) &= 1 - P(|Y_n - 0| \le \epsilon)\\ &\le 1 - P(Y_n = 0)\\ &= 1 - P(X_n = 0) \\ &= 1 - e^{-1/n} \rightarrow 0 \text{ as } n\to\infty \end{align*}
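Numerically, the bound used in both parts is the same quantity $1 - e^{-1/n}$; a quick tabulation shows how fast it vanishes.

```python
# The common bound from parts (a) and (b): P(|X_n| > eps), P(|Y_n| > eps) <= 1 - exp(-1/n).
import numpy as np

for n in [1, 10, 100, 10_000]:
    print(f"n = {n:>6}  1 - e^(-1/n) = {1 - np.exp(-1 / n):.6f}")
```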
Suppose we have a computer program consisting of $n = 100$ pages of code. Let $X_i$ be the number of errors on the $i^{\text{th}}$ page of code. Suppose that the $X_i$'s are Poisson with mean 1 and that they are independent. Let $Y = \sum_{i=1}^n X_i$ be the total number of errors. Use the central limit theorem to approximate $P(Y < 90)$.
Solution:
We have $n=100$, $\mu=1$, and $\sigma=1$, and
\begin{align*} P(Y < 90) &= P(\bar{X} < 90/n) \\ &= P\left(\frac{\sqrt{n}(\bar{X} - \mu)}{\sigma} < \frac{\sqrt{n}(90/n - \mu)}{\sigma}\right)\\ &= P\left(\frac{\sqrt{n}(\bar{X} - \mu)}{\sigma} < -1\right)\\ &\approx P(Z < -1) \approx 0.1587 \end{align*}Suppose that $P(X=1) = P(X = -1) = 1/2$. Define
$$X_n = \begin{cases} X & \text{with probability } 1 - \frac1n \\ e^n & \text{with probability } \frac1n. \end{cases} $$Does $X_n$ converge to $X$ in probability? Does $X_n$ converge to $X$ in distribution? Does $E(X - X_n)^2$ converge to 0?
Solution:
$X_n \xrightarrow{P} X$, since, for all $\epsilon > 0$, $P(|X_n - X| > \epsilon) \le P(X_n \ne X) = P(X_n = e^n) = \frac1n \rightarrow 0$.
Since $X_n \xrightarrow{P} X$, $X_n \leadsto X$.
However,
\begin{align*} E(X - X_n)^2 &= E[(X - e^n)^2\mathbb{1}_{X_n \ne X}]\\ &= E[(X^2 -2Xe^n + e^{2n})\mathbb{1}_{X_n \ne X}]\\ &= \frac{1 + e^{2n}}{n} \tag{$E[X^2] = 1$, $E[X] = 0$, mixing event independent of $X$} \\ &\rightarrow \infty \end{align*}so $E(X - X_n)^2$ does not converge to 0. Let $Z \sim N(0,1)$. Let $t > 0$. Show that, for any $k > 0$,
$$P(|Z| > t) \le \frac{E|Z|^k}{t^k}.$$Compare this to Mill's inequality in Chapter 4.
Solution:
Letting $\phi(z)$ denote the PDF of the standard normal distribution, we have
\begin{align*} \frac{E|Z|^k}{t^k} &= \int_{-\infty}^{\infty} \frac{|z|^k}{t^k} \phi(z) \, dz \\ &\ge \int_{|z| > t} \frac{|z|^k}{t^k} \phi(z) \, dz \tag{integrand is nonnegative} \\ &\ge \int_{|z| > t} \phi(z) \, dz \tag{$|z|^k / t^k \ge 1$ on $\{|z| > t\}$} \\ &= P(|Z| > t) \end{align*}Mill's inequality:
$$P(|Z| > t) \le \sqrt{\frac{2}{\pi}} \frac{e^{-t^2 / 2}}{t},$$whereas the bound above is
\begin{align*} \frac{E|Z|^k}{t^k} = \frac{c_k}{t^k}, \quad \text{where } c_k = E|Z|^k = \sqrt{\frac{2}{\pi}} \int_0^\infty z^k e^{-z^2 / 2} \, dz \end{align*}is a constant not depending on $t$. For any fixed $k$, the Mill bound decays like $e^{-t^2/2}/t$, which goes to 0 faster than $c_k / t^k$ as $t \to \infty$, so Mill's inequality is sharper for large $t$. Suppose that $X_n \sim N(0, \frac1n)$ and let $X$ be a random variable with distribution $F(x) = 0$ if $x < 0$ and $F(x) = 1$ if $x \ge 0$. Does $X_n$ converge to $X$ in probability? (Prove or disprove). Does $X_n$ converge to $X$ in distribution? (Prove or disprove)
Solution:
Observe $V(X_n) = \frac{1}{n}$, and that $X = 0$ almost surely. Since
\begin{align*} P(|X_n - X| \ge \epsilon) &= P(|X_n| \ge \epsilon) \tag{$X = 0$ a.s.} \\ &\le \frac{1}{n\epsilon^2} \rightarrow 0 \text{ as } n \rightarrow \infty\tag{Chebyshev's inequality} \\ \end{align*}$X_n$ converges to $X$ in probability, and hence also in distribution.
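For comparison, the exact tail $P(|X_n| \ge \epsilon) = 2(1 - \Phi(\epsilon\sqrt{n}))$ can be set against the Chebyshev bound $1/(n\epsilon^2)$ used above; a short sketch with an arbitrary $\epsilon = 0.1$:

```python
# Exact Gaussian tail vs. the Chebyshev bound used in the proof.
import numpy as np
from scipy.stats import norm

eps = 0.1
for n in [10, 100, 1000]:
    exact = 2 * (1 - norm.cdf(eps * np.sqrt(n)))  # P(|X_n| >= eps) for X_n ~ N(0, 1/n)
    chebyshev = 1 / (n * eps**2)                  # may exceed 1 for small n (still a valid bound)
    print(f"n = {n:>5}  exact = {exact:.3e}  Chebyshev = {chebyshev:.3e}")
```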
Let $X, X_1, X_2, X_3, \dots$ be random variables that are positive and integer valued. Show that $X_n \leadsto X$ if and only if
$$\lim_{n\to\infty} P(X_n = k) = P(X = k)$$for every integer $k$.
Solution:
$\Rightarrow$. We have $\lim_{n\to\infty} F_n(t) = F(t)$ at all continuity points of $F$. Observe that since $X$ is integer-valued, $F(x)$ is continuous (and constant) for all $x \in \mathbb{R} \backslash \mathbb{Z}$. Let $\epsilon \in (0,1)$. Then, for any integer $k$,
\begin{align*} \lim_{n\to\infty} P(X_n = k) &= \lim_{n\to\infty} \left[ F_n(k+\epsilon) - F_n(k - \epsilon) \right] \\ &=F(k+\epsilon) - F(k - \epsilon) \tag{$k \pm \epsilon$ are continuity points of $F$} \\ &= P(X = k) \end{align*}$\Leftarrow$: We have $\lim_{n\to\infty} P(X_n = k) = P(X = k)$. For any $t$, the sum below has finitely many terms (the $X_n$ are positive integer valued), so the limit passes through it: \begin{align*} \lim_{n\to\infty} F_n(t) &= \lim_{n\to\infty} \sum_{k \le t} P(X_n = k) \\ &= \sum_{k \le t} P(X = k) \\ &= F(t) \end{align*}
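As an illustration of this equivalence (not required by the problem), take the classical Poisson approximation: $X_n \sim \text{Binomial}(n, \lambda/n)$ converges in distribution to $X \sim \text{Poisson}(\lambda)$, and its pmf indeed converges pointwise at every integer $k$. The values of $\lambda$, $n$, and the range of $k$ below are arbitrary.

```python
# Pointwise pmf convergence for Binomial(n, lam/n) -> Poisson(lam).
from scipy.stats import binom, poisson

lam = 2.0
for n in [10, 100, 10_000]:
    diffs = [abs(binom.pmf(k, n, lam / n) - poisson.pmf(k, lam)) for k in range(10)]
    print(f"n = {n:>6}  max_k |P(X_n = k) - P(X = k)| = {max(diffs):.5f}")
```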
Let $Z_1, Z_2, \dots$ be IID random variables with density $f$. Suppose that $P(Z_i > 0) = 1$ and that $\lambda = \lim_{x\to0^+} f(x) > 0$. Let
$$X_n = n \min \{ Z_1, \dots, Z_n \}.$$Show that $X_n \leadsto Z$ where $Z$ has an exponential distribution with mean $\frac{1}{\lambda}$.
Solution:
Observe: If $x > 0$, \begin{align*} P(X_n > x) &= P \left( \bigcap_{i=1}^n \{ nZ_i > x \} \right) \\ &= P \left( Z_1 > \frac{x}{n} \right)^n \tag{IID} \\ &= \left[1 - F \left(\frac{x}{n} \right) \right]^n \\ &= \left[1 - \left(F(0) + F'(0^+) \frac{x}{n} + o(n^{-1}) \right) \right]^n \tag{definition of the derivative at $0^+$} \\ &= \left[1 - \frac{\lambda x}{n} + o(n^{-1}) \right]^n \rightarrow e^{-\lambda x}. \end{align*}
Otherwise ($x \le 0$), $P(X_n > x) = 1$. Thus, $F_n(x) \rightarrow (1 - e^{-\lambda x})\mathbb{1}_{(0,\infty)}(x)$, which is the CDF of an exponential distribution with mean $\frac{1}{\lambda}$.
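A quick simulation sketch: with $Z_i \sim \text{Uniform}(0,1)$ we have $f(0^+) = 1$, so $X_n = n\min_i Z_i$ should be approximately Exponential(1); the empirical tail $P(X_n > 1)$ is compared with $e^{-1}$. The sample sizes are arbitrary.

```python
# Simulate X_n = n * min(Z_1, ..., Z_n) for Uniform(0,1) data and compare
# its tail at 1 with the Exponential(1) value exp(-1) ~ 0.3679.
import numpy as np

rng = np.random.default_rng(0)
n, reps = 500, 10_000
x_n = n * rng.uniform(size=(reps, n)).min(axis=1)
print("empirical P(X_n > 1):", (x_n > 1).mean())
print("exp(-1)             :", np.exp(-1))
```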
Let $X_1, \dots, X_n \sim \text{Uniform}(0,1)$. Let $Y_n = \bar{X}_n^2$. Find the limiting distribution of $Y_n$.
Solution:
Recall $E[X_i] = \mu = \frac12$ and $V[X_i] = \sigma^2 = \frac{1}{12}$. By the Central Limit Theorem,
$$ \bar{X}_n \approx N \left( \mu, \frac{\sigma^2}{n} \right)$$The Delta Method says that for any differentiable $g$ such that $g'(\mu) \ne 0$,
$$ g(\bar{X}_n) \approx N \left( g(\mu), (g'(\mu))^2 \frac{\sigma^2}{n} \right) $$Setting $g(t) = t^2$, we have $g(\bar{X}_n) = Y_n$, $g(\mu) = \frac14$ and $g'(\mu) = 2\mu = 1$, hence
$$ Y_n \approx N \left( \frac14, \frac{\sigma^2}{n} \right) $$which approaches a point mass distribution at $\frac14$ as $n \to \infty$.
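A simulation sketch of the approximation: for Uniform(0,1) data, $\sqrt{n}(Y_n - \tfrac14)$ should have variance close to $(g'(\mu))^2\sigma^2 = \tfrac{1}{12} \approx 0.0833$. The sample sizes below are arbitrary.

```python
# Check the delta-method variance: Var[sqrt(n) * (Y_n - 1/4)] should be near 1/12.
import numpy as np

rng = np.random.default_rng(1)
n, reps = 500, 10_000
xbar = rng.uniform(size=(reps, n)).mean(axis=1)  # reps independent copies of X_bar_n
scaled = np.sqrt(n) * (xbar**2 - 0.25)
print("simulated variance :", scaled.var())
print("delta-method value :", 1 / 12)
```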
Let
$$\begin{pmatrix} X_{11} \\ X_{21} \\ \end{pmatrix}, \begin{pmatrix} X_{12} \\ X_{22} \\ \end{pmatrix}, \dots,\begin{pmatrix} X_{1n} \\ X_{2n} \\ \end{pmatrix} $$be IID random vectors with mean $\mu = (\mu_1, \mu_2)$ and variance $\Sigma$. Let
$$\bar{X}_1 = \frac{1}{n} \sum_{i=1}^n X_{1i}, \quad \bar{X}_2 = \frac{1}{n} \sum_{i=1}^n X_{2i}$$and define $Y_n = \bar{X}_1 / \bar{X}_2$. Find the limiting distribution of $Y_n$.
Solution:
We employ the multivariate delta method. Let $g(s_1, s_2) = s_1 / s_2$, so that $Y_n = g(\bar{X}_1, \bar{X}_2)$. We have
\begin{align*} \nabla g(s) = \begin{pmatrix} \frac{\partial g}{\partial s_1} \\ \frac{\partial g}{\partial s_2} \end{pmatrix} =\begin{pmatrix} \frac{1}{s_2} \\ -\frac{s_1}{s_2^2} \end{pmatrix} \Rightarrow \nabla_{\mu} = \nabla g(\mu) = \begin{pmatrix} \frac{1}{\mu_2} \\ -\frac{\mu_1}{\mu_2^2} \end{pmatrix} \end{align*}and we have
$$\sqrt{n}(Y_n - \frac{\mu_1}{\mu_2}) \leadsto N(0, \nabla_{\mu}^T \Sigma \nabla_{\mu})$$where
\begin{align*} \nabla_{\mu}^T \Sigma \nabla_{\mu} = \frac{\sigma_{11}}{\mu_2^2} - \frac{\sigma_{12}\mu_1}{\mu_2^3} - \frac{\sigma_{21}\mu_1}{\mu_2^3} + \frac{\sigma_{22}\mu_1^2}{\mu_2^4}. \end{align*}Construct an example where $X_n \leadsto X$ and $Y_n \leadsto Y$ but $X_n + Y_n$ does not converge in distribution to $X + Y$.
Solution:
Let $X \sim \text{Uniform}(-1,1)$, and set $X_n = Y_n = X$ for every $n$, with $Y = -X$. Then $X_n \leadsto X$ trivially, and $Y_n = X \leadsto Y$ because $X$ and $-X$ have the same (symmetric) distribution. However, $X_n + Y_n = 2X \sim \text{Uniform}(-2,2)$ for every $n$, so $X_n + Y_n$ does not converge in distribution to $X + Y = 0$.
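A numerical illustration of the counterexample (sample size arbitrary): $X_n + Y_n = 2X$ keeps the spread of a Uniform$(-2,2)$ variable for every $n$, while $X + Y$ is identically 0.

```python
# X_n + Y_n = 2X stays Uniform(-2, 2) (variance 4/3), but X + Y = 0 identically.
import numpy as np

rng = np.random.default_rng(7)
x = rng.uniform(-1, 1, size=100_000)
print("Var(X_n + Y_n) =", (2 * x).var(), "(Uniform(-2,2) has variance 4/3 ~ 1.333)")
print("Var(X + Y)     =", np.var(x + (-x)))
```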