In [1]:
import math
from functools import lru_cache

import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
from scipy.stats import chi2, norm
from scipy.optimize import fsolve

plt.style.use("default.mplstyle")

1¶

Prove Theorem 15.2

The following statements are equivalent:

  1. $Y \amalg Z$.
  2. $\psi = 1$.
  3. $\gamma = 0$.
  4. For $i,j \in \{0, 1\}$, $p_{ij} = p_{i\cdot}p_{\cdot j}.$

Where $Y$ and $Z$ are both binary random variables, $A \amalg B$ means the random variables $A$ and $B$ are independent, $\psi$ is the odds ratio:

$$\psi = \frac{p_{00}p_{11}}{p_{01}p_{10}},$$

where $p_{ij} = \mathbb{P}(Z = i, Y = j)$, $\gamma = \log(\psi)$ is the log odds ratio, and dotted subscripts denote summation over the corresponding indices.

Solution:

$(1) \Rightarrow (2)$: Suppose $Y \amalg Z$. Then $p_{ij} = \mathbb{P}(Z = i, Y=j) = \mathbb{P}(Z=i)\mathbb{P}(Y=j)$. Thus,

\begin{align*} \psi &= \frac{p_{00}p_{11}}{p_{01}p_{10}} \\ &= \frac{ \mathbb{P}(Z = 0, Y=0) \mathbb{P}(Z = 1, Y=1)}{ \mathbb{P}(Z = 0, Y=1) \mathbb{P}(Z = 1, Y=0)} \\ &= \frac{ \mathbb{P}(Z = 0) \mathbb{P}(Y=0) \mathbb{P}(Z = 1) \mathbb{P}(Y = 1)}{\mathbb{P}(Z = 0)\mathbb{P}(Y = 1)\mathbb{P}(Z = 1)\mathbb{P}(Y = 0)} = 1. \end{align*}

$(2) \Rightarrow (1)$:

\begin{align*} \psi &= \frac{\text{odds}(Y \mid Z)}{\text{odds}(Y \mid Z^c)} = 1\\ &\Rightarrow \text{odds}(Y \mid Z) = \text{odds}(Y \mid Z^c) \\ &\Rightarrow \mathbb{P}(Y \mid Z) = \mathbb{P}(Y \mid Z^c) \Rightarrow Y \amalg Z \end{align*}

Since, if $\mathbb{P}(Y \mid Z) = \mathbb{P}(Y \mid Z^c)$,

\begin{align*} \mathbb{P}(Y)\mathbb{P}(Z) &= [\mathbb{P}(Y \mid Z)\mathbb{P}(Z) + \mathbb{P}(Y \mid Z^c)\mathbb{P}(Z^c)]\mathbb{P}(Z) \\ &= \mathbb{P}(Y \mid Z)\mathbb{P}(Z) = \mathbb{P}(Y, Z) \end{align*}

$(1) \Rightarrow (4)$: Let $i,j \in \{0, 1\}$. Then,

\begin{align*} p_{i\cdot}p_{\cdot j} &= (p_{i0} + p_{i1})(p_{0j} + p_{1j}) \\ &= p_{i0}p_{0j} + p_{i0}p_{1j} + p_{i1}p_{0j} + p_{i1}p_{1j} \\ &= \mathbb{P}(Z = i) \mathbb{P}(Y = 0)\mathbb{P}(Z = 0)\mathbb{P}(Y = j) + \mathbb{P}(Z = i)\mathbb{P}(Y = 0)\mathbb{P}(Z = 1)\mathbb{P}(Y = j) \\ &+ \mathbb{P}(Z = i)\mathbb{P}(Y = 1)\mathbb{P}(Z = 0)\mathbb{P}(Y = j) + \mathbb{P}(Z = i)\mathbb{P}(Y = 1)\mathbb{P}(Z = 1)\mathbb{P}(Y = j) \\ &= \mathbb{P}(Z = i)\mathbb{P}(Y = j) \mathbb{P}(Z = 0)\mathbb{P}(Y = 0) + \mathbb{P}(Z = 0)\mathbb{P}(Y = 1) +\mathbb{P}(Z = 1)\mathbb{P}(Y = 0) + \mathbb{P}(Z = 1)\mathbb{P}(Y = 1) \\ &= p_{ij}[p_{00} + p_{01} + p_{10} + p_{11}] = p_{ij} \end{align*}

$(4) \Rightarrow (1)$:

\begin{align*} \mathbb{P}(Z = i, Y = j) &= p_{ij} \\ &= p_{i\cdot}p_{\cdot j} \tag{4} \\ &= (p_{i0} + p_{i1})(p_{0j} + p_{1j}) \\ &= p_{i0}p_{0j} + p_{i0}p_{1j} + p_{i1}p_{0j} + p_{i1}p_{1j} \\ &= [p_{i0}\mathbb{P}(Z = 0 \mid Y = j) + p_{i0}\mathbb{P}(Z = 1 \mid Y = j) \\ &+ p_{i1}\mathbb{P}(Z = 0 \mid Y = j) + p_{i1}\mathbb{P}(Z = 1 \mid Y = j)] \mathbb{P}(Y = j) \\ &= [p_{i0} + p_{i1}]\mathbb{P}(Y = j) \\ &= \mathbb{P}(Z = i)\mathbb{P}(Y = j) \end{align*}

$(2) \iff (3)$: Follows from $\log(1) = 0$.

2¶

Prove Theorem 15.3:

Consider testing

$$H_0: Y \amalg Z \text{ versus } H_1 : Y \not\amalg$$

The likelihood ratio test statistic for (15.3) is

$$T = 2 \sum_{i=0}^1 \sum_{j=1}^1 X_{ij} \log \left( \frac{X_{ij}X_{\cdot\cdot}}{X_{i\cdot}X_{\cdot j}}\right).$$

Under $H_0 : T \leadsto \chi_1^2$. Thus, an approximate level $\alpha$ test is obtained by rejecting $H_0$ when $T > \chi_{1, \alpha}^2$.

Solution:

Recall the likelihood ratio statistic. We consider testing

$$H_0: \theta \in \Theta_0 \text{ versus } H_1: \theta \notin \Theta_0$$

The likelihood ratio statistic is

$$T = 2 \log \left( \frac{\mathcal{L}(\hat{\theta})}{\mathcal{L}(\hat{\theta}_0)} \right) = 2 \left[\mathcal{l}(\hat{\theta}) - \mathcal{l}(\hat{\theta}_0) \right]$$

where $\hat{\theta}$ is the MLE, and $\hat{\theta}_0$ is the MLE restricted to $\Theta_0$.

In our case, the MLE is $\hat{p} = X / n$, and $\Theta_0$ is the set of $p$ such that $p_{i\cdot}p_{\cdot j} = p_{ij}$. By the equivariance of the MLE, $\hat{\theta}_0 = \hat{p}_{i \cdot} \hat{p}_{\cdot j} = \frac{X_{i \cdot}}{n} \frac{X_{\cdot j}}{n}$.

We have

$\mathcal{L}(p) \propto \prod_{i=0}^1 \prod_{j=0}^1 p_{ij}^{X_{ij}},$

and so

$\mathcal{l}(p) = \sum_{i=0}^1 \sum_{j=0}^1 X_{ij} \log(p_{ij})$

therefore,

\begin{align*} T &= 2 \left[\sum_{i=0}^1 \sum_{j=0}^1 X_{ij} \log \left(\frac{X_{ij}}{n}\right) - \sum_{i=0}^1 \sum_{j=0}^1 X_{ij} \log(\frac{X_{i \cdot}}{n} \frac{X_{\cdot j}}{n}) \right] \\ &=2 \left[\sum_{i=0}^1 \sum_{j=0}^1 X_{ij} \log \left( \frac{X_{ij}X_{\cdot \cdot}}{X_{i \cdot}X_{\cdot j}}\right) \right]. \end{align*}

Observe that there are three degrees of freedom under $H_1$ (after choosing $p_{00}$, $p_{01}$, and $p_{10}$, $p_{11}$ is uniquely determined), but only two degrees of freedom under $H_0$ (after choosing $p_{0 \cdot}$ and $p_{\cdot 0}$, all $p_{ij}$ are uniquely determined). The result follows from Theorem 10.22.

3¶

Prove Theorem 15.6.

The MLE's of $\psi$ and $\gamma$ are

$$\hat{\psi} = \frac{X_{00}X_{11}}{X_{01}X_{10}}, \quad \hat{\gamma} = \log \hat{\psi}.$$

The asymptotic standard errors (computed using the delta method) are

$$\hat{\text{se}}(\hat{\gamma}) = \sqrt{\frac{1}{X_{00}} + \frac{1}{X_{01}} + \frac{1}{X_{10}} + \frac{1}{X_{11}}}$$$$\hat{\text{se}}(\hat{\psi}) = \hat{\psi}\hat{\text{se}}(\hat{\gamma})$$

Solution:

That $\hat{\psi}$ and $\hat{\gamma}$ are the MLEs of $\psi$ and $\gamma$ follow from equivariance of the MLE and the fact that the MLE of $p$ is $X / n$.

We now use the delta method to derive the standard errors. We first compute the information:

\begin{align*} \mathcal{L}(p) &\propto \prod_{i=0}^1\prod_{j=0}^1 p_{ij}^{X_{ij}} \\ \Rightarrow \mathcal{l}(p) &= \sum_{i=0}^1\sum_{j=0}^1 X_{ij} \log(p_{ij}) \\ \Rightarrow \nabla \mathcal{l}(p) &= \begin{bmatrix} X_{00} / p_{00} \\ X_{01} / p_{01} \\ X_{10} / p_{10} \\ X_{11} / p_{11} \\ \end{bmatrix} \\ \Rightarrow \nabla^2 \mathcal{l}(p) &= - \begin{bmatrix} X_{00} / p_{00}^2 & 0 & 0 & 0\\ 0 & X_{01} / p_{01}^2 & 0 & 0 \\ 0 & 0 & X_{10} / p_{10}^2 & 0 \\ 0 & 0 & 0 & X_{11} / p_{11}^2 \\ \end{bmatrix} \\ \Rightarrow I_n(p) &= - \mathbb{E}(D^2 \mathcal{l}(p)) = n \begin{bmatrix} 1 / p_{00} & 0 & 0 & 0\\ 0 & 1 / p_{01} & 0 & 0 \\ 0 & 0 & 1 / p_{10} & 0 \\ 0 & 0 & 0 & 1 / p_{11} \\ \end{bmatrix} \\ \end{align*}

The inverse of the information is:

$$ J_n(p) = \frac{1}{n} \begin{bmatrix} p_{00} & 0 & 0 & 0\\ 0 & p_{01} & 0 & 0 \\ 0 & 0 & p_{10} & 0 \\ 0 & 0 & 0 & p_{11} \\ \end{bmatrix} $$

whence

$$ \hat{J}_n = J_n(\hat{p}) = \frac{1}{n^2} \begin{bmatrix} X_{00} & 0 & 0 & 0\\ 0 & X_{01} & 0 & 0 \\ 0 & 0 & X_{10} & 0 \\ 0 & 0 & 0 & X_{11} \\ \end{bmatrix}. $$

Let $g(p) = \gamma = \log\left(\frac{p_{00}p_{11}}{p_{01}p_{10}}\right) = \log p_{00} + \log p_{11} - \log p_{01} - \log p_{10}$. Then

\begin{align*} \nabla g(p) = \begin{bmatrix} 1 / p_{00} \\ -1 / p_{01} \\ -1 / p_{10} \\ 1 / p_{11} \end{bmatrix} \end{align*}

whence

\begin{align*} \hat{\nabla}g = \nabla g(\hat{p}) = n \begin{bmatrix} 1 / X_{00} \\ -1 / X_{01} \\ -1 / X_{10} \\ 1 / X_{11} \end{bmatrix} \end{align*}

Then

\begin{align*} \hat{\text{se}}(\hat{\gamma}) &= \sqrt{(\hat{\nabla}g)^T \hat{J}_n (\hat{\nabla}g)} \\ \Rightarrow \hat{\text{se}}(\hat{\gamma}) &= \sqrt{ \frac{1}{X_{00}} + \frac{1}{X_{01}} + \frac{1}{X_{10}} + \frac{1}{X_{11}}}. \end{align*}

Meanwhile, if $g(p) = \psi = \frac{p_{00}p_{11}}{p_{01}p_{10}}$, then

\begin{align*} \nabla g(p) = \begin{bmatrix} \frac{p_{11}}{p_{01}p_{10}} \\ -\frac{p_{00}p_{11}}{p_{01}^2p_{10}} \\ -\frac{p_{00}p_{11}}{p_{01}p_{10}^2} \\ \frac{p_{00}}{p_{01}p_{10}} \\ \end{bmatrix} \Rightarrow \hat{\nabla}g = n \begin{bmatrix} \frac{X_{11}}{X_{01}X_{10}} \\ -\frac{X_{00}X_{11}}{X_{01}^2X_{10}} \\ -\frac{X_{00}X_{11}}{X_{01}X_{10}^2} \\ \frac{X_{00}}{X_{01}X_{10}} \\ \end{bmatrix} \end{align*}

Therefore,

\begin{align*} (\hat{\nabla}g)^T \hat{J}_n \hat{\nabla}g = \hat{\psi}^2 \left(\frac{1}{X_{00}} + \frac{1}{X_{01}} + \frac{1}{X_{10}} + \frac{1}{X_{11}}\right), \end{align*}

and thus

$$\hat{\text{se}}(\hat{\psi}) = \hat{\psi}\hat{\text{se}}(\hat{\gamma}).$$

4¶

The New York Times (January 8, 2003, page A12) reported the following data on death sentencing and race, from a study in Maryland:

Death Sentence No Death Sentence
Black Victim 14 641
White Victim 62 594

Analyze the data using the tools from this chapter. Interpret the results. Explain why, based on on this information, you can't make causal conclusions. (The authors of the study did use much more information in their full report.)

In [2]:
data = {"Black Victim": {"Death Sentence": 14, "No Death Sentence": 641}, 
        "White Victim": {"Death Sentence": 62, "No Death Sentence": 594}}
data = pd.DataFrame(data).transpose()
X = data.values

Each of the following tests for association:

  • The likelihood ratio test
  • Pearson's $\chi^2$ test, and
  • The Wald test for $\gamma = 0$

conclusively rejects the null hypothesis of independence, reporting a $p$-values very near zero.

The likelihood ratio test:

In [3]:
T = 0
for i in [0, 1]:
    for j in [0, 1]:
        T += X[i,j] * math.log((X[i, j] * sum(sum(X))) / (sum(X[i,:]) * sum(X[:, j])))
T *= 2

z = chi2.ppf(1 - 0.05, df=1)
print(f"95% quantile for chi-squared distribution with df=1: {z:3f}")
print(f"Likelihood ratio test statistic: {T:.3f}")
print(f"p-value: {1 - chi2.cdf(T, df=1):.12f}")
95% quantile for chi-squared distribution with df=1: 3.841459
Likelihood ratio test statistic: 34.534
p-value: 0.000000004190

Pearon's $\chi^2$ test

In [4]:
n = sum(sum(X))
U = 0

for i in [0, 1]:
    for j in [0, 1]:
        E = sum(X[i, :]) * sum(X[:, j]) / n
        U += ((X[i,j] - E) ** 2) / E

print(f"Pearson's chi^2 test statistic: {U:.3f}")
print(f"p-value: {1 - chi2.cdf(U, df=1):.12f}")
Pearson's chi^2 test statistic: 32.104
p-value: 0.000000014616

Wald test for $\gamma = 0$:

In [5]:
psi_hat = (X[0, 0] * X[1, 1]) / (X[0, 1] * X[1, 0])
gamma_hat = math.log(psi_hat)
print(f"psi_hat: {psi_hat:.3f}")
print(f"gamma_hat: {gamma_hat:.3f}")

# estimated standard error of gamma_hat

se_gamma_hat = math.sqrt((1 / X[0, 0]) + (1 / X[0, 1]) + (1 / X[1, 0]) + (1 / X[1, 1]))
print(f"se_gamma_hat: {se_gamma_hat:.3f}")
W = gamma_hat / se_gamma_hat
p_value = 1- norm.cdf(np.abs(W))
print(f"p-value: {p_value:.9f}")

z = norm.ppf(1 - 0.05 / 2)

print(f"95% C.I. for gamma: ({gamma_hat - z * se_gamma_hat:.3f}, {gamma_hat + z * se_gamma_hat:.3f})")
print(f"95% C.I. for psi: ({np.exp(gamma_hat - z * se_gamma_hat):.3f}, {np.exp(gamma_hat + z * se_gamma_hat):.3f})")
psi_hat: 0.209
gamma_hat: -1.564
se_gamma_hat: 0.301
p-value: 0.000000105
95% C.I. for gamma: (-2.155, -0.974)
95% C.I. for psi: (0.116, 0.378)

We cannot make causal conclusions from these results because these tests only demonstrate evidence of association, and association does not imply causation. For example, it make be the case that there is a confounding variable in the study leading to the results seen.

5¶

Analyze the data on the variables Age and Financial Status from:

http://lib.stat.cmu.edu/DASL/Datafiles/montanadat.html

Solution:

The data come from a 1992 poll of ~200 Montana residents regarding their financial status compared to a year prior. The age of the respondents are also included.

The information is encoded in the following way (from the file):

AGE = 1 under 35, 2 35-54, 3 55 and over
FIN = Financial status 1 worse, 2 same, 3 better than a year ago
In [6]:
!curl -s --insecure https://www.stat.cmu.edu/~larry/all-of-statistics/=data/montana.dat --output data/montana.dat
data = pd.read_csv(
    "data/montana.dat",
    skiprows=33,
    sep="\s+",
    names="AGE SEX INC POL AREA FIN STAT".split(),
    usecols=["AGE", "FIN"],
).query("AGE != '*' & FIN != '*'")
data.head()
Out[6]:
AGE FIN
0 3 2
1 2 3
2 1 2
3 3 1
4 3 2

Both of our tests reject the null hypothesis of independence:

In [7]:
X = data.groupby(["AGE", "FIN"]).size().values.reshape((3,3))
I, J = X.shape

# computing the likelihood ratio test statistic

T = 0
for i in range(I):
    for j in range(J):
        ratio = (X[i, j] * np.sum(X)) / (np.sum(X[i, :]) * np.sum(X[:, j]))
        T += X[i, j] * math.log(ratio)
T *= 2
print(f"T: {T:.3f}")
nu = (I - 1) * (J - 1)
print(f"p-value: {1 - chi2.cdf(T, df=nu):.6f}")
T: 22.064
p-value: 0.000195
In [8]:
# Computing Pearson's chi^2 test statistic

U = 0
for i in range(I):
    for j in range(J):
        E = np.sum(X[i, :]) * np.sum(X[:, j]) / n
        U += ((X[i,j] - E) ** 2) / E

print(f"U: {U:.3f}")
nu = (3 - 1) * (3 - 1)
print(f"p-value: {1 - chi2.cdf(U, df=nu):.6f}")
U: 1060.653
p-value: 0.000000

We can conclude there is an association between a respondent's age and their assessment of their financial status compared to a year prior.

6¶

Estimate the correlation between temperature and latitude using the data from

http://lib.stat.cmu.edu/DASL/Datafiles/USTemperatures.html

Use the correlation coefficient. Provide estimates, tests, and confidence intervals.

In [9]:
!curl -s --insecure https://www.stat.cmu.edu/~larry/all-of-statistics/=data/temp.dat --output data/temp.dat
data = pd.read_csv("data/temp.dat", 
                   skiprows=31,
                   sep="\t")
data.head()
Out[9]:
City JanTemp Lat Long
0 Mobile, AL 44 31.2 88.5
1 Montgomery, AL 38 32.9 86.8
2 Phoenix, AZ 35 33.6 112.5
3 Little Rock, AR 31 35.4 92.8
4 Los Angeles, CA 47 34.3 118.7
In [10]:
# computing the sample correlation using the nonparametric plug-in estimator
Y = data["JanTemp"].values
Z = data["Lat"].values 
X = np.stack([Y, Z], axis=1)
Y_hat = np.mean(Y)
Z_hat = np.mean(Z)
n = len(Y)
s_Y = np.sqrt(np.power(np.linalg.norm(Y - Y_hat), 2) / (n - 1))
s_Z = np.sqrt(np.power(np.linalg.norm(Z - Z_hat), 2) / (n - 1))
rho_hat = np.sum((Y - Y_hat) * (Z - Z_hat)) / ((n-1) * s_Y * s_Z)
print(f"Sample correlation: {rho_hat:.3f}")
Sample correlation: -0.848
In [11]:
def get_rho_hat(X):
    mu_hat = np.mean(X, axis=0)
    s_1 = np.sqrt(np.power(np.linalg.norm(X[:, 0] - mu_hat[0]), 2) / (n - 1))
    s_2 = np.sqrt(np.power(np.linalg.norm(X[:, 1] - mu_hat[1]), 2) / (n - 1))
    sample_cov = np.sum((X[:, 0] - mu_hat[0]) * (X[:, 1] - mu_hat[1]))
    return sample_cov / ((n - 1) * s_1 * s_2)

def get_bootstrap_ci(X):
    rho_hat = get_rho_hat(X)
    B = 1000
    T = np.empty(B)
    for i in range(B):
        X_star = X[np.random.choice(X.shape[0], size=n, replace=True)]
        T[i] = get_rho_hat(X_star)
    se_boot = np.sqrt(np.var(T))
    lb = rho_hat - z * se_boot
    ub = rho_hat + z * se_boot
    return se_boot, (lb, ub)

z = norm.ppf(1 - 0.05 / 2)
se_boot, ci_boot = get_bootstrap_ci(X)
print(f"Bootstrap 95% CI: ({ci_boot[0]:.3f}, {ci_boot[1]:.3f})")

def get_fishers_method_ci(X):
    rho_hat = get_rho_hat(X)
    theta_hat = (1 / 2) * (math.log(1 + rho_hat) - math.log(1 - rho_hat))
    se_theta_hat = 1 / math.sqrt(n - 3)
    a = theta_hat - z * se_theta_hat
    b = theta_hat + z * se_theta_hat
    lb = (math.exp(2 * a) - 1) / (math.exp(2 * a) + 1)
    ub = (math.exp(2 * b) - 1) / (math.exp(2 * b) + 1)
    return (lb, ub)

ci_fisher = get_fishers_method_ci(X)
print(f"Fisher's Method 95% CI: ({ci_fisher[0]:.3f}, {ci_fisher[1]:.3f})")
Bootstrap 95% CI: (-0.965, -0.732)
Fisher's Method 95% CI: (-0.908, -0.753)
In [12]:
# Wald test using the standard error from the bootstrap
W = rho_hat / se_boot
print("p-value: ", norm.cdf(-2 * abs(W)))
p-value:  2.2416470099624988e-179

We can conclude that the two variables are correlated, and, even without assuming normality of the variables, we can therefore also conclude they are dependent.

7¶

Test whether calcium intake and drop in blood pressure are associated. Use the data in

http://lib.stat.cmu.edu/DASL/Datafiles/Calcium.html

In [13]:
!curl -s --insecure https://www.stat.cmu.edu/~larry/all-of-statistics/=data/calcium.dat --output data/calcium.dat
data = pd.read_csv("data/calcium.dat",
                   skiprows=32,
                   sep="\t",
                   usecols=["Treatment", "Decrease"])
data
Out[13]:
Treatment Decrease
0 Calcium 7
1 Calcium -4
2 Calcium 18
3 Calcium 17
4 Calcium -3
5 Calcium -5
6 Calcium 1
7 Calcium 10
8 Calcium 11
9 Calcium -2
10 Placebo -1
11 Placebo 12
12 Placebo -1
13 Placebo -3
14 Placebo 3
15 Placebo -5
16 Placebo 5
17 Placebo 2
18 Placebo -11
19 Placebo -1
20 Placebo -3
In [14]:
Y = data['Treatment'].map({'Placebo': 1, 'Calcium': 2}).values
Z = data['Decrease'].values

n_1 = sum(Y == 1)
n_2 = sum(Y == 2)

def F_1(z):
    return (1 / n_1) * np.sum((Z < z) * (Y == 1))

def F_2(z):
    return (1 / n_2) * np.sum((Z < z) * (Y == 2))

zs = np.arange(-10, 20, .1)
plt.plot(zs, [F_1(z) for z in zs], label='F_1(z)')
plt.plot(zs, [F_2(z) for z in zs], label='F_2(z)')
plt.legend()
plt.title("CDFs")
plt.show()
In [15]:
D = max([abs(F_1(z) - F_2(z)) for z in zs])
test_statistic = math.sqrt((n_1 * n_2) / (n_1 + n_2)) * D
print(f"D: {D:.3f}")
print(f"Test statistic: {test_statistic:.3f}")


# approximating H(t) to a specified tolerance
def H(t, xtol=1e-8):
    j_max = int(np.ceil(np.sqrt(- np.log(xtol / 2) / (2 * t * t) )))
    jj = np.arange(1, j_max+1)
    return 1 + 2 * sum((-1) ** (jj) * np.exp(-2 * (jj ** 2) * t * t))

# computing H^{-1}(x)
@lru_cache(maxsize=None)
def H_inv(x):
    return fsolve(lambda t: H(t) - x, x0=1)[0]
D: 0.409
Test statistic: 0.936
In [16]:
T = np.logspace(-3, 0.25, 100)
plt.plot(T, list(map(H, T)))
plt.xlabel("t")
plt.ylabel("H(t)")
Out[16]:
Text(0, 0.5, 'H(t)')
In [17]:
X = np.linspace(0.01, 0.99, 99)
plt.plot(X, list(map(H_inv, X)))
plt.xlabel("$1-a$")
plt.ylabel("$H^{-1}(1-a)$")
plt.hlines(y=test_statistic, xmin=min(X), xmax=max(X), color='orange', label="Test statistic")
plt.title("$H^{-1}(1-a)$")
plt.legend()
plt.show()

Because the test statistic does not exceed the $H^{-1}(0.95)$, the $0.05$-sized test does not reject $H_0$.