🏛️ ISI Advanced Examination Practice

STB (Statistics B) 2025 — Model Solutions

Subject Level

Rigorous mathematical proofs and derivations for the Indian Statistical Institute Entrance Examination.

📌 Q1 Joint Markov Chains and Irreducibility (15 Marks)

Problem Statement: Let $\{X_n\}$ and $\{Y_n\}$ be two independent Markov chains on finite state spaces $S$ and $T$ with transition matrices $\mathbf{P} = ((p_{ij}))$ and $\mathbf{Q} = ((q_{ij}))$. Define $Z_n = (X_n, Y_n)$.
(a) Show that $\{Z_n\}$ is a Markov chain on $S \times T$ and write its transition matrix.
(b) If $\{X_n\}$ and $\{Y_n\}$ are irreducible, is $\{Z_n\}$ always irreducible? Justify.

🧠 Approach & Key Concepts

This tests the structural properties of Product Markov Chains.

✍️ Step-by-Step Proof / Derivation

Step 1: Proving the Markov Property for Part (a)

To show that $\{Z_n\}$ is a Markov chain, we must verify that its future state depends only on its present state, not the past. Let $z_k = (x_k, y_k)$ denote a generic state in $S \times T$. We evaluate the conditional probability:

$$ P(Z_{n+1} = z_{n+1} \mid Z_n = z_n, Z_{n-1} = z_{n-1}, \dots, Z_0 = z_0) $$

Substitute the definitions of $Z$ in terms of $X$ and $Y$:

$$ = P(X_{n+1}=x_{n+1}, Y_{n+1}=y_{n+1} \mid X_n=x_n, Y_n=y_n, \dots, X_0=x_0, Y_0=y_0) $$

Because the chains $\{X_n\}$ and $\{Y_n\}$ are independent, the joint probability factors into the product of the individual probabilities:

$$ = P(X_{n+1}=x_{n+1} \mid X_n=x_n, \dots) \times P(Y_{n+1}=y_{n+1} \mid Y_n=y_n, \dots) $$

Since both $X$ and $Y$ are individually Markov chains, they depend only on their immediate previous states:

$$ = P(X_{n+1}=x_{n+1} \mid X_n=x_n) \times P(Y_{n+1}=y_{n+1} \mid Y_n=y_n) = p_{x_n, x_{n+1}} q_{y_n, y_{n+1}} $$

Because this result depends solely on $(x_n, y_n) = Z_n$, the process $\{Z_n\}$ is a Markov chain. Its transition probability matrix $\mathbf{R}$ is the Kronecker product of $\mathbf{P}$ and $\mathbf{Q}$ ($\mathbf{R} = \mathbf{P} \otimes \mathbf{Q}$), where $R_{(i,k),(j,l)} = p_{ij} q_{kl}$.
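As an illustrative sanity check (not part of the proof), the Kronecker-product structure of $\mathbf{R}$ can be verified numerically with NumPy; the matrices `P` and `Q` below are arbitrary 2-state examples, not taken from the problem:

```python
import numpy as np

# Arbitrary illustrative 2-state transition matrices (not from the problem).
P = np.array([[0.3, 0.7],
              [0.6, 0.4]])
Q = np.array([[0.5, 0.5],
              [0.2, 0.8]])

# Joint transition matrix: R[(i,k),(j,l)] = p_ij * q_kl,
# with state (i,k) mapped to row index 2*i + k.
R = np.kron(P, Q)

# Every row of a stochastic matrix sums to 1.
row_sums = R.sum(axis=1)

# Spot-check: transition (i=0,k=1) -> (j=1,l=0) is entry R[1, 2]
# and should equal P[0,1] * Q[1,0] = 0.7 * 0.2 = 0.14.
entry = R[1, 2]
```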


Step 2: Disproving Joint Irreducibility for Part (b)

The claim is false. We construct a simple counterexample based on periodicity. Let $S = \{1, 2\}$ and $T = \{1, 2\}$. Assume both chains are deterministic oscillators with the transition matrix:

$$ \mathbf{P} = \mathbf{Q} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} $$

Both chains are clearly irreducible because you can reach state 2 from state 1, and state 1 from state 2. Now consider the joint chain $Z_n$ on the state space $\{(1,1), (1,2), (2,1), (2,2)\}$.

Suppose the process starts at $Z_0 = (1,1)$.

The joint chain oscillates deterministically between $(1,1)$ and $(2,2)$. The states $(1,2)$ and $(2,1)$ can never be reached from $(1,1)$. Because there exist states that do not communicate, the joint chain $\{Z_n\}$ is not irreducible.
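A short reachability computation (a numerical aside, assuming NumPy) confirms that only $(1,1)$ and $(2,2)$ are accessible from $(1,1)$ in this counterexample:

```python
import numpy as np

# Deterministic 2-state oscillator; the joint chain is its Kronecker square.
P = np.array([[0., 1.],
              [1., 0.]])
R = np.kron(P, P)   # states ordered (1,1), (1,2), (2,1), (2,2)

# Accumulate the support of R^n, n = 1..4, starting from state (1,1).
reach = np.array([True, False, False, False])
Rn = np.eye(4)
for _ in range(4):
    Rn = Rn @ R
    reach |= Rn[0] > 0

# reach marks exactly the states (1,1) and (2,2) as accessible.
```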

Final Answer / Q.E.D:
(a) $\{Z_n\}$ satisfies the Markov property due to independence. The transition probability is $P_{(i,k),(j,l)} = p_{ij} q_{kl}$.
(b) No, $\{Z_n\}$ is not always irreducible. If the component chains are periodic, they can become phase-locked, making mixed states unreachable (as shown in the deterministic 2-state oscillator counterexample).

📌 Q2 Consistency of the Empirical CDF at the Sample Mean (15 Marks)

Problem Statement: Let $X_1, \dots, X_n \sim F$ i.i.d. with $\mathbb{E}(X_1) = \mu$. $F$ is continuous at $\mu$. The empirical CDF is $F_n(t) = \frac{1}{n}\sum_{i=1}^n \mathbf{1}(X_i \leq t)$. For $\bar{X}_n = \frac{1}{n}\sum X_i$, show that $F_n(\bar{X}_n)$ is a consistent estimator of $F(\mu)$.

🧠 Approach & Key Concepts

This proof requires bridging two distinct forms of asymptotic convergence.

✍️ Step-by-Step Proof / Derivation

Step 1: Setting up the Triangle Inequality

To prove that $F_n(\bar{X}_n)$ is a consistent estimator of $F(\mu)$, we must show that $F_n(\bar{X}_n) \xrightarrow{p} F(\mu)$, which means the absolute difference converges to zero in probability. We bound the error using the triangle inequality by introducing the intermediate term $F(\bar{X}_n)$:

$$ |F_n(\bar{X}_n) - F(\mu)| \leq |F_n(\bar{X}_n) - F(\bar{X}_n)| + |F(\bar{X}_n) - F(\mu)| $$

We will show that both terms on the right-hand side converge to 0 in probability.

Step 2: Bounding the First Term via Glivenko-Cantelli

The term $|F_n(\bar{X}_n) - F(\bar{X}_n)|$ is the error between the empirical CDF and the true CDF evaluated at the random point $\bar{X}_n$. This error is bounded above by the supremum of the error over the entire real line:

$$ |F_n(\bar{X}_n) - F(\bar{X}_n)| \leq \sup_{t \in \mathbb{R}} |F_n(t) - F(t)| $$

By the Glivenko-Cantelli Theorem (the Fundamental Theorem of Statistics), the empirical CDF converges uniformly to the true CDF almost surely (and therefore in probability):

$$ \sup_{t \in \mathbb{R}} |F_n(t) - F(t)| \xrightarrow{p} 0 $$

Consequently, $|F_n(\bar{X}_n) - F(\bar{X}_n)| \xrightarrow{p} 0$.

Step 3: Bounding the Second Term via WLLN and Continuity

By the Weak Law of Large Numbers (WLLN), since $\mathbb{E}[X_1] = \mu$ exists and is finite, the sample mean converges in probability to the population mean:

$$ \bar{X}_n \xrightarrow{p} \mu $$

We are explicitly given that the function $F$ is continuous at the point $\mu$. By the Continuous Mapping Theorem, continuous functions preserve convergence in probability. Therefore:

$$ F(\bar{X}_n) \xrightarrow{p} F(\mu) $$

This implies that $|F(\bar{X}_n) - F(\mu)| \xrightarrow{p} 0$.

Step 4: Conclusion

Since both components of our triangle inequality bound converge to $0$ in probability, their sum also converges to $0$ in probability:

$$ |F_n(\bar{X}_n) - F(\mu)| \xrightarrow{p} 0 $$
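A quick simulation (illustrative only; Exp(1) is a convenient choice, with $\mu = 1$ and $F(\mu) = 1 - e^{-1}$) shows $F_n(\bar{X}_n)$ settling near $F(\mu)$:

```python
import numpy as np

rng = np.random.default_rng(0)

def fn_at_xbar(n):
    """Empirical CDF of an Exp(1) sample, evaluated at its own sample mean."""
    x = rng.exponential(1.0, size=n)
    return np.mean(x <= x.mean())

# With mu = 1 for Exp(1), the target is F(mu) = 1 - e^{-1} ≈ 0.6321.
target = 1 - np.exp(-1)
estimates = [fn_at_xbar(20_000) for _ in range(50)]
max_err = max(abs(e - target) for e in estimates)
```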
Final Answer / Q.E.D: By decoupling the statistical estimation error (Glivenko-Cantelli) and the plug-in point error (WLLN + Continuous Mapping Theorem), we prove that $F_n(\bar{X}_n) \xrightarrow{p} F(\mu)$, satisfying the definition of a consistent estimator.

📌 Q3 Variances and Correlations of Dirichlet-like Ratios (15 Marks)

Problem Statement: Let $X_1, X_2, X_3$ be i.i.d positive random variables. Define $U_i = X_i / (X_1+X_2+X_3)$.
(a) Give a choice of $(a_1, a_2, a_3)$ so $\text{Var}(a_1 U_1 + a_2 U_2 + a_3 U_3) = 0$.
(b) Find the correlation matrix of $(U_1, U_2, U_3)$.
(c) Find $a_1, a_2, a_3$ satisfying $\sum a_i^2 = 1$ that maximizes $\text{Var}(\sum a_i U_i)$.

🧠 Approach & Key Concepts

This problem explores variables normalized to sum to $1$ (a simplex constraint).

✍️ Step-by-Step Proof / Derivation

Step 1: Finding the Zero-Variance Vector for Part (a)

By definition, $U_1 + U_2 + U_3 = \frac{X_1 + X_2 + X_3}{X_1 + X_2 + X_3} = 1$.
Because the sum is a deterministic constant, its variance is exactly $0$. Therefore, if we choose weights that simply sum the variables, the variance vanishes:

$$ a_1 = 1, \quad a_2 = 1, \quad a_3 = 1 $$

(Any constant vector $c(1, 1, 1)$ works).

Step 2: Deriving the Correlation Matrix for Part (b)

Because $X_1, X_2, X_3$ are i.i.d., the variables $U_1, U_2, U_3$ are exchangeable. This means they share the same variance $v = \text{Var}(U_i)$ and the same covariance $c = \text{Cov}(U_i, U_j)$ for $i \neq j$.

We use the sum constraint to solve for $c$ in terms of $v$. Since $\text{Var}(U_1 + U_2 + U_3) = \text{Var}(1) = 0$:

$$ \text{Var}(U_1 + U_2 + U_3) = 3v + 6c = 0 \implies 6c = -3v \implies c = -\frac{1}{2}v $$

The correlation coefficient $\rho$ between any pair is the covariance divided by the variance:

$$ \rho = \frac{\text{Cov}(U_i, U_j)}{\sqrt{\text{Var}(U_i)\text{Var}(U_j)}} = \frac{-\frac{1}{2}v}{v} = -\frac{1}{2} $$

Thus, the correlation matrix $\mathbf{R}$ is:

$$ \mathbf{R} = \begin{pmatrix} 1 & -1/2 & -1/2 \\ -1/2 & 1 & -1/2 \\ -1/2 & -1/2 & 1 \end{pmatrix} $$
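The $-1/2$ correlations can be checked by simulation (an aside, assuming NumPy; Exp(1) is one convenient positive i.i.d. law, and any other choice yields the same correlation matrix):

```python
import numpy as np

rng = np.random.default_rng(1)

# U_i = X_i / (X_1 + X_2 + X_3) for i.i.d. positive X_i (Exp(1) here).
X = rng.exponential(1.0, size=(200_000, 3))
U = X / X.sum(axis=1, keepdims=True)

R_hat = np.corrcoef(U, rowvar=False)          # empirical correlation matrix
off_diag = R_hat[np.triu_indices(3, k=1)]     # the three pairwise correlations
```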

Step 3: Maximizing the Variance via Eigenvalues for Part (c)

We want to maximize $\text{Var}(\mathbf{a}^\top \mathbf{U}) = \mathbf{a}^\top \mathbf{\Sigma} \mathbf{a}$ subject to $\mathbf{a}^\top \mathbf{a} = 1$. The covariance matrix is $\mathbf{\Sigma} = v \mathbf{R}$. By the Rayleigh Quotient theorem, the maximum variance is exactly the largest eigenvalue of $\mathbf{\Sigma}$, and the optimal vector $\mathbf{a}$ is the corresponding eigenvector.

Let's find the eigenvalues of $\mathbf{R}$. We can write $\mathbf{R}$ as a linear combination of the Identity matrix $\mathbf{I}$ and the all-ones matrix $\mathbf{J}$:

$$ \mathbf{R} = \frac{3}{2}\mathbf{I} - \frac{1}{2}\mathbf{J} $$

The eigenvalues of $\mathbf{J}$ are $3$ (multiplicity 1, eigenvector $\mathbf{1}$) and $0$ (multiplicity 2, eigenvectors orthogonal to $\mathbf{1}$). Mapping these through $\mathbf{R} = \frac{3}{2}\mathbf{I} - \frac{1}{2}\mathbf{J}$, the eigenvalues of $\mathbf{R}$ are $\frac{3}{2} - \frac{1}{2}(3) = 0$ (eigenvector $\mathbf{1}$) and $\frac{3}{2} - 0 = \frac{3}{2}$ (multiplicity 2).

The maximum eigenvalue of $\mathbf{\Sigma} = v\mathbf{R}$ is therefore $\frac{3}{2}v$, with multiplicity 2, attained by any eigenvector $\mathbf{a}$ orthogonal to the ones-vector $(1,1,1)^\top$. This requires $a_1 + a_2 + a_3 = 0$.

Combining this with the constraint $a_1^2 + a_2^2 + a_3^2 = 1$, any vector satisfying both conditions is an optimal solution. An easy choice is to set $a_3 = 0$:

$$ a_1 + a_2 = 0 \implies a_1 = -a_2 \quad \text{and} \quad 2a_1^2 = 1 \implies a_1 = \frac{1}{\sqrt{2}} $$
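The eigenstructure claimed above can be verified numerically (an illustrative check with NumPy):

```python
import numpy as np

R = np.array([[ 1.0, -0.5, -0.5],
              [-0.5,  1.0, -0.5],
              [-0.5, -0.5,  1.0]])

eigvals, eigvecs = np.linalg.eigh(R)   # eigenvalues in ascending order

# Expect eigenvalues {0, 3/2, 3/2}; the 3/2-eigenvectors are orthogonal to 1.
top = eigvecs[:, -1]
sum_of_top = top.sum()                 # ≈ 0, i.e. a_1 + a_2 + a_3 = 0

# The proposed maximizer attains the top Rayleigh quotient value 3/2.
a = np.array([1.0, -1.0, 0.0]) / np.sqrt(2)
var_at_a = a @ R @ a
```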
Final Answer / Q.E.D:
(a) The choice is $a_1 = 1, a_2 = 1, a_3 = 1$.
(b) The correlation matrix has 1 on the diagonal and $-1/2$ on all off-diagonals.
(c) To maximize the variance, choose any normalized vector orthogonal to $(1,1,1)$. A valid choice is $a_1 = \frac{1}{\sqrt{2}}, a_2 = -\frac{1}{\sqrt{2}}, a_3 = 0$.

📌 Q4 Quadratic Forms of Standard Normal Vectors (15 Marks)

Problem Statement: Let $\mathbf{X} = (X_1,\dots,X_4)^\top \sim N_4(\mathbf{0}, \mathbf{I}_4)$. Define:
$Q_1 = \frac{1}{3} (3X_1^2 + X_2^2 + X_3^2 + X_4^2 + 2X_2X_3 + 2X_2X_4 + 2X_3X_4)$
$Q_2 = \frac{1}{3} (2X_2^2 + 2X_3^2 + 2X_4^2 - 2X_2X_3 - 2X_2X_4 - 2X_3X_4)$
(a) Find the distributions of $Q_1$ and $Q_2$.
(b) Show that $Q_1$ and $Q_2$ are independent.

🧠 Approach & Key Concepts

This problem evaluates distributions of quadratic forms using matrix algebra.

✍️ Step-by-Step Proof / Derivation

Step 1: Matrix Representation of $Q_1$ and $Q_2$

We can rewrite the polynomials by separating the variables into $X_1$ and a sub-vector $\mathbf{X}^* = (X_2, X_3, X_4)^\top$. Notice the cross-terms in the $X_2, X_3, X_4$ variables.

For $Q_1$:

$$ Q_1 = X_1^2 + \frac{1}{3}(X_2 + X_3 + X_4)^2 = X_1^2 + (\mathbf{X}^*)^\top \left(\frac{1}{3} \mathbf{J}_3\right) \mathbf{X}^* $$

where $\mathbf{J}_3$ is the $3 \times 3$ matrix of all ones. The total matrix representation $Q_1 = \mathbf{X}^\top \mathbf{A}_1 \mathbf{X}$ uses a block diagonal matrix:

$$ \mathbf{A}_1 = \begin{pmatrix} 1 & \mathbf{0}^\top \\ \mathbf{0} & \frac{1}{3}\mathbf{J}_3 \end{pmatrix} $$

For $Q_2$, observe that it perfectly complements the second part of $Q_1$. Specifically, $X_2^2 + X_3^2 + X_4^2 = (\mathbf{X}^*)^\top \mathbf{I}_3 \mathbf{X}^*$. If we subtract $\frac{1}{3}(X_2+X_3+X_4)^2$ from this sum of squares, we get exactly $Q_2$. Thus:

$$ Q_2 = (\mathbf{X}^*)^\top \left(\mathbf{I}_3 - \frac{1}{3} \mathbf{J}_3\right) \mathbf{X}^* $$

The total matrix representation $Q_2 = \mathbf{X}^\top \mathbf{A}_2 \mathbf{X}$ is:

$$ \mathbf{A}_2 = \begin{pmatrix} 0 & \mathbf{0}^\top \\ \mathbf{0} & \mathbf{I}_3 - \frac{1}{3}\mathbf{J}_3 \end{pmatrix} $$

Step 2: Determining Distributions for Part (a)

For $\mathbf{X} \sim N_k(\mathbf{0}, \mathbf{I}_k)$, the quadratic form $\mathbf{X}^\top \mathbf{A} \mathbf{X}$ follows a $\chi^2_r$ distribution if and only if the symmetric matrix $\mathbf{A}$ is idempotent ($\mathbf{A}^2 = \mathbf{A}$) with rank $r$. Using $\mathbf{J}_3^2 = 3\mathbf{J}_3$, we check the non-trivial blocks:

$$ \left(\tfrac{1}{3}\mathbf{J}_3\right)^2 = \tfrac{1}{9}(3\mathbf{J}_3) = \tfrac{1}{3}\mathbf{J}_3, \qquad \left(\mathbf{I}_3 - \tfrac{1}{3}\mathbf{J}_3\right)^2 = \mathbf{I}_3 - \tfrac{2}{3}\mathbf{J}_3 + \tfrac{1}{9}(3\mathbf{J}_3) = \mathbf{I}_3 - \tfrac{1}{3}\mathbf{J}_3 $$

Hence $\mathbf{A}_1$ and $\mathbf{A}_2$ are both idempotent. For an idempotent matrix, rank equals trace:

$$ \operatorname{rank}(\mathbf{A}_1) = \operatorname{tr}(\mathbf{A}_1) = 1 + 3 \cdot \tfrac{1}{3} = 2, \qquad \operatorname{rank}(\mathbf{A}_2) = \operatorname{tr}(\mathbf{A}_2) = 3 - 3 \cdot \tfrac{1}{3} = 2 $$

Therefore $Q_1 \sim \chi^2_2$ and $Q_2 \sim \chi^2_2$.


Step 3: Proving Independence via Craig's Theorem for Part (b)

By Craig's Theorem, $Q_1$ and $Q_2$ are independent if $\mathbf{A}_1 \mathbf{A}_2 = \mathbf{0}$. We multiply the block matrices:

$$ \mathbf{A}_1 \mathbf{A}_2 = \begin{pmatrix} 1 & \mathbf{0}^\top \\ \mathbf{0} & \frac{1}{3}\mathbf{J}_3 \end{pmatrix} \begin{pmatrix} 0 & \mathbf{0}^\top \\ \mathbf{0} & \mathbf{I}_3 - \frac{1}{3}\mathbf{J}_3 \end{pmatrix} $$
$$ \mathbf{A}_1 \mathbf{A}_2 = \begin{pmatrix} (1)(0) & \mathbf{0}^\top \\ \mathbf{0} & \left(\frac{1}{3}\mathbf{J}_3\right)\left(\mathbf{I}_3 - \frac{1}{3}\mathbf{J}_3\right) \end{pmatrix} $$

Evaluate the lower right block:

$$ \frac{1}{3}\mathbf{J}_3 - \left(\frac{1}{3}\mathbf{J}_3\right)\left(\frac{1}{3}\mathbf{J}_3\right) = \frac{1}{3}\mathbf{J}_3 - \frac{1}{3}\mathbf{J}_3 = \mathbf{0} $$

Since the product matrix is identically $\mathbf{0}$, the quadratic forms are independent.
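The idempotence, rank, and orthogonality claims can all be confirmed with a few lines of NumPy (a numerical aside, not a substitute for the algebra):

```python
import numpy as np

J3 = np.ones((3, 3))

A1 = np.zeros((4, 4))
A1[0, 0] = 1.0
A1[1:, 1:] = J3 / 3

A2 = np.zeros((4, 4))
A2[1:, 1:] = np.eye(3) - J3 / 3

idem1 = np.allclose(A1 @ A1, A1)        # A1 idempotent
idem2 = np.allclose(A2 @ A2, A2)        # A2 idempotent
rank1 = np.trace(A1)                    # rank = trace for idempotent matrices
rank2 = np.trace(A2)
product_zero = np.allclose(A1 @ A2, 0)  # Craig's condition
```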

Final Answer / Q.E.D:
(a) Both $Q_1$ and $Q_2$ follow a Chi-Square distribution with $2$ degrees of freedom ($\chi^2_2$).
(b) Because the product of their symmetric idempotent matrices is the null matrix ($\mathbf{A}_1 \mathbf{A}_2 = \mathbf{0}$), Craig's Theorem confirms that $Q_1$ and $Q_2$ are independent.

📌 Q5 Asymptotic Efficiency via Delta Method (15 Marks)

Problem Statement: Estimate $\theta = p^2$ using two strategies requiring $2n$ coin flips:
(S1) Flip $2n$ times. $U_n = (S_{2n} / 2n)^2$.
(S2) Flip pairs $n$ times. $Y_i = 1$ if HH. $V_n = \frac{1}{n}\sum Y_i$.
(a) Show both are consistent.
(b) Find asymptotic distributions of $\sqrt{n}(U_n - \theta)$ and $\sqrt{n}(V_n - \theta)$.
(c) Which is preferred? Justify.

🧠 Approach & Key Concepts

This evaluates asymptotic estimators using the Central Limit Theorem (CLT) and the Delta Method.

✍️ Step-by-Step Proof / Derivation

Step 1: Proving Consistency for Part (a)

For $S1$: Let $\hat{p} = S_{2n}/2n$. By the WLLN, $\hat{p} \xrightarrow{p} p$. The function $g(x) = x^2$ is continuous. By the Continuous Mapping Theorem, $U_n = \hat{p}^2 \xrightarrow{p} p^2 = \theta$.

For $S2$: Each $Y_i \sim \text{Bernoulli}(p^2)$ since the probability of HH is $p \times p = p^2$. By the WLLN, the sample average $V_n = \bar{Y}_n \xrightarrow{p} \mathbb{E}[Y_1] = p^2 = \theta$. Both are consistent.

Step 2: Deriving Asymptotic Distributions for Part (b)

For $V_n$: $V_n$ is the mean of $n$ i.i.d Bernoulli random variables with success parameter $p^2$. The variance of each observation is $p^2(1-p^2)$. By the standard Central Limit Theorem:

$$ \sqrt{n}(V_n - p^2) \xrightarrow{d} N\big(0, p^2(1-p^2)\big) $$

For $U_n$: First, we define the CLT for the raw sample proportion over $2n$ trials:

$$ \sqrt{2n}(\hat{p} - p) \xrightarrow{d} N\big(0, p(1-p)\big) $$

To match the scaling requested by the problem ($\sqrt{n}$ rather than $\sqrt{2n}$), write $\sqrt{n} = \sqrt{2n}/\sqrt{2}$, which divides the limiting variance by $2$:

$$ \sqrt{n}(\hat{p} - p) \xrightarrow{d} N\left(0, \frac{p(1-p)}{2}\right) $$

Now, apply the Delta Method for the transformation $g(x) = x^2$, where $g'(x) = 2x$. The asymptotic variance is scaled by $[g'(p)]^2 = 4p^2$:

$$ \sqrt{n}(U_n - p^2) \xrightarrow{d} N\left(0, (4p^2) \frac{p(1-p)}{2}\right) \equiv N\big(0, 2p^3(1-p)\big) $$

Step 3: Comparing Efficiencies for Part (c)

To determine the preferred estimator, we evaluate their asymptotic variances. We want to check if $\text{Var}_{asy}(U_n) < \text{Var}_{asy}(V_n)$:

$$ 2p^3(1-p) < p^2(1-p^2) $$

Factor the right side using the difference of squares: $1-p^2 = (1-p)(1+p)$. Since $p \in (0,1)$, we can divide both sides by the positive quantity $p^2(1-p)$:

$$ 2p < 1 + p \implies p < 1 $$

Because $p$ is a true probability strictly less than 1, this inequality is always true. Thus, $U_n$ has a strictly smaller asymptotic variance.
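A Monte Carlo comparison (illustrative, with $p = 0.6$ chosen arbitrarily) reproduces both asymptotic variances and the ordering between them:

```python
import numpy as np

rng = np.random.default_rng(2)
p, n, reps = 0.6, 1_000, 2_000           # p chosen arbitrarily for illustration

flips = rng.random((reps, 2 * n)) < p    # 2n Bernoulli(p) flips per replication

U = flips.mean(axis=1) ** 2              # S1: squared overall proportion
V = flips.reshape(reps, n, 2).all(axis=2).mean(axis=1)  # S2: fraction of HH pairs

var_U = n * U.var()   # target 2 p^3 (1-p) = 0.1728
var_V = n * V.var()   # target p^2 (1-p^2) = 0.2304
```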

Final Answer / Q.E.D:
(a) Both estimators converge in probability to $p^2$, rendering them consistent.
(b) $\sqrt{n}(U_n - \theta) \xrightarrow{d} N(0, 2p^3(1-p))$ and $\sqrt{n}(V_n - \theta) \xrightarrow{d} N(0, p^2(1-p^2))$.
(c) $U_n$ (Strategy S1) is unequivocally preferred. Both strategies utilize the same total sample size ($2n$ flips), but $U_n$ extracts information from the marginal counts rather than just joint pairs, resulting in a strictly smaller asymptotic variance.

📌 Q6 Exact Conditional Tests for Incomplete Bivariate Normal Data (15 Marks)

Problem Statement: $M$ observations are generated from $N_2(\mu, \mu, \sigma_1^2, \sigma_2^2, \rho)$. If we only know the numbers of observations in the first and third quadrant ($M_1$ and $M_3$), construct an exact test for $H_0: \mu = 0$ against $H_1: \mu > 0$. Justify your answer.

🧠 Approach & Key Concepts

This is a missing-data hypothesis testing problem that forces a nonparametric approach. Because we do not have the actual numerical coordinates or the counts for the 2nd and 4th quadrants, a continuous likelihood ratio test is unavailable. We instead use a Conditional Exact Binomial Test (similar in spirit to the sign test and McNemar's logic). By conditioning on the sum of the known counts ($M_1 + M_3 = k$), we obtain a test statistic whose distribution under the null hypothesis is completely free of the nuisance parameters ($\sigma_1, \sigma_2, \rho$).

✍️ Step-by-Step Proof / Derivation

Step 1: Identifying Probabilities under $H_0$ and $H_1$

Let $p_1 = P(X > 0, Y > 0)$ be the probability of landing in the 1st quadrant, and $p_3 = P(X < 0, Y < 0)$ be the probability of landing in the 3rd quadrant.

Under the null hypothesis $H_0 : \mu = 0$, the bivariate normal distribution is centered exactly at the origin $(0,0)$. The distribution of $(X,Y)$ is symmetric about the origin, meaning $(-X, -Y)$ has the exact same distribution as $(X,Y)$. Consequently, the probability mass in the 1st quadrant perfectly equals the 3rd quadrant:

$$ p_1 = p_3 \quad (\text{under } H_0) $$

Under the alternative hypothesis $H_1 : \mu > 0$, the entire distribution shifts to the upper-right. This strictly increases the mass in the 1st quadrant and decreases the mass in the 3rd quadrant:

$$ p_1 > p_3 \quad (\text{under } H_1) $$

Step 2: Constructing the Conditional Statistic

We only observe $M_1$ and $M_3$. Because the total number of points falling into these two quadrants, $k = M_1 + M_3$, is a random variable dependent on the unknown nuisance parameters (like the covariance $\rho$), we condition on it to build an exact test.

Given that a point falls into either the 1st or 3rd quadrant, the conditional probability that it falls into the 1st quadrant is:

$$ \pi = P(\text{1st Quadrant} \mid \text{1st or 3rd Quadrant}) = \frac{p_1}{p_1 + p_3} $$

Step 3: Defining the Exact Test

Conditional on $M_1 + M_3 = k$, the count $M_1$ follows a $\text{Binomial}(k, \pi)$ distribution. The hypotheses therefore translate to $H_0^* : \pi = 1/2$ against $H_1^* : \pi > 1/2$, so under the null:

$$ M_1 \mid (M_1 + M_3 = k) \sim \text{Binomial}\left(k, \frac{1}{2}\right) $$

Because the alternative hypothesis sets $\pi > 0.5$, large values of $M_1$ provide evidence against the null hypothesis. The exact critical region of level $\alpha$ is to reject $H_0$ if $M_1 \geq c$, where the critical value $c$ is the smallest integer satisfying:

$$ \sum_{j=c}^{k} \binom{k}{j} \left(\frac{1}{2}\right)^k \leq \alpha $$
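The critical value $c$ is easy to compute from the Binomial tail; the sketch below uses only the Python standard library (`k = 20` and $\alpha = 0.05$ are illustrative choices):

```python
from math import comb

def critical_value(k, alpha):
    """Smallest c with P(Bin(k, 1/2) >= c) <= alpha (exact level-alpha cutoff)."""
    for c in range(k + 1):
        tail = sum(comb(k, j) for j in range(c, k + 1)) / 2 ** k
        if tail <= alpha:
            return c
    return k + 1   # no rejection region of level alpha exists

# Example: with k = M1 + M3 = 20 observed in quadrants 1 and 3,
# reject H0 at level 0.05 when M1 >= 15 (tail probability ≈ 0.0207).
c_20 = critical_value(20, 0.05)
```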
Final Answer / Q.E.D: Because the exact numerical coordinates and other quadrant counts are unknown, we condition on the sum $M_1 + M_3 = k$. Under $H_0$, symmetry dictates $M_1 \sim \text{Bin}(k, 1/2)$. The exact test rejects $H_0$ in favor of $H_1$ if $M_1 \geq c$, calculated directly from the standard Binomial CDF.

📌 Q7 Probability Bounds on Order Statistics (15 Marks)

Problem Statement: Let $X_1, X_2, X_3$ be i.i.d continuous random variables with a strictly increasing CDF $F$. Define $\psi(x) = P(\min X_i \leq x \leq \max X_i)$. Show that $\psi(x)$ is maximum when $x$ is the median of the distribution.

🧠 Approach & Key Concepts

This problem analyzes the spread of a random sample across a threshold. Because $F$ is continuous and strictly increasing, we can use the Probability Integral Transform $U_i = F(X_i)$ to map the problem into standard Uniform $(0,1)$ space. We then compute the probability via the complement rule (the event fails only if all variables fall strictly below $x$, or all fall strictly above $x$), which yields a simple function to optimize.

✍️ Step-by-Step Proof / Derivation

Step 1: Applying the Complement Rule

The event $\{\min X_i \leq x \leq \max X_i\}$ implies that the threshold $x$ is bracketed by the sample points. This event fails to happen if and only if one of two mutually exclusive events occurs:

  1. All three variables are strictly less than $x$: $\max X_i < x$.
  2. All three variables are strictly greater than $x$: $\min X_i > x$.

Therefore, we can write $\psi(x)$ using the complement rule:

$$ \psi(x) = 1 - \big[ P(\max X_i < x) + P(\min X_i > x) \big] $$

Step 2: Evaluating the Component Probabilities

Because the $X_i$ are independent and identically distributed:

$$ P(\max X_i < x) = P(X_1 < x, X_2 < x, X_3 < x) = P(X_1 < x)^3 = [F(x)]^3 $$
$$ P(\min X_i > x) = P(X_1 > x, X_2 > x, X_3 > x) = P(X_1 > x)^3 = [1 - F(x)]^3 $$

Substitute these back into the function. To simplify, let $u = F(x)$. Since $F$ is a CDF, $u \in (0, 1)$.

$$ h(u) = 1 - \big( u^3 + (1-u)^3 \big) $$

Step 3: Algebraic Optimization

We expand the binomial term $(1-u)^3$ to simplify the function $h(u)$:

$$ h(u) = 1 - \big[ u^3 + (1 - 3u + 3u^2 - u^3) \big] $$
$$ h(u) = 1 - (1 - 3u + 3u^2) = 3u - 3u^2 = 3u(1-u) $$

The function $h(u) = -3u^2 + 3u$ is a downward-opening parabola. To find its maximum, we take the derivative and set it to zero:

$$ h'(u) = -6u + 3 = 0 \implies u = \frac{1}{2} $$

Because the second derivative is negative ($-6$), this is indeed a global maximum on the interval $(0,1)$.

Step 4: Mapping Back to the Original Space

The maximum occurs when $u = 1/2$. Recalling our substitution $u = F(x)$:

$$ F(x) = \frac{1}{2} $$

By the definition of quantiles, the point $x$ at which the cumulative distribution function equals $0.5$ is the median of the distribution. Because $F$ is strictly increasing, this point is unique.
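A Monte Carlo check (illustrative; Exp(1) has median $\ln 2 \approx 0.693$) shows $\psi$ peaking at the median with value $3 \cdot \tfrac{1}{2} \cdot \tfrac{1}{2} = \tfrac{3}{4}$:

```python
import numpy as np

rng = np.random.default_rng(3)

# 500,000 samples of size 3 from Exp(1); the median of Exp(1) is ln 2.
samples = rng.exponential(1.0, size=(500_000, 3))
lo, hi = samples.min(axis=1), samples.max(axis=1)

def psi(x):
    """Monte Carlo estimate of P(min <= x <= max)."""
    return np.mean((lo <= x) & (x <= hi))

grid = np.linspace(0.1, 2.0, 39)
best_x = grid[np.argmax([psi(x) for x in grid])]   # should sit near ln 2
psi_at_median = psi(np.log(2))                     # theoretical maximum is 3/4
```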

Final Answer / Q.E.D: By expressing the probability via its complement, we derived the quadratic function $3u(1-u)$ which is universally maximized at $u = 0.5$. This proves $\psi(x)$ is maximum exactly when $x$ is the median of the distribution.

📌 Q8 Design of Experiments and Variance Optimization (15 Marks)

Problem Statement: An experiment compares 5 drugs using $n$ patients. $y_{ij} = \mu + \tau_i + \epsilon_{ij}$ with homoscedastic errors. Let $n_i$ be the allocation to drug $i$. Minimize the average variance of the BLUEs of treatment contrasts $\tau_i - \tau_j$.
(a) If $n=35$, find the optimal allocations.
(b) If $n=36$, find the optimal allocations.
(c) If $n=36$ and drug 1 is a control, minimize the average variance of $\tau_1 - \tau_j$ (for $j=2,3,4,5$) only. Find allocations.

🧠 Approach & Key Concepts

This problem navigates constrained discrete optimization via Cauchy-Schwarz / AM-HM Inequalities and Lagrange Multipliers.

✍️ Step-by-Step Proof / Derivation

Step 1: Objective Function for (a) and (b)

We want to minimize the average variance of all $\binom{5}{2} = 10$ contrasts. The sum of the variances is:

$$ \sum_{i < j} \text{Var}(\hat{\tau}_i - \hat{\tau}_j) = \sum_{i < j} \sigma^2 \left( \frac{1}{n_i} + \frac{1}{n_j} \right) = 4\sigma^2 \sum_{i=1}^5 \frac{1}{n_i} $$

since each term $1/n_i$ appears in exactly $4$ of the $10$ pairs (drug $i$ is compared against the $4$ others). Thus, the objective function to minimize is directly proportional to:

$$ f(\mathbf{n}) = \sum_{i=1}^5 \frac{1}{n_i} \quad \text{subject to } \sum_{i=1}^5 n_i = n $$

Step 2: Solving Part (a) with $n=35$

By the AM-HM inequality (or Cauchy-Schwarz), the sum of reciprocals is globally minimized when all variables are exactly equal. Since $35$ is perfectly divisible by $5$, we can achieve perfect equality:

$$ n_1 = n_2 = n_3 = n_4 = n_5 = \frac{35}{5} = 7 $$

Step 3: Solving Part (b) with $n=36$

We cannot divide $36$ into 5 equal integers. We must make the allocations as equal as possible to minimize the convexity penalty of $1/x$. Since $36 = (4 \times 7) + 8$, the optimal near-uniform integer allocation is to assign 8 patients to one drug and 7 to the rest.

$$ \{n_1, n_2, n_3, n_4, n_5\} = \{8, 7, 7, 7, 7\} \quad \text{(in any permutation)} $$
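The near-uniform allocation for $n = 36$ can be confirmed by exhaustive search over integer allocations (a brute-force aside):

```python
from itertools import combinations_with_replacement

# Minimize sum(1/n_i) over nondecreasing integer 5-tuples summing to 36.
n = 36
best_value, best_alloc = min(
    (sum(1 / x for x in alloc), alloc)
    for alloc in combinations_with_replacement(range(1, n), 5)
    if sum(alloc) == n
)
# best_alloc should be (7, 7, 7, 7, 8), matching the convexity argument.
```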

Step 4: Formulating the Control Objective for Part (c)

We now only care about the 4 contrasts involving the control drug ($i=1$). The sum of variances is:

$$ \sum_{j=2}^5 \text{Var}(\hat{\tau}_1 - \hat{\tau}_j) = \sigma^2 \sum_{j=2}^5 \left( \frac{1}{n_1} + \frac{1}{n_j} \right) = \sigma^2 \left( \frac{4}{n_1} + \sum_{j=2}^5 \frac{1}{n_j} \right) $$

To minimize this, by symmetry amongst the test drugs, we must allocate equally to the 4 test drugs. Let $n_j = k$ for $j=2,3,4,5$. Thus, the constraint is $n_1 + 4k = 36$. Our continuous objective function is:

$$ g(n_1, k) = \frac{4}{n_1} + \frac{4}{k} $$

Step 5: Lagrange Multiplier Optimization

We minimize $g(n_1, k)$ subject to $n_1 + 4k = 36$. Taking partial derivatives and equating ratios (or substituting $n_1 = 36 - 4k$):

$$ \frac{\partial}{\partial n_1} \implies -\frac{4}{n_1^2} = \lambda \quad \text{and} \quad \frac{\partial}{\partial k} \implies -\frac{4}{k^2} = 4\lambda $$

Dividing the equations eliminates $\lambda$:

$$ \frac{-4/k^2}{-4/n_1^2} = 4 \implies \frac{n_1^2}{k^2} = 4 \implies n_1 = 2k $$

This reveals the optimal allocation rule: the control group should be exactly twice the size of a test group. Substituting this back into the constraint:

$$ 2k + 4k = 36 \implies 6k = 36 \implies k = 6 $$

Therefore, $n_1 = 2(6) = 12$. Since these are exact integers, no rounding is required.
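A brute-force search (which does not assume symmetry among the test drugs) confirms the $12/6/6/6/6$ split:

```python
from itertools import combinations_with_replacement

# Minimize 4/n_1 + sum_j 1/n_j over integer allocations with n_1 + sum = 36.
best_value, best_alloc = min(
    (4 / n1 + sum(1 / nj for nj in rest), (n1,) + rest)
    for n1 in range(1, 33)
    for rest in combinations_with_replacement(range(1, 34 - n1), 4)
    if n1 + sum(rest) == 36
)
# best_alloc should be (12, 6, 6, 6, 6), with objective value 1.0 (in sigma^2 units).
```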

Final Answer / Q.E.D:
(a) For $n=35$, the optimal allocations are $n_i = 7$ for all $i=1, 2, \dots, 5$.
(b) For $n=36$, the optimal allocations are four groups of $7$ and one group of $8$.
(c) To minimize control-test variance, the control group should be twice the size of test groups. Thus, $n_1 = 12$ (Control) and $n_2 = n_3 = n_4 = n_5 = 6$ (Test Drugs).

📚 Paper Summary & Key Focus Areas

💡 ISI Examiner Insight:
In the STB paper, examiners look for the ability to bridge different theoretical domains seamlessly.
1. In Q4, you must transition from algebraic polynomial expansion to matrix linear algebra in order to apply Craig's theorem and rank arguments for idempotent matrices.
2. In Q6, recognizing that the absence of exact continuous coordinates rules out a standard likelihood ratio test immediately pivots the approach to a non-parametric conditional exact test.
3. In Q8(c), proving $n_1 = 2k$ mathematically via derivatives (or Lagrange) is strictly required to get full marks; guessing the $12$ and $6$ split without derivation will lose points.