🏛️ ISI Advanced Examination Practice

STA (Statistics A) 2023 — Model Solutions

Subject Level

Rigorous mathematical proofs and derivations for the Indian Statistical Institute Entrance Examination.

📌 Q1 Convergence of Transformed Infinite Series (10 Marks)

Problem Statement: Consider $a_n > 0$, for $n = 1, 2, \dots$, such that $\sum_{n=1}^\infty a_n = \infty$. Check whether $\sum_{n=1}^\infty \frac{a_n}{1 + n a_n}$ is convergent if $\liminf_{n \to \infty} n a_n > 0$.

🧠 Approach & Key Concepts

This problem tests the Direct Comparison Test for infinite series. The condition $\liminf n a_n > 0$ provides a lower bound of the form $a_n > \frac{c}{n}$ for all sufficiently large $n$. Combined with the monotonicity of the transformation $f(x) = \frac{x}{1+nx}$, this bounds the target series below by a positive constant multiple of the harmonic series, which diverges.

✍️ Step-by-Step Proof / Derivation

Step 1: Translating the Limit Inferior Condition

We are given that $\liminf_{n \to \infty} n a_n = L > 0$ (if the liminf is $+\infty$, fix any finite $L > 0$ below it; the argument below is unchanged). By the formal definition of the limit inferior, choosing $\epsilon = L/2$, there exists an integer $N$ such that for all $n \geq N$:

$$n a_n > L - \frac{L}{2} = \frac{L}{2}$$

Dividing both sides by $n$, we obtain a strict lower bound for the terms of the sequence:

$$a_n > \frac{L}{2n} \quad \text{for all } n \geq N$$

Step 2: Analyzing the Monotonicity of the Transformation

Consider the function $f(x) = \frac{x}{1+nx}$ for $x > 0$. Let us determine if it is increasing or decreasing by finding its derivative:

$$f'(x) = \frac{(1)(1+nx) - x(n)}{(1+nx)^2} = \frac{1}{(1+nx)^2}$$

Since $f'(x) > 0$ for all $x > 0$, the function $f(x)$ is strictly increasing on $(0, \infty)$. Therefore, if $x_1 > x_2$, then $f(x_1) > f(x_2)$.

Step 3: Bounding the Target Series

Since $a_n > \frac{L}{2n}$ for all $n \geq N$, and the function $f(x) = \frac{x}{1+nx}$ is strictly increasing, we can apply the function to both sides of the inequality:

$$\frac{a_n}{1 + n a_n} > \frac{\frac{L}{2n}}{1 + n\left(\frac{L}{2n}\right)}$$

Simplify the right side of the inequality:

$$\frac{a_n}{1 + n a_n} > \frac{\frac{L}{2n}}{1 + \frac{L}{2}} = \frac{L}{2n(1 + L/2)} = \left( \frac{L}{2 + L} \right) \frac{1}{n}$$

Step 4: Applying the Comparison Test

We have established that for all $n \geq N$:

$$\frac{a_n}{1 + n a_n} > K \cdot \frac{1}{n}$$

where $K = \frac{L}{2+L}$ is a strictly positive constant. The harmonic series $\sum_{n=1}^\infty \frac{1}{n}$ diverges. By the Direct Comparison Test, since the terms of our series are bounded below by a positive constant multiple of the terms of a divergent series, our series must also diverge.

Final Answer / Q.E.D: Because the terms of the series are bounded below by a positive multiple of $\frac{1}{n}$ for all sufficiently large $n$, and the harmonic series diverges, the series $\sum_{n=1}^\infty \frac{a_n}{1 + n a_n}$ diverges.
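As an illustrative numerical sanity check (not a substitute for the proof), consider the hypothetical choice $a_n = 1/n$, for which $n a_n = 1$ and each transformed term equals $\frac{1}{2n}$, so the partial sums grow without bound:

```python
def partial_sum(a, N):
    # Partial sum of a_n / (1 + n * a_n) up to N
    return sum(a(n) / (1 + n * a(n)) for n in range(1, N + 1))

a = lambda n: 1.0 / n  # here n * a_n = 1, so liminf n*a_n = 1 > 0
for N in (10 ** 2, 10 ** 4, 10 ** 6):
    print(N, partial_sum(a, N))  # grows roughly like (1/2) * log(N)
```

The unbounded growth of the partial sums matches the comparison against $K/n$ with $K = L/(2+L) = 1/3$.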

📌 Q2 Conditional Probability with Order Statistics (10 Marks)

Problem Statement: Let $X_1$ and $X_2$ be independent and identically distributed random variables with pdf $f(x) = nx^{n-1}$, for $0 < x < 1$, for fixed $n > 1$. Show that $P(X_1 < X_2 \mid X_1 < nX_2) = \frac{1}{2 - 1/n^n}$.

🧠 Approach & Key Concepts

This problem evaluates joint probabilities over restricted 2D domains. We use the fundamental conditional probability formula: $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$.

✍️ Step-by-Step Proof / Derivation

Step 1: Setting up the Conditional Probability

We need to find $P(X_1 < X_2 \mid X_1 < nX_2)$. By definition:

$$P(X_1 < X_2 \mid X_1 < nX_2) = \frac{P(X_1 < X_2 \text{ and } X_1 < nX_2)}{P(X_1 < nX_2)}$$

Since $n > 1$ and $X_2$ is strictly positive (the support is $(0,1)$), the event $X_1 < X_2$ implies $X_1 < nX_2$. Therefore, the intersection of the two events reduces to the tighter condition:

$$P(X_1 < X_2 \text{ and } X_1 < nX_2) = P(X_1 < X_2)$$

Step 2: Calculating the Numerator $P(X_1 < X_2)$

Because $X_1$ and $X_2$ are independent and identically distributed (i.i.d.) continuous random variables, the probability that one is strictly less than the other is symmetric and identical:

$$P(X_1 < X_2) = \frac{1}{2}$$

Step 3: Calculating the Denominator $P(X_1 < nX_2)$

The joint probability density function is the product of their individual pdfs (since they are independent):

$$f_{X_1, X_2}(x_1, x_2) = (nx_1^{n-1})(nx_2^{n-1}) = n^2 x_1^{n-1} x_2^{n-1}$$

We must integrate this over the region $0 < x_1 < 1$, $0 < x_2 < 1$, subject to the condition $x_1 < nx_2$. This constraint means $x_2$ must be greater than $x_1/n$. Setting up the bounds with $dx_2$ on the inner integral:

$$P(X_1 < nX_2) = \int_0^1 \left( \int_{x_1/n}^1 n^2 x_1^{n-1} x_2^{n-1} dx_2 \right) dx_1$$

Evaluate the inner integral with respect to $x_2$:

$$\int_{x_1/n}^1 n^2 x_1^{n-1} x_2^{n-1} dx_2 = n x_1^{n-1} \left[ x_2^n \right]_{x_1/n}^1 = n x_1^{n-1} \left( 1 - \left(\frac{x_1}{n}\right)^n \right)$$

Now, evaluate the outer integral with respect to $x_1$:

$$\int_0^1 n x_1^{n-1} \left( 1 - \frac{x_1^n}{n^n} \right) dx_1 = \int_0^1 n x_1^{n-1} dx_1 - \int_0^1 \frac{n}{n^n} x_1^{2n-1} dx_1$$

Computing both definite integrals:

$$= \left[ x_1^n \right]_0^1 - \frac{n}{n^n} \left[ \frac{x_1^{2n}}{2n} \right]_0^1 = 1 - \left( \frac{n}{n^n} \cdot \frac{1}{2n} \right) = 1 - \frac{1}{2n^n}$$

Step 4: Final Substitution

Substitute the numerator and denominator back into the conditional probability equation:

$$P(X_1 < X_2 \mid X_1 < nX_2) = \frac{1/2}{1 - \frac{1}{2n^n}}$$

Multiply the top and bottom by 2 to clear the fraction in the numerator:

$$= \frac{1}{2 - \frac{2}{2n^n}} = \frac{1}{2 - \frac{1}{n^n}}$$
Final Answer / Q.E.D: By evaluating the joint density over the constrained region and leveraging i.i.d. symmetry, we have shown that $P(X_1 < X_2 \mid X_1 < nX_2) = \frac{1}{2 - 1/n^n}$.
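The identity can be checked by Monte Carlo simulation: since the cdf is $F(x) = x^n$, a draw can be generated by inversion as $U^{1/n}$. The function name `estimate` and the trial count below are illustrative choices:

```python
import random

def estimate(n, trials=200_000, seed=0):
    # X has cdf F(x) = x^n on (0,1), so X = U**(1/n) by inverse-cdf sampling
    rng = random.Random(seed)
    both = cond = 0
    for _ in range(trials):
        x1 = rng.random() ** (1.0 / n)
        x2 = rng.random() ** (1.0 / n)
        if x1 < n * x2:
            cond += 1
            if x1 < x2:
                both += 1
    return both / cond

n = 2
print(estimate(n), 1 / (2 - 1 / n ** n))  # the two values should be close
```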

📌 Q3 Markov Chain Classification and Transience (10 Marks)

Problem Statement: Define the matrix $P$:
$P = \begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & a & b \\ 0 & 1 & 0 & 0 \\ c & d & 0 & 0 \end{bmatrix}$
(a) For what values of $(a, b, c, d)$ is $P$ the transition probability matrix (TPM) of a 4-state Markov Chain?
(b) For different values, find all disjoint, closed, irreducible, recurrent classes.

🧠 Approach & Key Concepts

This problem analyzes State Communication Classes within discrete-time Markov chains.

✍️ Step-by-Step Proof / Derivation

Step 1: Establishing the Valid TPM conditions for Part (a)

For $P$ to be a valid Transition Probability Matrix, all elements must be $\geq 0$, and the sum of probabilities across each row must equal 1.

Therefore, $P$ is a valid TPM for any $a, b, c, d$ such that $a, c \in [0, 1]$ and $b = 1-a$, $d = 1-c$.


Step 2: Mapping the Topology for Part (b)

Let's map the guaranteed transitions and the conditional transitions:
$1 \to 4$ (w.p. 1)
$3 \to 2$ (w.p. 1)
$2 \to 3$ (w.p. $a$) and $2 \to 4$ (w.p. $1-a$)
$4 \to 1$ (w.p. $c$) and $4 \to 2$ (w.p. $1-c$)

We analyze the communicating classes by splitting into cases according to whether $a = 1$ or $a < 1$, and whether $c = 1$ or $c < 1$.

Case 1: $c = 1 \implies d = 0$ (State 4 only goes to 1)

State 1 goes to 4, and 4 goes to 1. They form a closed loop. Thus, **$\{1, 4\}$ is a closed, irreducible, recurrent class.**
What happens to states 2 and 3? State 3 goes to 2. State 2 goes to 3 (w.p $a$) and to 4 (w.p $1-a$).
- Subcase 1A ($a < 1$): State 2 can escape to state 4. Since $\{1,4\}$ is closed, 2 can never return. $\{2, 3\}$ are transient states. The only recurrent class is $\{1, 4\}$.
- Subcase 1B ($a = 1$): State 2 only goes to 3, and 3 only goes to 2. They form a closed loop. Thus, there are two separate recurrent classes: **$\{1, 4\}$ and $\{2, 3\}$**.

Case 2: $c < 1 \implies d > 0$ (State 4 can go to 2)

Because $d > 0$, the path $1 \to 4 \to 2$ exists. Thus, states $\{1,4\}$ can reach state 2. To determine if they form one big class, we must check if 2 can reach $\{1,4\}$. State 2 goes to 4 directly (w.p $1-a$).
- Subcase 2A ($a < 1$): State 2 can reach 4. Since 4 can reach 1, 2, and 2 can reach 3, all states communicate with each other. Thus, **$\{1, 2, 3, 4\}$ forms a single closed, irreducible, recurrent class.**
- Subcase 2B ($a = 1$): State 2 ONLY goes to 3. State 3 ONLY goes to 2. They form an inescapable trap. Thus, **$\{2, 3\}$ is a closed, irreducible, recurrent class.** Because states $\{1, 4\}$ will eventually hit this trap (via $4 \to 2$) and can never return, $\{1, 4\}$ are transient states.

Final Answer / Q.E.D:
(a) Valid when $a, c \in [0, 1]$, with $b = 1-a$ and $d = 1-c$.
(b) The closed, irreducible, recurrent classes are categorized as follows:
  • If $c=1, a<1$: {1, 4} (States 2,3 are transient).
  • If $c=1, a=1$: {1, 4} and {2, 3}.
  • If $c<1, a<1$: {1, 2, 3, 4} (The whole chain is irreducible).
  • If $c<1, a=1$: {2, 3} (States 1,4 are transient).
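The case analysis can be double-checked mechanically. The sketch below (function names are my own) builds the support graph of $P$, computes communicating classes by mutual reachability, and keeps the closed ones, which for a finite chain are exactly the recurrent classes:

```python
def recurrent_classes(a, c):
    # Support graph of P: edges with positive transition probability, states 1..4
    adj = {1: {4},
           2: {3} if a == 1 else ({4} if a == 0 else {3, 4}),
           3: {2},
           4: {1} if c == 1 else ({2} if c == 0 else {1, 2})}

    def reach(s):
        # All states reachable from s in one or more steps
        seen, stack = set(), [s]
        while stack:
            for v in adj[stack.pop()]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        return seen

    R = {s: reach(s) for s in adj}
    classes = {frozenset({t for t in adj if t in R[s] and s in R[t]} | {s})
               for s in adj}
    # For a finite chain, a communicating class is recurrent iff it is closed
    return sorted(sorted(cls) for cls in classes
                  if all(adj[u] <= cls for u in cls))

print(recurrent_classes(a=0.5, c=1))    # [[1, 4]]
print(recurrent_classes(a=1, c=1))      # [[1, 4], [2, 3]]
print(recurrent_classes(a=0.5, c=0.5))  # [[1, 2, 3, 4]]
print(recurrent_classes(a=1, c=0.5))    # [[2, 3]]
```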

📌 Q4 Random Walks on a Square Graph (10 Marks)

Problem Statement: A particle located at a vertex of a square ABCD moves to one of the neighboring vertices with equal probability. Suppose that the particle starts moving from vertex A. Find the probability that it will visit each of the other three vertices at least once before returning to A.

🧠 Approach & Key Concepts

This problem models a Symmetric Random Walk on a Cycle Graph ($C_4$). We must find the probability of a specific sequential path condition.

✍️ Step-by-Step Proof / Derivation

Step 1: Analyzing the First Step

The vertices are arranged cyclically: A-B-C-D-A. Starting at A, the particle has two equally likely choices for step 1: go to B (w.p. $1/2$) or go to D (w.p. $1/2$).

Let $E$ be the event that the particle visits all other vertices before returning to A. By the Law of Total Probability and symmetry of the square:

$$P(E) = P(\text{Step 1 is B}) \times P(E \mid \text{Step 1 is B}) + P(\text{Step 1 is D}) \times P(E \mid \text{Step 1 is D})$$
$$P(E) = \frac{1}{2} P(E \mid B) + \frac{1}{2} P(E \mid D) = P(E \mid B)$$

Step 2: Necessary Condition from the First Neighbor

Assume the particle is now at B. To satisfy event $E$, the particle MUST visit C and D before returning to A. Because D is adjacent only to C and A, the only way to reach D from B without passing through A is to pass through C.

Therefore, the very first sub-goal from B is to reach C without hitting A. From B, the neighbors are A and C, each chosen with probability $1/2$, and this first step decides the sub-goal: stepping to C succeeds, stepping to A fails. Thus, the probability of reaching C before A starting from B is exactly $1/2$.

Step 3: Calculating Absorption Probabilities from the Diagonal

Given that the particle successfully traversed B $\to$ C, it is now at vertex C. It has visited B and C. Now, it must reach D before returning to A.

Let $h(x)$ denote the probability of reaching D before reaching A, starting from vertex $x$. We define the boundary conditions based on the game's termination points:

$$h(D) = 1 \quad \text{(Success)}$$ $$h(A) = 0 \quad \text{(Failure)}$$

For the intermediate nodes B and C, the probability is the average of their neighbors:

$$h(C) = \frac{1}{2}h(B) + \frac{1}{2}h(D) = \frac{1}{2}h(B) + \frac{1}{2}$$
$$h(B) = \frac{1}{2}h(A) + \frac{1}{2}h(C) = \frac{1}{2}(0) + \frac{1}{2}h(C) = \frac{1}{2}h(C)$$

We substitute the expression for $h(B)$ back into the equation for $h(C)$:

$$h(C) = \frac{1}{2} \left( \frac{1}{2}h(C) \right) + \frac{1}{2}$$
$$h(C) = \frac{1}{4}h(C) + \frac{1}{2} \implies \frac{3}{4}h(C) = \frac{1}{2} \implies h(C) = \frac{2}{3}$$

Step 4: Combining the Probabilities

We can now calculate the total conditional probability $P(E \mid B)$. Because the random walk satisfies the strong Markov property, reaching C "resets" the process memory. The total probability is the product of the sequential necessary stages:

$$P(E \mid B) = P(\text{reach C before A from B}) \times P(\text{reach D before A from C})$$
$$P(E \mid B) = \left(\frac{1}{2}\right) \times h(C) = \left(\frac{1}{2}\right) \times \left(\frac{2}{3}\right) = \frac{1}{3}$$

Since we established in Step 1 that $P(E) = P(E \mid B)$:

$$P(E) = \frac{1}{3}$$
Final Answer / Q.E.D: The probability that the particle visits every other vertex before returning to A is exactly $\frac{1}{3}$.
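A short simulation of the walk on the cycle (states labeled 0 to 3, with 0 playing the role of A; the helper name is my own) gives an estimate consistent with $1/3$:

```python
import random

def visits_all_before_return(rng):
    # Walk on the cycle 0-1-2-3 (A = 0); step to a uniformly chosen neighbor.
    # Return True if vertices 1, 2, 3 are all seen before the walk re-enters 0.
    pos, seen = 0, set()
    while True:
        pos = (pos + rng.choice((1, -1))) % 4
        if pos == 0:
            return seen == {1, 2, 3}
        seen.add(pos)

rng = random.Random(1)
trials = 100_000
est = sum(visits_all_before_return(rng) for _ in range(trials)) / trials
print(est)  # close to 1/3
```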

📌 Q5 Dimension of the Vector Space of Magic-like Matrices (10 Marks)

Problem Statement: Find the dimension of the vector space of all $4 \times 4$ matrices whose row sums and column sums are all equal. Justify your answer.

🧠 Approach & Key Concepts

This problem asks for the dimension of a specific subspace of $M_{4 \times 4}(\mathbb{R})$. We can determine this by defining a linear transformation that maps a matrix to its row and column sums, and then applying the Rank-Nullity Theorem. The dimension of the overall space is 16. By counting the number of strictly independent linear constraints imposed by the row and column sum conditions, we can find the dimension of the resulting subspace.

✍️ Step-by-Step Proof / Derivation

Step 1: Formulating the Linear Constraints

Let $A = (a_{ij})$ be a $4 \times 4$ matrix. Let $R_i = \sum_{j=1}^4 a_{ij}$ be the sum of the $i$-th row, and $C_j = \sum_{i=1}^4 a_{ij}$ be the sum of the $j$-th column. We require all these 8 sums to be equal to some scalar $c$.

This generates the following equations:

$$R_1 = c, \quad R_2 = c, \quad R_3 = c, \quad R_4 = c$$ $$C_1 = c, \quad C_2 = c, \quad C_3 = c, \quad C_4 = c$$

To eliminate $c$ and find the direct constraints on the matrix elements, we can set all sums equal to $R_4$ (as a reference point). This yields 7 equations:

$$R_1 = R_4, \quad R_2 = R_4, \quad R_3 = R_4$$ $$C_1 = R_4, \quad C_2 = R_4, \quad C_3 = R_4, \quad C_4 = R_4$$

Step 2: Identifying Linear Dependencies

Are these 7 equations linearly independent? A fundamental property of any matrix is that the sum of all its elements can be calculated by either summing the row sums or summing the column sums. Thus:

$$\sum_{i=1}^4 R_i = \sum_{j=1}^4 C_j$$

We can isolate $C_4$ in this identity:

$$C_4 = R_1 + R_2 + R_3 + R_4 - (C_1 + C_2 + C_3)$$

If we assume the first 6 equations hold (i.e., $R_1=R_2=R_3=C_1=C_2=C_3=R_4$), substituting these into our isolated $C_4$ equation gives:

$$C_4 = R_4 + R_4 + R_4 + R_4 - (R_4 + R_4 + R_4) = 4R_4 - 3R_4 = R_4$$

This perfectly demonstrates that the 7th equation ($C_4 = R_4$) is automatically satisfied if the other 6 hold. Thus, it is linearly dependent. The remaining 6 equations are linearly independent because each one introduces a unique row or column sum not present in the others.

Step 3: Calculating the Dimension

The vector space of all $4 \times 4$ matrices has a dimension of $4 \times 4 = 16$.

We have established that the condition of having all row and column sums equal imposes exactly $6$ linearly independent constraints on the 16 variables.

$$\text{Dimension} = \text{Total Variables} - \text{Independent Constraints} = 16 - 6 = 10$$
💡 Generalization Check: The dimension of an $n \times n$ matrix with equal row and column sums is $(n-1)^2 + 1$. For $n=4$, this evaluates to $(4-1)^2 + 1 = 3^2 + 1 = 10$.
Final Answer / Q.E.D: The dimension of the vector space of all $4 \times 4$ matrices whose row sums and column sums are equal is 10.
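The rank count can be verified numerically: treating a matrix as a vector in $\mathbb{R}^{16}$, stack all 7 constraint rows $R_i - R_4$ and $C_j - R_4$ and compute the rank, which should be 6, giving nullity 10 (helper names below are illustrative):

```python
import numpy as np

# Each constraint is a linear functional on the 16 entries of A
# (row-major flattening a_{ij} -> index 4*i + j, zero-based).
def row_sum(i):
    v = np.zeros(16)
    v[4 * i:4 * i + 4] = 1
    return v

def col_sum(j):
    v = np.zeros(16)
    v[j::4] = 1
    return v

# All 7 equations: R_i = R_4 (i = 1, 2, 3) and C_j = R_4 (j = 1, ..., 4)
rows = [row_sum(i) - row_sum(3) for i in range(3)]
rows += [col_sum(j) - row_sum(3) for j in range(4)]
M = np.stack(rows)

rank = np.linalg.matrix_rank(M)
print(rank, 16 - rank)  # rank 6, so the subspace has dimension 10
```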

📌 Q6 Zero-Inflated Poisson Distribution (10 Marks)

Problem Statement: Suppose $X \sim \text{Poisson}(\theta)$, and $Y$ is a random variable which equals $X$ with probability $0.5$, and equals $0$ with probability $0.5$.
(a) Find the coefficient of variation of $Y$.
(b) Show that $\text{Var}(Y) > E(Y)$.
(c) How would you estimate $\theta$ if 64 out of 100 observations for $Y$ are 0?

🧠 Approach & Key Concepts

This describes a Mixture Distribution (specifically, a zero-inflated Poisson model). We can model $Y$ mathematically using an indicator variable: $Y = I \cdot X$, where $I \sim \text{Bernoulli}(0.5)$ and $I$ is independent of $X$. We will use the properties of expectations and variances to evaluate $Y$, comparing it directly against the baseline Poisson properties.

✍️ Step-by-Step Proof / Derivation

Step 1: Finding Expectation and Variance for Part (a)

For $X \sim \text{Poisson}(\theta)$, we know $\mathbb{E}(X) = \theta$ and $\text{Var}(X) = \theta$. Therefore, $\mathbb{E}(X^2) = \text{Var}(X) + [\mathbb{E}(X)]^2 = \theta + \theta^2$.

By the Law of Total Expectation, the moments of $Y$ are a 50/50 mixture of the moments of $X$ and the constant 0:

$$\mathbb{E}(Y) = 0.5\mathbb{E}(X) + 0.5(0) = 0.5\theta$$
$$\mathbb{E}(Y^2) = 0.5\mathbb{E}(X^2) + 0.5(0) = 0.5(\theta + \theta^2)$$

We compute the variance of $Y$:

$$\text{Var}(Y) = \mathbb{E}(Y^2) - [\mathbb{E}(Y)]^2 = 0.5(\theta + \theta^2) - (0.5\theta)^2 = 0.5\theta + 0.5\theta^2 - 0.25\theta^2 = 0.25\theta^2 + 0.5\theta$$

The Coefficient of Variation (CV) is the standard deviation divided by the mean:

$$CV = \frac{\sqrt{\text{Var}(Y)}}{\mathbb{E}(Y)} = \frac{\sqrt{0.25\theta^2 + 0.5\theta}}{0.5\theta}$$

We can factor out $0.5\theta$ from inside the square root to simplify:

$$CV = \frac{\sqrt{0.25\theta^2(1 + \frac{2}{\theta})}}{0.5\theta} = \frac{0.5\theta \sqrt{1 + \frac{2}{\theta}}}{0.5\theta} = \sqrt{1 + \frac{2}{\theta}}$$

Step 2: Proving Overdispersion for Part (b)

We want to show that $\text{Var}(Y) > \mathbb{E}(Y)$. Using our derived expressions:

$$0.25\theta^2 + 0.5\theta > 0.5\theta$$

Subtracting $0.5\theta$ from both sides leaves:

$$0.25\theta^2 > 0$$

Since $\theta$ is the parameter of a Poisson distribution, $\theta > 0$, so $0.25\theta^2 > 0$ and the inequality holds strictly. Thus, $\text{Var}(Y) > \mathbb{E}(Y)$: the mixture is overdispersed relative to a Poisson distribution.


Step 3: Estimating $\theta$ for Part (c)

We are given an empirical probability of observing 0, namely $\hat{p}_0 = 64/100 = 0.64$. We equate the theoretical probability $P(Y = 0)$ to this empirical proportion, a method-of-moments-type estimator based on the zero frequency.

$Y$ can be zero in two mutually exclusive ways: either the 0.5 probability event triggered the "equals 0" condition, or it triggered the $X$ condition AND the Poisson variable drew a zero.

$$P(Y = 0) = P(\text{Forced Zero}) + P(\text{Choose } X) \times P(X = 0)$$
$$P(Y = 0) = 0.5 + 0.5(e^{-\theta})$$

Equating this to our observed sample proportion:

$$0.5 + 0.5e^{-\hat{\theta}} = 0.64$$
$$0.5e^{-\hat{\theta}} = 0.14 \implies e^{-\hat{\theta}} = 0.28$$

Taking the natural logarithm of both sides:

$$-\hat{\theta} = \ln(0.28) \implies \hat{\theta} = -\ln(0.28) = \ln\left(\frac{1}{0.28}\right) \approx 1.273$$
Final Answer / Q.E.D:
(a) The Coefficient of Variation is $\sqrt{1 + 2/\theta}$.
(b) Because $\theta > 0$, the quadratic term guarantees $\text{Var}(Y) = 0.25\theta^2 + \mathbb{E}(Y) > \mathbb{E}(Y)$.
(c) The estimate for $\theta$ is $-\ln(0.28) \approx 1.273$.
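The part (c) computation fits in two lines (variable names are my own):

```python
import math

# Invert P(Y = 0) = 0.5 + 0.5 * exp(-theta) at the observed zero frequency
p0_hat = 64 / 100
theta_hat = -math.log((p0_hat - 0.5) / 0.5)
print(round(theta_hat, 4))  # -ln(0.28) ≈ 1.273
```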

📌 Q7 Limits of Sequences via Uniform Continuity (10 Marks)

Problem Statement: Consider a function $f : (0, 1) \to \mathbb{R}$ which is differentiable on $(0, 1)$. It is also known that $|f'(x)| < M$, for all $x \in (0, 1)$, for some $M > 0$. Define $a_n = f\left(\frac{1}{n+1}\right)$, for $n \geq 1$. Does $\lim_{n \to \infty} a_n$ exist? Justify your answer.

🧠 Approach & Key Concepts

This problem evaluates the preservation of sequence convergence under a function mapping. Because the sequence inside the function, $x_n = 1/(n+1)$, converges (it tends to 0), it is a Cauchy sequence. We are given a bounded derivative, which, by the Mean Value Theorem, implies a Lipschitz condition. Lipschitz continuous functions are uniformly continuous, and a fundamental property of uniformly continuous functions is that they map Cauchy sequences to Cauchy sequences.

✍️ Step-by-Step Proof / Derivation

Step 1: Establishing the Lipschitz Condition

We are given that $f$ is differentiable on $(0,1)$ and its derivative is bounded: $|f'(x)| < M$ for all $x \in (0,1)$.

Let $x, y \in (0,1)$ with $x \neq y$. By the Mean Value Theorem, there exists some $c$ strictly between $x$ and $y$ such that:

$$\frac{f(x) - f(y)}{x - y} = f'(c)$$

Taking the absolute value of both sides and applying the bound on the derivative:

$$|f(x) - f(y)| = |f'(c)| |x - y| < M |x - y|$$

This demonstrates that $f$ is Lipschitz continuous on $(0,1)$. Every Lipschitz continuous function is uniformly continuous.

Step 2: Identifying the Cauchy Sequence

Let $x_n = \frac{1}{n+1}$. As $n \to \infty$, $x_n \to 0$. Because the sequence $(x_n)$ converges in $\mathbb{R}$, it is a Cauchy sequence. By the formal definition of a Cauchy sequence, for any given $\delta > 0$, there exists an $N$ such that for all $m, n > N$:

$$|x_n - x_m| < \delta$$

Step 3: Proving $a_n$ is a Cauchy Sequence

We want to show that $a_n = f(x_n)$ is also a Cauchy sequence. Given any $\epsilon > 0$, we choose $\delta = \epsilon / M$. Because $(x_n)$ is Cauchy, there is an $N$ such that for all $m, n > N$, $|x_n - x_m| < \epsilon / M$.

Applying the Lipschitz condition derived in Step 1 to our sequence terms:

$$|a_n - a_m| = |f(x_n) - f(x_m)| \leq M |x_n - x_m| < M \left(\frac{\epsilon}{M}\right) = \epsilon$$

This proves that the sequence $(a_n)$ is a Cauchy sequence.

Step 4: Convergence in $\mathbb{R}$

The sequence $(a_n)$ is a Cauchy sequence of real numbers. The real number line $\mathbb{R}$ is a complete metric space. A defining property of complete metric spaces is that every Cauchy sequence converges to a limit within that space.

Final Answer / Q.E.D: Yes, the limit $\lim_{n \to \infty} a_n$ exists. The bounded derivative ensures $f$ is Lipschitz, hence uniformly continuous, so it maps the Cauchy sequence $x_n = \frac{1}{n+1}$ to a sequence $(a_n)$ that is also Cauchy and therefore converges in $\mathbb{R}$.
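A numerical illustration with a hypothetical function satisfying the hypotheses, $f(x) = \sin(3x) + x^2$ with $|f'(x)| = |3\cos(3x) + 2x| \leq 5$ on $(0,1)$, shows the sequence $a_n = f(1/(n+1))$ settling toward a limit:

```python
import math

M = 5.0  # Lipschitz constant: |f'(x)| = |3cos(3x) + 2x| <= 5 on (0,1)

def f(x):
    # Hypothetical f meeting the hypotheses (differentiable, bounded derivative)
    return math.sin(3 * x) + x * x

def a(n):
    return f(1.0 / (n + 1))

# The Lipschitz bound |a_n - a_m| <= M |x_n - x_m| forces (a_n) to be Cauchy
for n in (10, 100, 1000, 10000):
    print(n, a(n))
```

Note that $f$ need not extend continuously to $[0,1]$ a priori; the proof only uses completeness of $\mathbb{R}$, though in this example the limit is visibly $f(0^+) = 0$.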

📌 Q8 Expected Number of Runs in a Discrete Uniform Sample (10 Marks)

Problem Statement: Consider a sequence of $n$ observations where each is an independent realization of a Uniform$\{0, 1, \dots, 9\}$ random variable. A "run" is a subsequence of the same integer preceded and followed by a different integer. Let $R_n$ be the number of runs. Find $\mathbb{E}(R_n)$.

🧠 Approach & Key Concepts

This problem analyzes sequential combinatorial patterns. The most elegant way to solve expected value problems involving sequences (like "runs") is using Indicator Random Variables. A new run strictly begins at the very first observation, and subsequently, a new run begins any time an observation differs from the one immediately preceding it. By calculating the expected value of these "run-start" indicators and summing them via the Linearity of Expectation, we bypass complex combinatorial counting.

✍️ Step-by-Step Proof / Derivation

Step 1: Defining the Indicator Variables

Let $X_1, X_2, \dots, X_n$ be the sequence of independent Uniform$\{0, 1, \dots, 9\}$ random variables.

We define indicator variables $I_k$ to flag the start of a new run: $I_1 = 1$ (the first observation always starts a run), and for $k \geq 2$,

$$I_k = \begin{cases} 1 & \text{if } X_k \neq X_{k-1} \\ 0 & \text{otherwise} \end{cases}$$

The total number of runs $R_n$ is simply the sum of all these indicator variables:

$$R_n = I_1 + \sum_{k=2}^n I_k = 1 + \sum_{k=2}^n I_k$$

Step 2: Calculating the Expected Value of an Indicator

Because the expectation of an indicator variable is simply the probability of the event it flags, we need to find $P(X_k \neq X_{k-1})$.

The random variables are uniformly drawn from the 10 integers $\{0, 1, \dots, 9\}$. Whatever specific integer value $X_{k-1}$ happens to be, there are exactly 9 out of 10 choices for $X_k$ that will be different. Because the draws are independent:

$$\mathbb{E}(I_k) = P(X_k \neq X_{k-1}) = \frac{9}{10} = 0.9$$

Step 3: Applying the Linearity of Expectation

We take the expected value of the total runs equation. The linearity of expectation holds regardless of any dependence among the indicators, so no independence argument is required.

$$\mathbb{E}(R_n) = \mathbb{E}\left( 1 + \sum_{k=2}^n I_k \right) = 1 + \sum_{k=2}^n \mathbb{E}(I_k)$$

Since there are $(n-1)$ terms in the summation from $k=2$ to $n$, and each term has an expected value of $0.9$:

$$\mathbb{E}(R_n) = 1 + (n - 1) \left(\frac{9}{10}\right)$$

Step 4: Simplifying the Expression

Distribute the $0.9$ through the parentheses:

$$\mathbb{E}(R_n) = 1 + 0.9n - 0.9 = 0.1 + 0.9n$$
Final Answer / Q.E.D: By decomposing the run count into run-start indicators and applying linearity of expectation, the expected number of runs in $n$ observations is $\mathbb{E}(R_n) = 0.9n + 0.1$.
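A quick simulation confirms the formula; `num_runs` is an illustrative helper that counts run starts exactly as the indicators do:

```python
import random

def num_runs(seq):
    # I_1 = 1 always; I_k = 1 whenever the value changes at position k
    return 1 + sum(seq[k] != seq[k - 1] for k in range(1, len(seq)))

rng = random.Random(42)
n, trials = 20, 50_000
mean = sum(num_runs([rng.randrange(10) for _ in range(n)])
           for _ in range(trials)) / trials
print(mean, 0.9 * n + 0.1)  # simulated mean vs. theoretical 18.1
```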

📌 Q9 The Broken Stick Problem and Triangle Areas (10 Marks)

Problem Statement: Two points are chosen independently from Uniform(0,1) distribution to divide a line of unit length into three smaller line segments.
(a) Find the probability that a triangle can be formed using the segments.
(b) If they form a triangle, find the probability that the area of the triangle will be bigger than 1/8 square unit.

🧠 Approach & Key Concepts

This is a classic geometric probability problem.

✍️ Step-by-Step Proof / Derivation

Step 1: Probability of Forming a Triangle for Part (a)

Let the two random cut points be $X, Y \sim U(0,1)$. By the symmetry between the two orderings, we may condition on $X < Y$. The three segments formed then have lengths $L_1 = X$, $L_2 = Y - X$, and $L_3 = 1 - Y$.

By the triangle inequality, the sum of any two sides must be strictly greater than the third. Since $L_1 + L_2 + L_3 = 1$, this condition is mathematically equivalent to stating that every individual segment must be strictly less than $1/2$:

$$L_1 < 1/2 \implies X < 1/2$$ $$L_2 < 1/2 \implies Y - X < 1/2 \implies Y < X + 1/2$$ $$L_3 < 1/2 \implies 1 - Y < 1/2 \implies Y > 1/2$$

The sample space for $(X,Y)$ given $X < Y$ is a triangle with vertices $(0,0), (1,1), (0,1)$ which has an area of $1/2$. The success region satisfying the inequalities above is the triangle with vertices $(0, 1/2), (1/2, 1), (1/2, 1/2)$, which has an area of $1/8$.

The probability is the ratio of the success area to the total area:

$$P(\text{Triangle}) = \frac{1/8}{1/2} = \frac{1}{4}$$

Step 2: Evaluating the Maximum Triangle Area for Part (b)

We are given that the three segments form a triangle. The perimeter is fixed at $2s = L_1 + L_2 + L_3 = 1$, so the semi-perimeter is $s = 1/2$.

By Heron's formula, the area $A$ of the triangle is:

$$A = \sqrt{s(s-L_1)(s-L_2)(s-L_3)} = \sqrt{\frac{1}{2}\left(\frac{1}{2}-L_1\right)\left(\frac{1}{2}-L_2\right)\left(\frac{1}{2}-L_3\right)}$$

To maximize this product subject to the sum constraint $(1/2-L_1) + (1/2-L_2) + (1/2-L_3) = 3/2 - 1 = 1/2$, we apply the AM-GM inequality. The product is maximized when all factors are equal, which corresponds to an equilateral triangle ($L_1 = L_2 = L_3 = 1/3$).

The maximum possible area is:

$$A_{max} = \sqrt{\frac{1}{2} \left(\frac{1}{2} - \frac{1}{3}\right)^3} = \sqrt{\frac{1}{2} \left(\frac{1}{6}\right)^3} = \sqrt{\frac{1}{432}} = \frac{\sqrt{3}}{36}$$

Let's evaluate this maximum value. Since $\sqrt{3} \approx 1.732$, $A_{max} \approx 1.732 / 36 \approx 0.0481$ square units.

Step 3: Comparing to the Threshold

The problem asks for the probability that the area is strictly greater than $1/8$. Note that $1/8 = 0.125$.

$$0.0481 \ll 0.125 \implies A_{max} < \frac{1}{8}$$

Since the maximum possible area of any such triangle is strictly less than $1/8$, the event that the area exceeds $1/8$ has probability zero.

Final Answer / Q.E.D:
(a) The probability that a triangle can be formed is $\frac{1}{4}$.
(b) Because the maximum possible area of a triangle with perimeter 1 is $\frac{\sqrt{3}}{36}$ (which is less than $1/8$), the probability that the area is greater than 1/8 is exactly $0$.
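Both parts can be sanity-checked by simulation: the triangle frequency should approach $1/4$, and no simulated triangle's Heron area should exceed $\sqrt{3}/36$ (trial count and seed below are arbitrary choices):

```python
import math
import random

rng = random.Random(0)
trials = 200_000
formed = 0
max_area = 0.0
for _ in range(trials):
    x, y = sorted((rng.random(), rng.random()))
    a, b, c = x, y - x, 1 - y
    if max(a, b, c) < 0.5:  # all sides < 1/2  <=>  a triangle forms
        formed += 1
        s = 0.5  # semi-perimeter of a unit-length stick
        area = math.sqrt(s * (s - a) * (s - b) * (s - c))
        max_area = max(max_area, area)

print(formed / trials)               # close to 1/4
print(max_area, math.sqrt(3) / 36)   # max_area never exceeds sqrt(3)/36
```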

📌 Q10 Weak Law of Large Numbers for Trigonometric Transformations (10 Marks)

Problem Statement: Let $\theta_i$ be independent and identically distributed Uniform$(0, 2\pi)$, and define $Y_i = |\sin \theta_i|$. Find the limit to which $\frac{1}{n} \sum_{i=1}^n Y_i$ converges in probability.

🧠 Approach & Key Concepts

This problem is a direct application of the Weak Law of Large Numbers (WLLN). Because the $\theta_i$ are i.i.d., the transformed variables $Y_i = g(\theta_i)$ are also i.i.d. The sample mean of $Y_i$ will converge in probability to the population expectation $\mathbb{E}(Y_1)$, provided the expectation is finite. We calculate this expectation using standard trigonometric integration.

✍️ Step-by-Step Proof / Derivation

Step 1: Applying the WLLN

By the Weak Law of Large Numbers, for a sequence of i.i.d random variables with finite expectation, the sample mean converges in probability to the true expected value:

$$\frac{1}{n} \sum_{i=1}^n Y_i \xrightarrow{p} \mathbb{E}(Y_1)$$

Step 2: Formulating the Expected Value

The variable $\theta_1$ follows a Uniform distribution on $(0, 2\pi)$. Its probability density function is $f(\theta) = \frac{1}{2\pi}$ for $0 < \theta < 2\pi$.

The expected value of $Y_1 = |\sin \theta_1|$ is:

$$\mathbb{E}(|\sin \theta_1|) = \int_0^{2\pi} |\sin \theta| \frac{1}{2\pi} d\theta$$

Step 3: Evaluating the Integral utilizing Symmetry

The function $|\sin \theta|$ has period $\pi$ and is symmetric within each period, so its integral over each of the four quarter-intervals $[0, \pi/2]$, $[\pi/2, \pi]$, $[\pi, 3\pi/2]$, $[3\pi/2, 2\pi]$ is the same. In the first quadrant $[0, \pi/2]$, $\sin \theta \geq 0$, so $|\sin \theta| = \sin \theta$.

We can exploit this symmetry to simplify the integral by evaluating it over just the first quadrant and multiplying by 4:

$$\mathbb{E}(Y_1) = \frac{4}{2\pi} \int_0^{\pi/2} \sin \theta d\theta = \frac{2}{\pi} \int_0^{\pi/2} \sin \theta d\theta$$

Compute the definite integral:

$$\int_0^{\pi/2} \sin \theta d\theta = \Big[ -\cos \theta \Big]_0^{\pi/2} = (-\cos(\pi/2)) - (-\cos(0)) = 0 - (-1) = 1$$

Step 4: Final Substitution

Substitute the evaluated integral back into the expectation formula:

$$\mathbb{E}(Y_1) = \frac{2}{\pi} \times 1 = \frac{2}{\pi}$$
Final Answer / Q.E.D: By the Weak Law of Large Numbers, the sample mean converges in probability to the population expectation. Therefore, $\frac{1}{n} \sum_{i=1}^n Y_i \xrightarrow{p} \frac{2}{\pi}$.
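A midpoint Riemann sum over $[0, 2\pi]$ reproduces the value $2/\pi \approx 0.6366$ (the grid size is an arbitrary choice):

```python
import math

# Midpoint Riemann sum for (1 / 2π) ∫_0^{2π} |sin θ| dθ
N = 200_000
total = sum(abs(math.sin(2 * math.pi * (k + 0.5) / N)) for k in range(N)) / N
print(total, 2 / math.pi)  # both ≈ 0.6366
```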

📌 Q11 Warner's Randomized Response Model (10 Marks)

Problem Statement: To estimate the probability of smoking, 300 students draw one of 40 cards with replacement. 25 cards say 'I smoke', 15 say 'I do not smoke'. Students answer 'yes' or 'no' without disclosing the card. Let $A$ be answering 'yes', $S$ be the event a student smokes.
(a) Establish a relationship between $P(A)$ and $P(S)$.
(b) If 130 say 'yes', find an estimate for $P(S)$ and check if it is unbiased.
(c) Find the variance of the estimator of $P(S)$.

🧠 Approach & Key Concepts

This problem deals with the Randomized Response Technique, used in surveys for sensitive questions. By introducing controlled random noise (the cards), respondents feel secure. We use the Law of Total Probability to link the observed "yes" answers to the true underlying parameter $P(S)$. By inverting this relationship, we construct the Method of Moments estimator and calculate its variance using basic properties of the Binomial proportion.

✍️ Step-by-Step Proof / Derivation

Step 1: Law of Total Probability for Part (a)

A student answers 'yes' ($A$) under exactly two mutually exclusive scenarios:

  1. They draw the 'I smoke' card AND they actually smoke.
  2. They draw the 'I do not smoke' card AND they do NOT smoke.

Let $p_1 = 25/40 = 5/8$ be the probability of drawing the 'I smoke' card. Let $p_2 = 15/40 = 3/8$ be the probability of drawing the 'I do not smoke' card. Assuming card drawing is independent of smoking status:

$$P(A) = P(\text{Card 1})P(S) + P(\text{Card 2})P(S^c)$$
$$P(A) = \frac{5}{8}P(S) + \frac{3}{8}(1 - P(S)) = \frac{5}{8}P(S) + \frac{3}{8} - \frac{3}{8}P(S)$$
$$P(A) = \frac{1}{4}P(S) + \frac{3}{8}$$

Step 2: Estimator and Unbiasedness for Part (b)

We are given that 130 out of 300 students said 'yes'. This gives us an unbiased estimator for $P(A)$, which is the sample proportion: $\hat{P}(A) = 130 / 300 = 13/30$.

We invert the relationship from part (a) to solve for $\hat{P}(S)$:

$$\hat{P}(A) = \frac{1}{4}\hat{P}(S) + \frac{3}{8} \implies \frac{1}{4}\hat{P}(S) = \hat{P}(A) - \frac{3}{8}$$
$$\hat{P}(S) = 4\hat{P}(A) - \frac{3}{2}$$

Plugging in our sample proportion:

$$\hat{P}(S) = 4\left(\frac{13}{30}\right) - \frac{3}{2} = \frac{52}{30} - \frac{45}{30} = \frac{7}{30}$$

To check for unbiasedness, we take the expectation of the estimator:

$$\mathbb{E}[\hat{P}(S)] = \mathbb{E}\left[ 4\hat{P}(A) - \frac{3}{2} \right] = 4\mathbb{E}[\hat{P}(A)] - \frac{3}{2}$$

Since $\hat{P}(A)$ is a sample proportion from a Binomial distribution, it is unbiased: $\mathbb{E}[\hat{P}(A)] = P(A)$. Substituting the relationship from part (a):

$$\mathbb{E}[\hat{P}(S)] = 4\left(\frac{1}{4}P(S) + \frac{3}{8}\right) - \frac{3}{2} = P(S) + \frac{3}{2} - \frac{3}{2} = P(S)$$

The estimator is exactly unbiased.
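The unbiasedness can also be checked empirically: simulate many independent surveys of 300 students under a hypothetical true $P(S)$ and average the resulting estimates. A sketch, with $P(S) = 0.3$ as an arbitrary illustrative value:

```python
import random

random.seed(7)
P_S = 0.3                        # hypothetical true smoking probability
n_students, n_reps = 300, 5000
estimates = []
for _ in range(n_reps):
    # 'yes' iff the drawn card's statement matches the student's status.
    yes = sum(
        (random.random() < P_S) == (random.random() < 5 / 8)
        for _ in range(n_students)
    )
    estimates.append(4 * (yes / n_students) - 3 / 2)   # invert part (a)

mean_est = sum(estimates) / n_reps
print(f"mean of estimates: {mean_est:.4f}  (true P(S) = {P_S})")
```

The average of the 5,000 estimates lands very close to the true $P(S)$, as the algebra above guarantees.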


Step 3: Calculating Variance for Part (c)

We apply the variance operator to the estimator equation $\hat{P}(S) = 4\hat{P}(A) - \frac{3}{2}$. The additive constant drops out, and the scale factor is squared:

$$\text{Var}(\hat{P}(S)) = 16 \cdot \text{Var}(\hat{P}(A))$$

The variance of a sample proportion for a sample of size $n$ is $\frac{P(A)(1-P(A))}{n}$. Here $n = 300$.

$$\text{Var}(\hat{P}(S)) = 16 \frac{P(A)(1-P(A))}{300} = \frac{4}{75} P(A)(1-P(A))$$

To provide a numerical estimate of this variance based on our sample, we plug in $\hat{P}(A) = 13/30$:

$$\widehat{\text{Var}}(\hat{P}(S)) = \frac{4}{75} \left(\frac{13}{30}\right)\left(\frac{17}{30}\right) = \frac{4 \times 221}{75 \times 900} = \frac{884}{67500} \approx 0.0131$$
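As a cross-check, both the point estimate and the plug-in variance can be reproduced with exact rational arithmetic; a minimal sketch:

```python
from fractions import Fraction

p_hat_A = Fraction(130, 300)                           # observed 'yes' proportion
p_hat_S = 4 * p_hat_A - Fraction(3, 2)                 # invert P(A) = P(S)/4 + 3/8
var_hat = Fraction(16, 300) * p_hat_A * (1 - p_hat_A)  # plug-in variance

print(p_hat_S)                  # → 7/30
print(var_hat, float(var_hat))
```

Working in `Fraction` avoids any rounding in the intermediate steps; the reduced variance fraction equals $\frac{884}{67500}$.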

Final Answer / Q.E.D:
(a) The relationship is $P(A) = \frac{1}{4}P(S) + \frac{3}{8}$.
(b) The estimate is $\hat{P}(S) = \frac{7}{30}$ (approx 23.33%), and the estimator is unbiased.
(c) The theoretical variance is $\frac{16 P(A)(1-P(A))}{300}$, which evaluates to an estimated variance of $\approx 0.0131$.

📌 Q12 Maximum Likelihood Estimation of Shooter Probabilities (10 Marks)

Problem Statement: A novice shooter hits a target with probability $\theta$ ($\theta > 0.5$) at 10m, and probability $1-\theta$ at 25m. In a practice session, they shoot from both distances. Total hits per pair are recorded over 100 trials: 0 hits (freq 23), 1 hit (freq 58), 2 hits (freq 19). Find the Maximum Likelihood Estimate (MLE) of $\theta$.

🧠 Approach & Key Concepts

This is a Multinomial Likelihood problem. A single trial consists of a pair of independent shots (one at 10m, one at 25m). The number of hits in a pair, $Z \in \{0, 1, 2\}$, is the sum of two independent but non-identically distributed Bernoulli variables. We must derive the probability mass function of $Z$ in terms of $\theta$, construct the log-likelihood function for the grouped data, and optimize it with standard calculus, taking care to enforce the constraint $\theta > 0.5$.

✍️ Step-by-Step Proof / Derivation

Step 1: Deriving the Probabilities for $Z$

Let $X_{10} \sim \text{Bernoulli}(\theta)$ be the hit indicator at 10m and $X_{25} \sim \text{Bernoulli}(1-\theta)$ the hit indicator at 25m, with the two shots independent. The total number of hits per pair is $Z = X_{10} + X_{25}$. Enumerating the outcomes:

$$P(Z=0) = (1-\theta)\big(1 - (1-\theta)\big) = \theta(1-\theta)$$
$$P(Z=1) = \theta \cdot \theta + (1-\theta)(1-\theta) = \theta^2 + (1-\theta)^2 = 2\theta^2 - 2\theta + 1$$
$$P(Z=2) = \theta \cdot (1-\theta)$$

Note the symmetry $P(Z=0) = P(Z=2)$, and that $P(Z=1) = 1 - 2\theta(1-\theta)$.
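A quick numerical check that these three probabilities form a valid distribution, using $\theta = 0.7$ as an arbitrary test value:

```python
# PMF of Z = X10 + X25 (independent shots with hit probs theta and 1 - theta).
theta = 0.7                         # arbitrary test value for this sketch
p0 = (1 - theta) * theta            # miss at 10m AND miss at 25m
p1 = theta**2 + (1 - theta)**2      # hit only at 10m, or hit only at 25m
p2 = theta * (1 - theta)            # hit at both distances
assert abs(p0 + p1 + p2 - 1) < 1e-12

expected = [100 * p for p in (p0, p1, p2)]   # expected counts in 100 pairs
print(expected)                              # compare with observed 23, 58, 19
```

At $\theta = 0.7$ the expected frequencies over 100 pairs are $(21, 58, 21)$, already close to the observed $(23, 58, 19)$.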

Step 2: Constructing the Likelihood Function

Let $n_0 = 23$, $n_1 = 58$, and $n_2 = 19$. The likelihood function is proportional to:

$$L(\theta) \propto [P(Z=0)]^{n_0} [P(Z=1)]^{n_1} [P(Z=2)]^{n_2}$$

Substitute the derived probabilities:

$$L(\theta) \propto [\theta(1-\theta)]^{23} [2\theta^2 - 2\theta + 1]^{58} [\theta(1-\theta)]^{19}$$

Combine the exponents for the identical $Z=0$ and $Z=2$ terms ($23 + 19 = 42$):

$$L(\theta) \propto [\theta(1-\theta)]^{42} [1 - 2\theta(1-\theta)]^{58}$$

Step 3: Reparameterization and Optimization

Notice that the likelihood depends on $\theta$ only through the product $p = \theta(1-\theta)$. We can rewrite the likelihood in terms of $p$ to simplify the calculus:

$$L(p) \propto p^{42} (1 - 2p)^{58}$$

Take the natural logarithm to form the log-likelihood:

$$\ln L(p) = 42 \ln(p) + 58 \ln(1 - 2p)$$

Take the derivative with respect to $p$ and set it to 0 to find the maximum:

$$\frac{d}{dp} \ln L(p) = \frac{42}{p} - \frac{116}{1 - 2p} = 0$$

Cross-multiply to solve for $p$:

$$42(1 - 2p) = 116p \implies 42 - 84p = 116p \implies 200p = 42 \implies p = 0.21$$
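The stationary point can be cross-checked numerically by maximizing $\ln L(p)$ over a fine grid on $(0, \tfrac{1}{2})$; a minimal sketch:

```python
import math

def log_lik(p):
    # ln L(p) = 42 ln(p) + 58 ln(1 - 2p), defined for 0 < p < 1/2
    return 42 * math.log(p) + 58 * math.log(1 - 2 * p)

grid = [i / 10_000 for i in range(1, 5_000)]   # p in (0, 0.5)
p_hat = max(grid, key=log_lik)
print(p_hat)   # → 0.21
```

Because $\ln L(p)$ is strictly concave on $(0, \tfrac{1}{2})$, the grid maximum coincides with the unique stationary point $p = 0.21$.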

Step 4: Solving for $\theta$

We substitute $p = 0.21$ back into our reparameterization identity:

$$\theta(1-\theta) = 0.21 \implies \theta^2 - \theta + 0.21 = 0$$

This is a standard quadratic equation. Factoring it:

$$(\theta - 0.7)(\theta - 0.3) = 0$$

The roots are $\theta = 0.7$ and $\theta = 0.3$. Since the problem stipulates $\theta > 0.5$, we reject the lower root.

Final Answer / Q.E.D: By framing the likelihood around the symmetric product parameter $\theta(1-\theta)$ and rejecting the invalid root, the exact Maximum Likelihood Estimate is $\hat{\theta} = 0.7$.
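The whole derivation can be verified end to end by maximizing the multinomial log-likelihood directly in $\theta$ over the admissible range $(\tfrac{1}{2}, 1)$; a sketch using a simple grid search:

```python
import math

counts = {0: 23, 1: 58, 2: 19}        # observed frequencies of Z

def log_lik(theta):
    pmf = {0: theta * (1 - theta),
           1: theta**2 + (1 - theta)**2,
           2: theta * (1 - theta)}
    return sum(n * math.log(pmf[z]) for z, n in counts.items())

grid = [0.5 + i / 10_000 for i in range(1, 5_000)]   # theta in (0.5, 1)
theta_hat = max(grid, key=log_lik)
print(f"theta_hat = {theta_hat:.4f}")
```

The direct search over $\theta$ recovers the same answer as the reparameterization argument, $\hat{\theta} = 0.7$.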

📚 Paper Summary & Key Focus Areas

🎯 Core Concepts Tested in This Paper