3 votes
If y1, y2, ..., yn denote a random sample from a geometric distribution with parameter p, show that ȳ (the sample mean) is sufficient for p.

1 Answer

3 votes

Before we start with the solution, let's recap a few basic concepts.

In probability theory and statistics, the geometric distribution is either of two discrete probability distributions:

- The probability distribution of the number X of Bernoulli trials needed to get one success, supported on the set { 1, 2, 3, ...}

- The probability distribution of the number Y = X − 1 of failures before the first success, supported on the set { 0, 1, 2, 3, ... }

Both of these distributions satisfy the property of being memoryless.
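As a quick sanity check on the two definitions, here is a minimal Python sketch. The function names (`pmf_trials`, `pmf_failures`, `tail_trials`) and the value p = 1/4 are illustrative choices, not part of the original problem. It uses exact fractions to confirm that the two mass functions are shifts of each other and that the trials version is memoryless.

```python
from fractions import Fraction

def pmf_trials(k, p):
    """P(X = k): k trials needed to get the first success, k = 1, 2, 3, ..."""
    return (1 - p) ** (k - 1) * p

def pmf_failures(k, p):
    """P(Y = k): k failures before the first success, k = 0, 1, 2, ..."""
    return (1 - p) ** k * p

def tail_trials(k, p):
    """P(X > k) = (1 - p)^k: the first k trials are all failures."""
    return (1 - p) ** k

p = Fraction(1, 4)  # illustrative value, chosen arbitrarily

# Y = X - 1, so the two mass functions differ only by a shift of one.
shift_ok = all(pmf_failures(k, p) == pmf_trials(k + 1, p) for k in range(20))

# Memorylessness: P(X > s + t | X > s) equals P(X > t).
memoryless_ok = all(
    tail_trials(s + t, p) / tail_trials(s, p) == tail_trials(t, p)
    for s in range(1, 6)
    for t in range(1, 6)
)

print(shift_ok, memoryless_ok)
```

The proof below works with the first definition (support starting at 1).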

A statistic T(X1, ..., Xn) is said to be a sufficient statistic for a parameter θ if the conditional probability distribution of the data, given the statistic, does not depend on the parameter. That's what we want to prove for a geometrically distributed random sample and its sum.

Let's start the proof:

First, the relevant probability mass function is given by
P(Y = y) = (1 - p)^(y-1) * p, for y = 1, 2, 3, ...

Therefore, since the observations in a random sample are independent, the joint probability mass function of our sample y1, y2, ..., yn is the product of the individual mass functions:

f(y1, y2, ..., yn; p) = (1 - p)^(y1 + y2 + ... + yn - n) * p^n

which equals [p/(1 - p)]^n * (1 - p)^(y1 + y2 + ... + yn)
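For readers who prefer to see the algebra spelled out, the same computation can be written step by step. This is only a restatement of the two expressions above in LaTeX, assuming independent observations with support starting at 1:

```latex
\begin{align*}
f(y_1, \dots, y_n; p)
  &= \prod_{i=1}^{n} (1 - p)^{y_i - 1}\, p \\
  &= p^{n}\, (1 - p)^{\sum_{i=1}^{n} y_i \; - \; n} \\
  &= \left[ \frac{p}{1 - p} \right]^{n} (1 - p)^{\sum_{i=1}^{n} y_i}
\end{align*}
```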

Now, you could tell me that this expression seems to depend not only on the sum T = y1 + y2 + ... + yn of the observations, but also on the number n of addends. However, n is the sample size, which is known, so the joint probability mass function depends on the data only through T. By the factorization theorem, a statistic T is sufficient for p if the joint probability mass function can be written as g(T, p) * h(y1, ..., yn), where h does not involve p. Here g(T, p) = [p/(1 - p)]^n * (1 - p)^T and h = 1.

Hence, T = y1 + y2 + ... + yn is a sufficient statistic for the parameter p in a geometrically distributed random sample. Since the sample mean T/n is a one-to-one function of T, it is sufficient for p as well.
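The definition of sufficiency can also be checked numerically on a small case. The sketch below is a brute-force illustration, not part of the proof. The helper names and the values n = 3, total = 7, p = 1/5 and p = 7/10 are arbitrary assumptions. It lists every sample with a given sum, computes the conditional probability of each sample given that sum, and compares the result for two different values of p.

```python
from fractions import Fraction
from itertools import product
from math import comb

def joint_pmf(sample, p):
    """Joint pmf of independent geometric observations (support 1, 2, 3, ...)."""
    prob = Fraction(1)
    for y in sample:
        prob *= (1 - p) ** (y - 1) * p
    return prob

def conditional_given_sum(n, total, p):
    """Return {sample: P(sample | y1 + ... + yn = total)} by enumeration."""
    samples = [s for s in product(range(1, total + 1), repeat=n) if sum(s) == total]
    weights = {s: joint_pmf(s, p) for s in samples}
    norm = sum(weights.values())  # P(sum = total)
    return {s: w / norm for s, w in weights.items()}

n, total = 3, 7  # illustrative sample size and observed sum
cond_a = conditional_given_sum(n, total, Fraction(1, 5))
cond_b = conditional_given_sum(n, total, Fraction(7, 10))

# The conditional distribution is identical for both values of p ...
same = cond_a == cond_b
# ... and is uniform over the C(total - 1, n - 1) samples with that sum.
uniform = all(v == Fraction(1, comb(total - 1, n - 1)) for v in cond_a.values())

print(same, uniform)
```

Because every sample with the same sum has the same joint probability, the parameter p cancels when normalizing, which is exactly what the definition of a sufficient statistic requires.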

Note: As a complement, it's worth noting that the sum of n independent geometrically distributed random variables with the same parameter p follows a negative binomial distribution. With the probability mass function used above, its expectation is n/p and its variance is n(1 - p)/p^2, so for a known n both are functions of the parameter p only.
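The negative binomial claim can be verified exactly for a small case. In the sketch below, the function names and the values n = 3, p = 1/3, max_total = 12 are illustrative assumptions. It convolves the geometric mass function with itself and compares the result with the negative binomial formula C(t - 1, n - 1) * p^n * (1 - p)^(t - n) for the number of trials needed to reach n successes.

```python
from fractions import Fraction
from math import comb

def geometric_pmf(y, p):
    """P(Y = y) = (1 - p)^(y - 1) * p for y = 1, 2, 3, ..."""
    return (1 - p) ** (y - 1) * p

def sum_pmf_by_convolution(n, p, max_total):
    """Exact pmf of y1 + ... + yn for totals 0..max_total, by repeated convolution."""
    single = [Fraction(0)] + [geometric_pmf(y, p) for y in range(1, max_total + 1)]
    dist = single[:]
    for _ in range(n - 1):
        new = [Fraction(0)] * (max_total + 1)
        for a, pa in enumerate(dist):
            for b, pb in enumerate(single):
                if a + b <= max_total:
                    new[a + b] += pa * pb
        dist = new
    return dist

def negative_binomial_pmf(t, n, p):
    """P(T = t): t trials needed to reach n successes, t = n, n + 1, ..."""
    if t < n:
        return Fraction(0)
    return comb(t - 1, n - 1) * p ** n * (1 - p) ** (t - n)

n, p, max_total = 3, Fraction(1, 3), 12  # illustrative values
dist = sum_pmf_by_convolution(n, p, max_total)
matches = all(dist[t] == negative_binomial_pmf(t, n, p) for t in range(max_total + 1))

print(matches)
```

Truncating at `max_total` does not affect the comparison, since a total of t can only be formed from terms no larger than t.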

by User Dewd (8.5k points)