162k views
0 votes
if you sample k out of m features, what is expected number of turns until all features have been sampled

User Danthelion
by
8.5k points

1 Answer

4 votes

Final answer:

The question is about the expected number of turns to sample all features from a set, known as the coupon collector's problem. For sampling one feature at a time, the expected number of turns is the product of the number of features and the corresponding harmonic number. For more than one feature sampled at a time, computational or complex combinatorial methods might be needed to approximate or calculate the exact expectation.

Step-by-step explanation:

The question you're asking relates to a problem in probability often referred to as the coupon collector's problem. The crux of the problem is figuring out the expected number of trials (or turns) to collect a complete set of 'coupons', which, in your case, correspond to the m different features. Each time you sample without replacement k out of the m features, you get a subset of the features, and this process repeats until you've collected all features at least once.

To solve the problem, we look at the expected turns E(T) to collect all m features. It is not simple to provide an exact formula for E(T) when k features are sampled at once without replacement because it depends on how the subsets of k features overlap as you sample. If k was 1, then E(T) would be m * H_m, where H_m is the m-th harmonic number. However, with k > 1, the calculations become more complex, involving combinations and probabilities that depend on what has already been chosen in previous turns.

To reach an expected value, one could use computational methods like simulations to approximate the answer when k > 1 is involved, or delve into more complicated combinatorial arguments for an exact solution. It's important to note that this is a non-trivial problem that typically falls within the realm of higher-level mathematics or statistics courses.