0 votes
Suppose that we want to select between two prediction models, M₁ and M₂. We have performed 10 rounds of 10-fold cross-validation on each model, where the same data partitioning in round i is used for both M₁ and M₂. The error rates obtained from M₁ are 30.5, 32.2, 20.7, 20.6, 31.0, 41.0, 27.7, 26.0, 21.5, 26.0. The error rates for M₂ are 22.4, 14.5, 22.4, 19.6, 20.7, 20.4, 22.1, 19.4, 16.2, 35.0.

Comment on whether one model is significantly better than the other considering a significance level of 1%.
Hint: Make yourself familiar with the Student's t-test.

User Merlino
by
8.3k points

1 Answer

4 votes

Final answer:

Using Student's t-test (two-tailed, 1% significance level), we test whether the mean error rates of the two prediction models differ. The magnitude of the t-value is smaller than the critical value, so we fail to reject the null hypothesis: there is no significant difference between the models at the 1% level.

Step-by-step explanation:

To determine whether one model is significantly better than the other, we can use Student's t-test. Because the same data partitioning is used for both models in each round, the error rates are paired, so the appropriate version is the paired t-test on the per-round differences dᵢ = error(M₁) - error(M₂). First, calculate the mean error rate for each model: 27.72% for M₁ and 21.27% for M₂. The ten differences are 8.1, 17.7, -1.7, 1.0, 10.3, 20.6, 5.6, 6.6, 5.3, -9.0, with mean 6.45 and sample standard deviation of about 8.70. Then calculate the t-value using the formula t = mean(d) / (std(d) / sqrt(n)), where n is the number of rounds (10 in this case). This gives t ≈ 6.45 / 2.751 ≈ 2.34.

Now we compare the t-value with the critical value for a significance level of 1% (two-tailed test). With n - 1 = 9 degrees of freedom and alpha = 0.01, the critical value is approximately ±3.250. Since the t-value (2.34) falls within the range of ±3.250, we fail to reject the null hypothesis. This means there is not enough evidence to conclude that one model is significantly better than the other at the 1% level, even though M₂ has the lower mean error rate.
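
To check the arithmetic, here is a minimal Python sketch of a paired t-test on the error rates from the question. It assumes pairing is appropriate because both models share the same partitioning in each round. The function name paired_t_statistic and the constant T_CRITICAL are illustrative names, and the critical value is hard-coded from a standard t-table (two-tailed, alpha = 0.01, 9 degrees of freedom) rather than computed.

```python
import math
import statistics

# Error rates (%) from the question, one value per cross-validation round.
m1 = [30.5, 32.2, 20.7, 20.6, 31.0, 41.0, 27.7, 26.0, 21.5, 26.0]
m2 = [22.4, 14.5, 22.4, 19.6, 20.7, 20.4, 22.1, 19.4, 16.2, 35.0]


def paired_t_statistic(a, b):
    """Return (t, degrees of freedom) for a paired t-test on a - b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (divides by n - 1)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Assumption: two-tailed critical value for alpha = 0.01 and 9 degrees of
# freedom, copied from a t-table (not computed here).
T_CRITICAL = 3.250

t_stat, dof = paired_t_statistic(m1, m2)
significant = abs(t_stat) > T_CRITICAL
print(f"t = {t_stat:.3f}, df = {dof}, significant at 1%: {significant}")
```

If SciPy is available, scipy.stats.ttest_rel(m1, m2) should give the same t-value together with a p-value, which can be compared directly with 0.01 instead of looking up a critical value.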

User Khaled Jamal
by
8.1k points