a) LOOCV is not random, because you fit all n possible models that each leave out a single observation. k-fold CV is random, because we randomly split the data into k chunks. How many models would we need to fit to exhaustively obtain the MSPE for every possible excluded chunk of size n/k?

b) For the dataset "Puromycin", manually carry out LOOCV to compare two models predicting rate: one with conc, and one with conc and state. (Don't forget that you can subset data via something like data[1,] to look at only the first row, or data[-1,] to remove the first row. You may find a loop (help("for")) useful, but it is not necessary.) Submit the output of your R Markdown. Clearly report the RMSPE_LOO for both models and determine which model LOOCV chooses.
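As a minimal illustration of the subsetting hint only (object names here are placeholders, not part of the assignment), holding out one row and predicting it looks like this:

```r
Puromycin[1, ]                                      # only the first row
fit1  <- lm(rate ~ conc, data = Puromycin[-1, ])    # fit without the first row
pred1 <- predict(fit1, newdata = Puromycin[1, ])    # predict the held-out row
# A for loop (see help("for")) can repeat this for rows 1..n.
```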

1 Answer


Answer:

1. Randomly divide the available set of observations into two parts: a training set and a validation set (or hold-out set).
2. Fit the model on the training set.
3. Use the resulting fitted model to predict the responses for the observations in the validation set.
4. The resulting validation-set error rate is typically assessed using the MSE in the case of a quantitative response. This provides an estimate of the test error rate.
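
For LOOCV specifically, the validation set is a single observation, so steps 1-4 are repeated n times (once per left-out row) and the n prediction errors are pooled. Below is a minimal R sketch for part b), assuming the built-in Puromycin data; the helper name loocv_rmspe and the other object names are illustrative, not prescribed by the question.

```r
data(Puromycin)

loocv_rmspe <- function(formula, data) {
  n <- nrow(data)
  errors <- numeric(n)
  for (i in 1:n) {
    fit  <- lm(formula, data = data[-i, ])       # fit with observation i left out
    pred <- predict(fit, newdata = data[i, ])    # predict the held-out row
    errors[i] <- data$rate[i] - pred             # prediction error for row i
  }
  sqrt(mean(errors^2))                           # RMSPE_LOO
}

rmspe_conc       <- loocv_rmspe(rate ~ conc,         Puromycin)
rmspe_conc_state <- loocv_rmspe(rate ~ conc + state, Puromycin)

c(conc_only = rmspe_conc, conc_and_state = rmspe_conc_state)
# LOOCV chooses the model with the smaller RMSPE_LOO.
```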

answered by Dd Pp (5.3k points)