An important first step before running a regression model is to compile a comprehensive list of potential predictor variables. How can we reduce the list to a smaller list of predictor variables?

A. The best approach may be to do nothing
B. We must include all relevant variables
C. Use the adjusted R² criterion to reduce the list
D. We use R to make the necessary correction


1 Answer


Final answer:

To reduce a list of potential predictor variables for a regression model, use the adjusted R-squared criterion, supported by exploratory steps such as drawing a scatter plot and calculating the least-squares line and correlation coefficient. The correct option is C: use the adjusted R-squared criterion to refine the list.

Step-by-step explanation:

To reduce the list of potential predictor variables for a regression model, several techniques can be used. One common approach is the adjusted R-squared criterion, which penalizes the addition of predictors and thus helps prevent overfitting. Another is stepwise selection, where variables are added or removed according to criteria such as the Akaike information criterion (AIC) or the Bayesian information criterion (BIC).
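
As a rough illustration (not part of the original answer), here is a minimal Python sketch of how candidate subsets of predictors could be compared on adjusted R-squared, AIC, and BIC using statsmodels; the variable names and data are invented for the example.

```python
# Hypothetical sketch: compare candidate predictor subsets by adjusted R-squared.
import itertools
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200
# Synthetic data: x1 and x2 influence y, x3 is pure noise.
df = pd.DataFrame({
    "x1": rng.normal(size=n),
    "x2": rng.normal(size=n),
    "x3": rng.normal(size=n),
})
df["y"] = 3 + 2 * df["x1"] - 1.5 * df["x2"] + rng.normal(scale=1.0, size=n)

candidates = ["x1", "x2", "x3"]
results = []
for k in range(1, len(candidates) + 1):
    for subset in itertools.combinations(candidates, k):
        X = sm.add_constant(df[list(subset)])
        fit = sm.OLS(df["y"], X).fit()
        results.append((subset, fit.rsquared_adj, fit.aic, fit.bic))

# Higher adjusted R-squared is better; lower AIC/BIC is better.
for subset, adj_r2, aic, bic in sorted(results, key=lambda r: -r[1]):
    print(f"{subset}: adj R2 = {adj_r2:.3f}, AIC = {aic:.1f}, BIC = {bic:.1f}")
```

Exhaustively comparing subsets like this is only feasible for a small number of candidates; with many predictors, stepwise or regularized selection is the more practical route.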

When preparing for variable selection, it is crucial to:

  • Determine which variable is independent (predictor) and which is dependent (outcome).
  • Draw a scatter plot to visualize the relationship between variables.
  • Analyze the scatter plot to decide if there appears to be a relationship.
  • Calculate the least-squares line and put the equation in the form ŷ = a + bx.
  • Compute and assess the significance of the correlation coefficient (a brief code sketch of these steps follows this list).
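
As a rough sketch of these preparation steps (using made-up data; SciPy and Matplotlib are an assumed toolset, not something specified above), the example below draws a scatter plot, computes the least-squares line ŷ = a + bx, and tests the significance of the correlation coefficient.

```python
# Hypothetical sketch: scatter plot, least-squares line, and correlation test.
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=50)                       # independent (predictor) variable
y = 2.1 + 0.75 * x + rng.normal(scale=1.0, size=50)   # dependent (outcome) variable

# Least-squares fit: intercept a, slope b, correlation r and its p-value.
fit = stats.linregress(x, y)
print(f"y-hat = {fit.intercept:.2f} + {fit.slope:.2f}x")
print(f"r = {fit.rvalue:.3f}, p-value = {fit.pvalue:.4f}")

# Scatter plot with the fitted line to judge whether a linear relationship is plausible.
plt.scatter(x, y, label="data")
plt.plot(np.sort(x), fit.intercept + fit.slope * np.sort(x), color="red", label="least-squares line")
plt.xlabel("x (predictor)")
plt.ylabel("y (outcome)")
plt.legend()
plt.show()
```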

Once a significant correlation has been verified, you can use the least-squares regression line to make informed predictions for new values of the predictor.
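For instance, with a hypothetical fitted line ŷ = 2.1 + 0.75x and a significant correlation, a new observation with x = 10 would be predicted as ŷ = 2.1 + 0.75(10) = 9.6.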

Regarding the original question, the correct option is C: use the adjusted R-squared criterion to reduce the list of predictor variables. Although relevant variables should not be dropped arbitrarily, including too many predictors can lead to overfitting, so the list usually needs to be refined with appropriate statistical tools.
