A 10-year study conducted by the American Heart Association provided data on how age, blood pressure, and smoking relate to the risk of strokes. Data from a portion of this study are contained in the Excel Online file below. "Risk" is interpreted as the probability (times 100) that a person will have a stroke over the next 10-year period. For the smoker variable, 1 indicates a smoker, and 0 indicates a nonsmoker.

Part A:
Open a spreadsheet and perform the following tasks:

a. Develop an estimated regression equation that can be used to predict the risk of stroke given the age and blood pressure level. Please report R^2 and the adjusted R-squared (R-sq adj) between 0 and 1. The regression equation should be in the form "Risk = (to 2 decimals) Age + Pressure." Provide the values for R^2, R-sq adj, and round them to 3 decimals each.

Part B:
b. Consider adding two independent variables to the model developed in part (a): one for the interaction between age and blood pressure level and the other for whether the person is a smoker. Develop an estimated regression equation using these four independent variables. Again, report R^2 and the adjusted R-squared (R-sq adj) between 0 and 1. The regression equation should be in the form "Risk = (to 2 decimals) Age + Pressure + Smoker + AgePress." Provide the values for R^2, R-sq adj, and round them to 3 decimals each.

Part C:
c. At a 0.05 level of significance, test to see whether the addition of the interaction term and the smoker variable contribute significantly to the estimated regression equation developed in part (a). Calculate the value of the F test statistic (round to 4 decimals) and the p-value (round to 4 decimals). Determine if the addition of the two independent variables is statistically significant.

User Vesnog
by
7.6k points

1 Answer

3 votes

Final answer:

The task is to fit a regression model that predicts stroke risk from age and blood pressure, extend it with a smoker dummy variable and an age-pressure interaction term, and use an F-test to check whether the two added variables are statistically significant.

Step-by-step explanation:

The question asks for two regression fits that predict stroke risk and a test of whether the extra variables in the second fit are statistically significant.

Part A: fit a multiple linear regression (there are two predictors, so this is not a simple regression) with Age and Pressure as independent variables. The equation has the form Risk = a + b1(Age) + b2(Pressure), where a is the intercept and b1 and b2 are the coefficients for Age and Pressure.

Part B: extend the model with a dummy variable for smokers and an interaction term, AgePress = Age × Pressure. The equation becomes Risk = a + b1(Age) + b2(Pressure) + b3(Smoker) + b4(AgePress).

Part C: use a partial F-test that compares the Part A (reduced) model with the Part B (full) model. With n observations, F = [(SSE_reduced - SSE_full) / 2] / [SSE_full / (n - 5)], with 2 and n - 5 degrees of freedom. If the p-value is below 0.05, reject the hypothesis that b3 = b4 = 0 and conclude that the added variables contribute significantly.
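For anyone who wants to check the spreadsheet output, here is a minimal Python sketch of the same three steps. The study data is not reproduced in this post, so the numbers below are synthetic and purely illustrative; the helper names fit_ols and partial_f_test, the sample size, and the coefficients used to generate the data are all assumptions, not values from the study. Replace the synthetic arrays with the real columns to get the actual answers.

```python
import numpy as np

def fit_ols(predictors, y):
    """Least-squares fit with an intercept.
    Returns (coefficients, SSE, R^2, adjusted R^2)."""
    n = len(y)
    design = np.column_stack([np.ones(n), predictors])  # prepend intercept column
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    sse = float(resid @ resid)
    sst = float(((y - y.mean()) ** 2).sum())
    p = design.shape[1] - 1                              # number of predictors
    r2 = 1 - sse / sst
    r2_adj = 1 - (1 - r2) * (n - 1) / (n - p - 1)
    return beta, sse, r2, r2_adj

def partial_f_test(sse_reduced, sse_full, n, p_full):
    """F statistic and p-value for adding exactly two predictors.
    The closed-form p-value below holds only for 2 numerator
    degrees of freedom, which is the case in Part C."""
    df2 = n - p_full - 1
    f_stat = ((sse_reduced - sse_full) / 2) / (sse_full / df2)
    p_value = (1 + 2 * f_stat / df2) ** (-df2 / 2)
    return f_stat, p_value

# Synthetic placeholder data (NOT the study data).
rng = np.random.default_rng(0)
n = 20
age = rng.uniform(55, 90, size=n)
pressure = rng.uniform(110, 220, size=n)
smoker = (np.arange(n) % 2).astype(float)   # alternate 0/1 so both groups appear
risk = -90 + 1.0 * age + 0.25 * pressure + 9.0 * smoker + rng.normal(0, 5, size=n)

# Part A: Age and Pressure only.
x_reduced = np.column_stack([age, pressure])
beta_red, sse_red, r2_red, r2_adj_red = fit_ols(x_reduced, risk)

# Part B: add Smoker and the Age x Pressure interaction.
x_full = np.column_stack([age, pressure, smoker, age * pressure])
beta_full, sse_full, r2_full, r2_adj_full = fit_ols(x_full, risk)

# Part C: partial F-test for the two added variables.
f_stat, p_value = partial_f_test(sse_red, sse_full, n, p_full=4)

print("Part A:", np.round(beta_red, 2), round(r2_red, 3), round(r2_adj_red, 3))
print("Part B:", np.round(beta_full, 2), round(r2_full, 3), round(r2_adj_full, 3))
print("Part C: F =", round(f_stat, 4), "p =", round(p_value, 4))
```

The p-value shortcut works because an F distribution with 2 numerator degrees of freedom has a closed-form tail probability. For any other number of added variables, a general routine such as scipy.stats.f.sf(f_stat, q, df2) would be the safer choice.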

User Zayra
by
7.3k points