3 votes
The Ladies Professional Golfers Association (LPGA) maintains statistics on performance and earnings for members of the LPGA Tour. Year-end performance statistics for the 30 players who had the highest total earnings in LPGA Tour events for 2005 appear on the data disk in the file named LPGATour2 (www.lpga.com, 2006). Earnings ($1000) is the total earnings in thousands of dollars; Scoring Avg. is the average score for all events; Drive Average is the average length of a player's drive in yards; Greens in Reg. is the percentage of time a player is able to hit the green in regulation; Putting Avg. is the average number of putts taken on greens hit in regulation; and Sand Saves is the percentage of time a player is able to get "up and down" once in a greenside sand bunker. A green is considered hit in regulation if any part of the ball is touching the putting surface and the difference between the value of par for the hole and the number of strokes taken to hit the green is at least 2. Let DriveGreens denote a new independent variable that represents the interaction between the average length of a player's drive and the percentage of time a player is able to hit the green in regulation. Use the methods in this section to develop the best estimated multiple regression equation for predicting a player's average score for all events.

Player | Scoring Avg. | Drive Average | Greens in Reg. | Putting Avg. | Sand Saves | DriveGreens
Annika Sorenstam | 69.33 | 263 | 0.772 | 1.75 | 0.595 | 203.036
Paula Creamer | 70.98 | 248.6 | 0.727 | 1.75 | 0.468 | 180.732
Cristie Kerr | 70.86 | 255.5 | 0.722 | 1.76 | 0.362 | 184.471
Lorena Ochoa | 71.39 | 261.7 | 0.697 | 1.75 | 0.31 | 182.405
Jeong Jang | 71.17 | 244.8 | 0.71 | 1.79 | 0.485 | 173.808
Natalie Gulbis | 71.24 | 252.9 | 0.709 | 1.78 | 0.343 | 179.306
Meena Lee | 72.32 | 238.2 | 0.686 | 1.82 | 0.422 | 163.405
Hee-Won Han | 71.31 | 241.7 | 0.707 | 1.78 | 0.444 | 170.882
Gloria Park | 71.43 | 242 | 0.7 | 1.79 | 0.426 | 169.4
Catriona Matthew | 71.46 | 251.4 | 0.696 | 1.78 | 0.443 | 174.974
Candie Kung | 71.52 | 247.7 | 0.702 | 1.85 | 0.393 | 173.885
Marisa Baena | 71.92 | 251.1 | 0.684 | 1.79 | 0.446 | 171.752
Birdie Kim | 73.16 | 240 | 0.679 | 1.86 | 0.386 | 162.96
Soo-Yun Kang | 71.8 | 241.2 | 0.631 | 1.77 | 0.581 | 152.197
Lorie Kane | 72.28 | 245.7 | 0.718 | 1.84 | 0.475 | 176.413
Heather Bowie | 71.46 | 258.3 | 0.742 | 1.82 | 0.455 | 191.659
Wendy Ward | 72.14 | 246.7 | 0.707 | 1.81 | 0.413 | 174.417
Pat Hurst | 71.47 | 259.3 | 0.709 | 1.77 | 0.36 | 183.844
Christina Kim | 71.66 | 254 | 0.718 | 1.82 | 0.307 | 182.372
Rosie Jones | 71.58 | 230.9 | 0.662 | 1.8 | 0.435 | 152.856
Carin Koch | 71.59 | 250.2 | 0.699 | 1.79 | 0.408 | 174.89
Liselotte Neumann | 71.47 | 249.1 | 0.679 | 1.81 | 0.322 | 169.139
Mi Hyun Kim | 71.65 | 237.4 | 0.674 | 1.8 | 0.25 | 160.008
Juli Inkster | 71.33 | 251.2 | 0.701 | 1.79 | 0.375 | 176.091
Michele Redman | 71.59 | 244.6 | 0.686 | 1.81 | 0.386 | 167.796
Jennifer Rosales | 71.85 | 252.1 | 0.705 | 1.81 | 0.417 | 177.731
Karrie Webb | 71.52 | 256.2 | 0.709 | 1.81 | 0.353 | 181.646
Sophie Gustafson | 72.59 | 269.2 | 0.651 | 1.81 | 0.389 | 175.249
Young Kim | 71.7 | 250.7 | 0.678 | 1.79 | 0.292 | 169.975
Karine Icher | 72.13 | 244 | 0.728 | 1.76 | 0.222 | 177.632

Apply all three One-Variable-at-a-Time procedures. Show your work for any calculations, and include the Minitab output if you used Minitab.
1). Develop the estimated multiple regression equation for predicting a player’s average score for all events by using the three One-Variable-at-a-Time procedures (that is, the forward selection, backward elimination, and stepwise procedures).
2). Are the estimated regression equations the same among the three methods?

by User Drmuelr (7.7k points)

1 Answer

5 votes

Final answer:

To predict a player's average score using the LPGA Tour data, compute the interaction term DriveGreens (Drive Average times Greens in Reg.), then run the forward selection, backward elimination, and stepwise procedures on the candidate predictors and compare the equations they produce.

Step-by-step explanation:

The LPGA Tour data give, for each of the 30 players, the dependent variable Scoring Avg. and four candidate predictors: Drive Average, Greens in Reg., Putting Avg., and Sand Saves. The fifth candidate, DriveGreens, represents the interaction between Drive Average and Greens in Reg.; to create it, multiply these two statistics together for each player.

For instance, Wendy Ward's DriveGreens value is 246.7 (Drive Average) times 0.707 (Greens in Reg.), which is 174.4169 and rounds to the 174.417 shown in the table. This interaction term can then be offered to the regression model along with the other variables in the table (Drive Average, Greens in Reg., Putting Avg., and Sand Saves) to see which of them help explain Scoring Avg.
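The arithmetic for the interaction term can be checked in a few lines of Python. This is only a sketch of the calculation described above; the variable names are illustrative and not part of the original exercise.

```python
# Values for Wendy Ward, copied from the table in the question.
drive_average = 246.7   # average drive length in yards
greens_in_reg = 0.707   # proportion of greens hit in regulation

# The interaction term is the product of the two predictors.
drive_greens = drive_average * greens_in_reg

print(round(drive_greens, 3))  # 174.417, matching the DriveGreens column
```

The same product, applied row by row, reproduces the whole DriveGreens column.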

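To derive the best estimated multiple regression equation, run the three procedures in statistical software such as Minitab, with Scoring Avg. as the dependent variable and Drive Average, Greens in Reg., Putting Avg., Sand Saves, and DriveGreens as the candidate predictors. Forward selection starts with no predictors and adds one variable at a time; backward elimination starts with all five and removes one at a time; stepwise regression adds variables one at a time but rechecks the variables already in the model for removal after each addition. Comparing the variables each procedure retains shows whether the three estimated regression equations are the same, which answers part 2.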
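For readers without Minitab, the three procedures can be sketched in plain Python using the table from the question. This is an illustration under stated assumptions, not a reproduction of Minitab's output: it uses a fixed partial-F cutoff of 4.0 for both entering and removing a variable (a commonly used cutoff, roughly comparable to a 0.05 significance level at about 25 error degrees of freedom), whereas a statistics package may apply alpha-based rules and could select different variables. The function names (`sse`, `partial_f`, `try_add`, `try_remove`) are hypothetical helpers written for this example.

```python
import math

# Rows copied from the table in the question, in table order:
# (Scoring Avg., Drive Average, Greens in Reg., Putting Avg., Sand Saves)
ROWS = [
    (69.33, 263, 0.772, 1.75, 0.595), (70.98, 248.6, 0.727, 1.75, 0.468),
    (70.86, 255.5, 0.722, 1.76, 0.362), (71.39, 261.7, 0.697, 1.75, 0.31),
    (71.17, 244.8, 0.71, 1.79, 0.485), (71.24, 252.9, 0.709, 1.78, 0.343),
    (72.32, 238.2, 0.686, 1.82, 0.422), (71.31, 241.7, 0.707, 1.78, 0.444),
    (71.43, 242, 0.7, 1.79, 0.426), (71.46, 251.4, 0.696, 1.78, 0.443),
    (71.52, 247.7, 0.702, 1.85, 0.393), (71.92, 251.1, 0.684, 1.79, 0.446),
    (73.16, 240, 0.679, 1.86, 0.386), (71.8, 241.2, 0.631, 1.77, 0.581),
    (72.28, 245.7, 0.718, 1.84, 0.475), (71.46, 258.3, 0.742, 1.82, 0.455),
    (72.14, 246.7, 0.707, 1.81, 0.413), (71.47, 259.3, 0.709, 1.77, 0.36),
    (71.66, 254, 0.718, 1.82, 0.307), (71.58, 230.9, 0.662, 1.8, 0.435),
    (71.59, 250.2, 0.699, 1.79, 0.408), (71.47, 249.1, 0.679, 1.81, 0.322),
    (71.65, 237.4, 0.674, 1.8, 0.25), (71.33, 251.2, 0.701, 1.79, 0.375),
    (71.59, 244.6, 0.686, 1.81, 0.386), (71.85, 252.1, 0.705, 1.81, 0.417),
    (71.52, 256.2, 0.709, 1.81, 0.353), (72.59, 269.2, 0.651, 1.81, 0.389),
    (71.7, 250.7, 0.678, 1.79, 0.292), (72.13, 244, 0.728, 1.76, 0.222),
]
y = [r[0] for r in ROWS]
X = {
    "Drive Average": [r[1] for r in ROWS],
    "Greens in Reg.": [r[2] for r in ROWS],
    "Putting Avg.": [r[3] for r in ROWS],
    "Sand Saves": [r[4] for r in ROWS],
    "DriveGreens": [r[1] * r[2] for r in ROWS],  # interaction term
}
NAMES = list(X)
# ASSUMPTION: fixed F cutoffs stand in for the alpha-to-enter and
# alpha-to-remove rules that a statistics package would apply.
F_ENTER = F_REMOVE = 4.0

def sse(cols):
    """Error sum of squares for least squares of y on an intercept plus cols."""
    n, ybar = len(y), sum(y) / len(y)
    resid = [v - ybar for v in y]           # centering handles the intercept
    basis = []
    for c in cols:
        mean = sum(X[c]) / n
        z = [v - mean for v in X[c]]
        for q in basis:                      # Gram-Schmidt against earlier columns
            dot = sum(a * b for a, b in zip(z, q))
            z = [a - dot * b for a, b in zip(z, q)]
        norm = math.sqrt(sum(a * a for a in z))
        q = [a / norm for a in z]
        basis.append(q)
        dot = sum(a * b for a, b in zip(resid, q))
        resid = [a - dot * b for a, b in zip(resid, q)]
    return sum(e * e for e in resid)

def partial_f(small, big):
    """Partial F for the one variable that `big` has and `small` lacks."""
    sse_big = sse(big)
    return (sse(small) - sse_big) / (sse_big / (len(y) - len(big) - 1))

def try_add(model):
    """Add the strongest outside variable if its partial F reaches F_ENTER."""
    f = {v: partial_f(model, model + [v]) for v in NAMES if v not in model}
    if not f or max(f.values()) < F_ENTER:
        return False
    model.append(max(f, key=f.get))
    return True

def try_remove(model):
    """Drop the weakest variable if its partial F is below F_REMOVE."""
    f = {v: partial_f([u for u in model if u != v], model) for v in model}
    if not f or min(f.values()) >= F_REMOVE:
        return False
    model.remove(min(f, key=f.get))
    return True

def forward():
    model = []
    while try_add(model):
        pass
    return model

def backward():
    model = list(NAMES)
    while try_remove(model):
        pass
    return model

def stepwise():
    model = []
    for _ in range(4 * len(NAMES)):   # iteration cap as a safety guard
        if not try_add(model):
            break
        while try_remove(model):      # recheck variables already in the model
            pass
    return model

for name, procedure in [("forward", forward), ("backward", backward), ("stepwise", stepwise)]:
    print(name, procedure())
```

Each procedure returns the list of variables it retains; fitting Scoring Avg. on that list gives the corresponding estimated regression equation. Changing `F_ENTER` and `F_REMOVE` shows how sensitive the selected model is to the cutoff.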

by User Katzmopolitan (7.3k points)