1 vote
In this problem, you will use support vector approaches in order to predict whether a given car gets high or low gas mileage based on the Auto data set.

1. Create a binary variable that takes on a 1 for cars with gas mileage above the median, and a 0 for cars with gas mileage below the median.

2. Fit a support vector classifier to the data with various values of cost, in order to predict whether a car gets high or low gas mileage. Report the cross-validation errors associated with different values of this parameter. Comment on your results.

3. Now repeat 2., this time using SVMs with radial and polynomial basis kernels, with different values of gamma and degree and cost. Comment on your results.

4. Make some plots to back up your assertions in 2. and 3.

Hint: When p > 2, you can use the plot() function to create plots displaying pairs of variables at a time. Essentially, instead of typing plot(svmfit, dat) where svmfit contains your fitted model and dat is a data frame containing your data, you can type plot(svmfit, dat, x1~x4) in order to plot just the first and fourth variables. However, you must replace x1 and x4 with the correct variable names. To find out more, type ? .

by User Xeevis (8.3k points)

1 Answer

3 votes

Final answer:

To solve this problem, you need to create a binary variable based on the gas mileage, fit a support vector classifier with different values of cost, and analyze the cross-validation errors. Then, repeat the process using SVMs with radial and polynomial basis kernels. Visualize the findings through plots.

Step-by-step explanation:

1. Compute the median gas mileage, then create a binary variable that is 1 for cars with gas mileage above the median and 0 for cars with gas mileage below it. Store it as a factor so that the SVM is fitted as a classifier rather than a regression.

2. Fit a support vector classifier (an SVM with a linear kernel) to the data for several values of the cost parameter. Compute the cross-validation error for each value; the best cost is the one with the lowest cross-validation error. Comment on how the error changes as cost increases.

3. Repeat step 2 using support vector machines (SVMs) with radial and polynomial basis kernels. Vary gamma for the radial kernel and degree for the polynomial kernel, together with cost in both cases. Compare the lowest cross-validation error of each kernel with the linear result from step 2.

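4. Create plots to support your findings from steps 2 and 3. Because the data have more than two predictors, pass a formula to plot(), as in plot(svmfit, dat, x1~x4) with x1 and x4 replaced by actual variable names, so that each plot shows the fitted decision regions for one pair of variables.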
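
The four steps can be sketched in R as follows. This is a minimal sketch under stated assumptions, not a definitive solution. It assumes the Auto data come from the ISLR package and that the SVM functions come from the e1071 package, neither of which is named in the question. The cost, gamma and degree grids, the decision to drop mpg and name from the predictors, and the pair of variables plotted are all illustrative choices.

```r
# Assumptions: Auto is taken from the ISLR package; svm() and tune() are from e1071.
library(ISLR)
library(e1071)

set.seed(1)  # cross-validation folds are drawn at random

# Step 1: binary response, 1 if mpg is above the median, 0 otherwise.
dat <- Auto
dat$mpg01 <- as.factor(ifelse(dat$mpg > median(dat$mpg), 1, 0))
dat$mpg <- NULL   # assumption: drop mpg, since mpg01 is derived from it
dat$name <- NULL  # assumption: drop the car name, which is an identifier

# Step 2: linear kernel; tune() runs 10-fold cross-validation by default.
tune_linear <- tune(svm, mpg01 ~ ., data = dat, kernel = "linear",
                    ranges = list(cost = c(0.01, 0.1, 1, 10, 100)))
summary(tune_linear)  # cross-validation error for each value of cost

# Step 3: radial kernel over cost and gamma, polynomial kernel over cost and degree.
tune_radial <- tune(svm, mpg01 ~ ., data = dat, kernel = "radial",
                    ranges = list(cost = c(0.1, 1, 10, 100),
                                  gamma = c(0.01, 0.1, 1, 10)))
tune_poly <- tune(svm, mpg01 ~ ., data = dat, kernel = "polynomial",
                  ranges = list(cost = c(0.1, 1, 10, 100),
                                degree = c(2, 3, 4)))

# Lowest cross-validation error found for each kernel.
cv_errors <- c(linear = tune_linear$best.performance,
               radial = tune_radial$best.performance,
               polynomial = tune_poly$best.performance)
print(cv_errors)

# Step 4: refit with the best linear cost and plot one pair of predictors.
svm_linear <- svm(mpg01 ~ ., data = dat, kernel = "linear",
                  cost = tune_linear$best.parameters$cost)
plot(svm_linear, dat, weight ~ horsepower)
```

The same plot() call can be repeated for the radial and polynomial fits and for other pairs of predictors. Predictors that are not on the axes are held at a fixed value (0 by default, adjustable through the slice argument of plot.svm), so each plot is only a two-dimensional slice of the decision boundary.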

by User Vahe Tshitoyan (6.8k points)