4 votes
The file "berkeley.dat" contains average yearly temperatures for the cities of Berkeley and Santa Barbara. Import the data into R using the following commands

    berk <- scan("berkeley.dat", what=list(double(0),double(0),double(0)))
    time <- berk[[1]]
    berkeley <- berk[[2]]
    stbarb <- berk[[3]]

(a) Plot the variables berkeley and stbarb versus time. Also, plot berkeley versus stbarb.

(b) Perform a regression of berkeley on time. What do you think about this fit? Be sure to make diagnostic plots (including ACF) of the residuals. If there are any violations of the assumptions for a linear regression model, make sure to comment on them.

(c) Perform a regression of berkeley on stbarb. Comment on the fit and the residuals.

(d) Make a time series plot of the variable berkeley and an ACF plot of the data. Does the time series appear to be stationary? Explain. Interpret the ACF plot in this situation.

(e) Difference the data. The command diff(berkeley) will produce the differenced data. Plot this differenced data, and make an ACF plot. What is your opinion of whether the series is stationary after differencing?

(f) Now, we have detrended this series by using linear regression and with differencing. The result of detrending via regression was a model that fit rather well and residuals that had no apparent dependency. Let us assume then that the true model for this data is

x_t = \beta_1 + \beta_2 t + w_t

where w_t, t = 1, ..., T is Gaussian white noise with variance \sigma_w^2. (This is the same as assuming that this data follows the standard regression assumptions.) Assuming this model, describe the model (i.e. write out a formula) for the differenced time series, \nabla x_t. Use this to explain the apparent dependency in the differenced data from 1e above.

User Roadblock
by
7.1k points

1 Answer

4 votes

Answer:

a. A plot of berkeley vs time and berkeley vs stbarb can be found in the 1st and 2nd image attachments

b. The residual diagnostic plots for the regression of berkeley on time can be found in the 3rd image attachment.

c. The regression of berkeley on stbarb seems like a reasonable fit, as the residuals do not strongly violate the regression assumptions. The ACF plot for the residuals shows a correlation at lag 11 that is larger than expected if the residuals were independent. However, it is not extremely large and is most likely due to random variation. A plot of this regression can be found in the fourth image attachment.

d. A plot of the variable berkeley and an ACF plot of the data can be found as the first diagram in the fifth image attachment. The time series here has an increasing trend, so its mean changes over time and it is not stationary. Because the data is not stationary, the ACF plot is difficult to interpret: it cannot be read as an approximation to a true autocorrelation function.

e. A plot of the differenced data and its corresponding ACF plot can be found as the second diagram in the fifth image attachment.

The data here seems to be fairly stationary after differencing with a fairly constant variance and no discernible trend.

f. The formula for the differenced time series would be

\nabla x_t = x_t - x_{t-1} = \beta_2 + w_t - w_{t-1}

The differenced data has a constant mean, \beta_2. The ACF at lag one is negative and significantly outside the confidence intervals. The other lags show no or weak dependency.
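
The negative lag-one correlation can be derived from the assumed model. The following is a short sketch that uses only the model stated in the question, x_t = \beta_1 + \beta_2 t + w_t with white-noise variance \sigma_w^2; the symbols \gamma(h) and \rho(h) for the autocovariance and autocorrelation of the differenced series are notation introduced here for illustration.

```latex
\begin{aligned}
\nabla x_t &= x_t - x_{t-1} = \beta_2 + w_t - w_{t-1}, \\
E[\nabla x_t] &= \beta_2, \\
\gamma(0) &= \operatorname{Var}(w_t - w_{t-1}) = 2\sigma_w^2, \\
\gamma(1) &= \operatorname{Cov}(w_t - w_{t-1},\; w_{t-1} - w_{t-2}) = -\sigma_w^2, \\
\gamma(h) &= 0 \quad \text{for } |h| \ge 2, \\
\rho(1) &= \frac{\gamma(1)}{\gamma(0)} = -\frac{1}{2}.
\end{aligned}
```

Under these assumptions the differenced series is stationary, with a theoretical lag-one autocorrelation of -1/2 and zero autocorrelation at all higher lags.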

Explanation:

I have attached images of this solution showing the variables plotted against each other and the diagnostic plots.

a. The plots of berkeley vs time and berkeley vs stbarb are contained in the first and second image attachments.
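
For readers who want to reproduce plots like these, here is a minimal R sketch. It uses the import commands from the question when "berkeley.dat" is available. Otherwise it falls back to simulated stand-in values, which are purely hypothetical and are not the real temperatures; the 104-point length is inferred from the 102 residual degrees of freedom reported in this answer.

```r
# Load the data as in the question; fall back to simulated stand-in values
# (hypothetical, NOT the real temperatures) if the file is not available.
if (file.exists("berkeley.dat")) {
  berk <- scan("berkeley.dat", what = list(double(0), double(0), double(0)))
  time <- berk[[1]]
  berkeley <- berk[[2]]
  stbarb <- berk[[3]]
} else {
  set.seed(1)
  time <- 1:104                                         # assumed time index
  berkeley <- 13 + 0.01 * time + rnorm(104, sd = 0.45)  # made-up values
  stbarb <- 14 + 0.01 * time + rnorm(104, sd = 0.60)    # made-up values
}

# Part (a): each series against time, then one series against the other.
par(mfrow = c(1, 3))
plot(time, berkeley, type = "l", main = "berkeley vs time")
plot(time, stbarb, type = "l", main = "stbarb vs time")
plot(stbarb, berkeley, main = "berkeley vs stbarb")
```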

b. The residual diagnostics plots of a regression of berkeley on time(including the ACF) are contained in the third image attachment.

Inferences can be made thus:

Residual standard error: 0.4539 on 102 degrees of freedom

Multiple R-Squared: 0.4015

Adjusted R-Squared: 0.3956

F-Statistic: 68.43 on 1 and 102 degrees of freedom

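The F test therefore indicates a statistically significant linear relationship between berkeley and time, although the R-squared of about 0.40 means the trend explains only about 40% of the variance.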
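
A regression summary and residual diagnostics of this kind could be produced along the following lines. This is only a sketch: the series is simulated with hypothetical intercept, slope, and noise level (none of them taken from the real data), so the printed numbers will not match the values reported above.

```r
# Simulated stand-in for the berkeley series (hypothetical parameters).
set.seed(42)
n <- 104                                   # 102 residual df + 2 coefficients
time <- 1:n                                # assumed time index
berkeley <- 13 + 0.012 * time + rnorm(n, sd = 0.45)

# Part (b): regress berkeley on time and inspect the fit.
fit <- lm(berkeley ~ time)
print(summary(fit))                        # coefficients, R-squared, F statistic

# Residual diagnostics, including the ACF of the residuals.
res <- residuals(fit)
par(mfrow = c(2, 2))
plot(fitted(fit), res, main = "Residuals vs fitted")
abline(h = 0, lty = 2)
qqnorm(res)
qqline(res)
plot(time, res, type = "l", main = "Residuals over time")
acf(res, main = "ACF of residuals")
```

The same pattern applies to part (c) by replacing the formula with berkeley ~ stbarb.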

c. The residual diagnostics plots of a regression of berkeley on stbarb(including the ACF) are contained in the fourth image attachment.

Inferences can be made thus:

Residual standard error: 0.5221 on 102 degrees of freedom

Multiple R-Squared: 0.2079

Adjusted R-Squared: 0.2001

F-Statistic: 26.77 on 1 and 102 degrees of freedom

The residuals here do not strongly violate the regression assumptions. The ACF plot for the residuals shows a correlation at lag 11 that is larger than expected if the residuals were independent. However, it is not extremely large and is most likely due to random variation.

d. A plot of the variable berkeley and an ACF plot of the data are contained as the first diagram in the fifth image attachment.

The time series here has an increasing trend, so its mean changes over time and it is not stationary. Because the data is not stationary, the ACF plot is difficult to interpret: it cannot be read as an approximation to a true autocorrelation function.

e. A plot of the differenced data and its corresponding ACF plot can be found as the second diagram in the fifth image attachment.

The data here seems to be fairly stationary after differencing with a fairly constant variance and no discernible trend.
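
The comparison in parts (d) and (e) can be sketched in R as follows. The series below is simulated with a hypothetical linear trend and noise level, as a stand-in for the real berkeley data, so the plots only illustrate the general pattern.

```r
# Hypothetical trending series standing in for berkeley.
set.seed(7)
n <- 104
x <- 13 + 0.012 * (1:n) + rnorm(n, sd = 0.45)
dx <- diff(x)                       # first difference, length n - 1

# Top row: original series and its ACF; bottom row: after differencing.
par(mfrow = c(2, 2))
plot.ts(x, main = "Original series")
acf(x, main = "ACF, original")
plot.ts(dx, main = "Differenced series")
acf(dx, main = "ACF, differenced")
```

Differencing removes the linear trend, so the differenced series fluctuates around a constant level.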

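f. The model after the differencing would be

\nabla x_t = x_t - x_{t-1} = \beta_2 + w_t - w_{t-1}

The intercept \beta_1 cancels and only the slope \beta_2 remains, so the differenced data has a constant mean. Consecutive differences share the noise term w_{t-1} with opposite signs (it enters \nabla x_t as -w_{t-1} and \nabla x_{t-1} as +w_{t-1}), which makes them negatively correlated. This corresponds to the ACF plot in the previous question: the ACF at lag one is negative and significantly outside the confidence intervals, while the other lags show no or weak dependency.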
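
A quick simulation can check this behaviour. The sketch below generates data from the model assumed in part (f), using hypothetical values for \beta_1, \beta_2, and \sigma_w and a long series so that the sample ACF is stable, then looks at the ACF of the differences.

```r
# Simulate x_t = beta1 + beta2 * t + w_t with Gaussian white noise w_t.
# All parameter values are hypothetical, chosen only for illustration.
set.seed(123)
T_len <- 20000
beta1 <- 13
beta2 <- 0.012
sigma_w <- 0.45
x <- beta1 + beta2 * (1:T_len) + rnorm(T_len, sd = sigma_w)

# Difference the series and compute its sample autocorrelations.
dx <- diff(x)
rho <- as.numeric(acf(dx, lag.max = 5, plot = FALSE)$acf)

# rho[1] is lag 0 (always 1), rho[2] is lag 1, rho[3] is lag 2, and so on.
print(round(rho, 3))
```

With a long simulated series, the lag-one value should sit near -0.5 and the higher lags near zero, matching a negative spike at lag one in the ACF of the differenced data.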

[Image attachments 1 to 5: the plots for parts (a) through (e) referenced above]
User Gyrolf
by
7.0k points