Final answer:
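In a least-squares linear regression that includes an intercept, the mean of the residuals is exactly zero (up to rounding error), so the mean alone says little about their distribution. The median supplies the extra information: a median below zero suggests right (positive) skew, and a median above zero suggests left (negative) skew. The gap between the mean and the median is therefore a rough indicator of the direction and degree of skewness, which affects how measures of central tendency and later statistical analyses should be interpreted.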
Step-by-step explanation:
The mean of the residuals in a linear regression fitted by ordinary least squares is always zero when the model includes an intercept term. Residuals are the differences between the observed and predicted values, and the least-squares condition for the intercept forces them to sum to zero, so the positive and negative residuals balance around the line of best fit. The median of the residuals, which is the middle value when the residuals are sorted, is not constrained in this way and can therefore provide insight into the skewness of their distribution.
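As an illustration of the point above, here is a minimal sketch in Python using NumPy. The data are made up for this example: the points lie on the line y = 1 + 2x, shifted down by 1 everywhere except for one large positive error in the middle. The variable names are arbitrary, not part of the original question.

```python
import numpy as np

# Hypothetical data: y = 1 + 2x plus errors [-1, -1, -1, 6, -1, -1, -1],
# i.e. one large positive error, so the residuals are right-skewed.
x = np.arange(7, dtype=float)
errors = np.array([-1.0, -1.0, -1.0, 6.0, -1.0, -1.0, -1.0])
y = 1.0 + 2.0 * x + errors

# Design matrix with a column of ones for the intercept term.
X = np.column_stack([np.ones_like(x), x])

# Ordinary least-squares fit: beta[0] is the intercept, beta[1] the slope.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Residuals are observed values minus predicted values.
residuals = y - X @ beta

mean_resid = residuals.mean()         # close to 0 because of the intercept
median_resid = np.median(residuals)   # close to -1 for this data set

print("mean of residuals:  ", mean_resid)
print("median of residuals:", median_resid)
```

The mean comes out at zero apart from floating-point rounding, while the median is negative. A mean above the median is the pattern associated with right skew, which matches the single large positive residual in this data set.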
The difference between the mean and the median indicates the direction and, roughly, the degree of skewness. As a rule of thumb, if the distribution is skewed to the left (negatively skewed), the mean is less than the median; if it is skewed to the right (positively skewed), the mean is greater than the median. The rule holds for most unimodal distributions but is not guaranteed in every case. The median is less affected by extreme values than the mean, so it stays closer to the bulk of the data in a skewed distribution. In symmetric distributions, such as the normal distribution, the mean and median coincide, and when the distribution is also unimodal they coincide with the mode at the peak.
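The rule of thumb can be checked on small samples. The sketch below uses Python's standard `statistics` module; the three samples and the helper name `mean_minus_median` are invented for illustration.

```python
from statistics import mean, median


def mean_minus_median(values):
    """Return mean - median; the sign hints at the direction of skew."""
    return mean(values) - median(values)


# Hypothetical samples chosen to show each case.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]          # long tail to the right
left_skewed = [-20, -4, -3, -3, -3, -2, -2, -1]   # mirror image, tail to the left
symmetric = [1, 2, 3, 4, 5]

print(mean_minus_median(right_skewed))  # positive: mean 4.75 > median 3
print(mean_minus_median(left_skewed))   # negative: mean -4.75 < median -3
print(mean_minus_median(symmetric))     # zero: mean and median are both 3
```

In each skewed sample a single extreme value moves the mean by 1.75 while leaving the median untouched, which is why the median is the more robust of the two measures.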
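Understanding skewness matters because it changes how the measures of central tendency should be read: extreme values in the long tail pull the mean away from the bulk of the data, so the mean may no longer describe a typical observation. Identifying skewness in the residuals is also important because many statistical tests and confidence intervals assume roughly symmetric, often normal, errors, and marked skewness can make their results less reliable and harder to generalize to broader contexts.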