Final answer:
To screen for outliers visually, draw a scatter plot and add lines two standard deviations of the residuals above and below the best-fit line; numerically, check whether any residual is more than twice that standard deviation in absolute value. If a potential outlier is found, it must be examined closely to determine whether it is an error to be corrected or a valid point that should remain. The decision affects the regression line, the correlation coefficient, and the model fit.
Step-by-step explanation:
When screening for outliers, the goal is to identify data points that differ markedly from the rest, which could indicate an error, an abnormality, or a crucial part of the population under study. Visually, plot the data on a scatter plot and add lines two standard deviations of the residuals above and below the best-fit line; any point outside these lines is a potential outlier. Numerically, compute each residual (the observed y-value minus the y-value predicted by the line) and flag any point whose residual is larger in absolute value than twice the standard deviation of the residuals.
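The numerical check above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data points are made up, the function names are hypothetical, and the standard deviation of the residuals is taken as s = sqrt(SSE / (n - 2)), which is a common textbook convention rather than something given in the question.

```python
import math

def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def flag_outliers(xs, ys, k=2.0):
    """Return the points whose residual exceeds k standard deviations."""
    slope, intercept = fit_line(xs, ys)
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    sse = sum(r ** 2 for r in residuals)
    # Assumption: residual standard deviation uses n - 2 degrees of freedom.
    s = math.sqrt(sse / (len(xs) - 2))
    return [(x, y) for x, y, r in zip(xs, ys, residuals) if abs(r) > k * s]

# Hypothetical data: y = 2x everywhere except at x = 7.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [2, 4, 6, 8, 10, 12, 30, 16, 18, 20]
print(flag_outliers(xs, ys))
```

With this made-up data set, only the point (7, 30) lies more than 2s from the fitted line. A flagged point is a candidate for closer inspection, not an automatic deletion.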
Once a potential outlier is identified, its cause should be investigated. It could be a data entry error or a true data point that reflects a rare event. If it is an error, it should be corrected or removed. If it is a valid data point, it should generally remain, because it carries real information about the variability in the data. The decision to keep or remove an outlier can change the regression results significantly, as shown by changes in the correlation coefficient and in the fit of the regression line.
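To see how much one point can move the correlation coefficient, the sketch below computes r with and without a suspect point. The data are hypothetical (y = 2x except for one value) and pearson_r is an illustrative helper, not a function from the question. It demonstrates sensitivity only; it is not a recommendation to drop the point.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: the point (7, 30) is the suspect value.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [2, 4, 6, 8, 10, 12, 30, 16, 18, 20]
suspect = 6  # list index of the point (7, 30)

# Copies of the data with the suspect point left out.
xs_trim = xs[:suspect] + xs[suspect + 1:]
ys_trim = ys[:suspect] + ys[suspect + 1:]

r_with = pearson_r(xs, ys)
r_without = pearson_r(xs_trim, ys_trim)
print("r with the point:   ", round(r_with, 3))
print("r without the point:", round(r_without, 3))
```

For this data, r is roughly 0.81 with the suspect point and essentially 1 without it, so a single value accounts for almost all of the apparent scatter.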
For a single variable, such as the set of numbers 3, 4, 5, 7, and 9, we can screen for outliers using the Interquartile Range (IQR) or, when the data are mound-shaped and symmetric, by checking whether any value is more than two standard deviations from the mean. Calculators such as the TI-83, 83+, or 84+ simplify this process by allowing outliers to be identified graphically.
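Both single-variable checks can be tried on the numbers 3, 4, 5, 7, and 9 with the Python sketch below. It rests on a few assumptions not stated above: quartiles are taken as the medians of the lower and upper halves of the sorted data (excluding the middle value when the count is odd), the IQR fences use the conventional 1.5 multiplier, and the standard deviation is the sample standard deviation. The function names are illustrative.

```python
import statistics

def quartiles(values):
    """Q1 and Q3 as medians of the lower and upper halves of the data."""
    data = sorted(values)
    half = len(data) // 2          # the middle value is skipped when n is odd
    return statistics.median(data[:half]), statistics.median(data[-half:])

def iqr_outliers(values, k=1.5):
    """Values outside [Q1 - k*IQR, Q3 + k*IQR]; k = 1.5 is an assumption."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

def two_sd_outliers(values):
    """Values more than two sample standard deviations from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mean) > 2 * sd]

data = [3, 4, 5, 7, 9]
print(quartiles(data))        # Q1 = 3.5, Q3 = 8, so IQR = 4.5
print(iqr_outliers(data))     # fences are -3.25 and 14.75
print(two_sd_outliers(data))  # mean is 5.6, sample SD is about 2.41
```

Neither rule flags any of the five values: all of them sit inside the IQR fences and within two standard deviations of the mean.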