When dealing with the distribution of data, it is important to consider the shape of the distribution to determine the most appropriate descriptive statistics to summarize the data. For the distribution of the population of cities in the US, which is said to be skewed to the right (positively skewed), certain statistical measures will provide a more accurate and informative representation of the data’s center and spread than others. Let's evaluate the options:
A) The range gives us the difference between the largest and smallest values in the dataset. While it provides some information about the spread of the data, it doesn't tell us anything about the center, and it can be heavily influenced by outliers, which makes it less useful for skewed distributions.
B) The 5-number summary includes the minimum value, the first quartile (Q1), the median, the third quartile (Q3), and the maximum value. Together these five numbers describe both the center and the spread of the data. The median is a measure of center that is unaffected by how extreme the outliers are, and the quartiles describe the spread of the middle half of the data. Unequal gaps between the five numbers (for example, a much larger distance from Q3 to the maximum than from the minimum to Q1) reveal the right skew.
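As a rough illustration of option B, the sketch below computes a 5-number summary with Python's standard library. The population figures are made up for the example (they are not real US city data), the helper name five_number_summary is hypothetical, and the "inclusive" quartile convention is just one of several in common use, so Q1 and Q3 may differ slightly under another convention.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of at least two numbers."""
    ordered = sorted(values)
    # method="inclusive" is an assumption made for this sketch; other
    # quartile conventions give slightly different Q1 and Q3.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]

# Made-up city populations in thousands (illustrative only, not real data).
populations = [12, 15, 18, 22, 25, 31, 40, 58, 95, 210, 850]

print(five_number_summary(populations))
# -> (12, 20.0, 31.0, 76.5, 850)
```

In this made-up sample the distance from Q3 to the maximum (773.5) is far larger than the distance from the minimum to Q1 (8), which is how the summary exposes a long right tail.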
C) The mean and standard deviation (SD) are measures that are highly sensitive to skewness and outliers. While the mean gives the arithmetic average of the data, the SD provides a measure of how much the data varies from the mean. In a skewed distribution, the mean can be pulled in the direction of the tail, and the SD can be inflated by extreme values, giving a misleading picture of the distribution.
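The sensitivity described in option C can be sketched with a small experiment. The figures below are invented for illustration, the helper name center_and_spread is hypothetical, and the interquartile range (Q3 minus Q1, using the "inclusive" quartile convention) stands in for a resistant measure of spread. The second list is identical to the first except that its largest value is made ten times larger.

```python
import statistics

def center_and_spread(values):
    """Return the mean, sample SD, median, and IQR of a list of numbers."""
    # The quartile convention is an assumption made for this sketch.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),
        "median": statistics.median(values),
        "iqr": q3 - q1,
    }

# Made-up city populations in thousands (illustrative only, not real data).
base = [12, 15, 18, 22, 25, 31, 40, 58, 95, 210, 850]
# Same data, but with the largest city made ten times larger.
stretched = base[:-1] + [8500]

for label, data in (("base", base), ("stretched", stretched)):
    print(label, center_and_spread(data))
```

Under these assumptions the median and IQR are identical for both lists, while the mean and SD grow sharply when the single largest value is stretched.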
D) The mean, median, and mode are all measures of central tendency. In a right-skewed distribution they typically differ from each other, with the mean pulled toward the long right tail, the median below the mean, and the mode at the peak of the distribution. Reporting all three says nothing about the spread of the data, so this option cannot summarize both center and spread.
Given the explanation above, the best summary of the data's center and spread for a right-skewed distribution like that of the populations of cities in the US is:
B) the 5-number summary.
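The 5-number summary is the most informative choice because its median and quartiles are resistant to skewness and outliers, while its minimum and maximum still show the full extent of the data. Together they give a more accurate picture of a skewed distribution than the mean and SD.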