174,734 views
26 votes
Which histogram represents the data set with the smallest standard deviation? Each histogram has x-ring hits on the x-axis and frequency on the y-axis.

Squad 1: 0 to 1, 2; 1 to 2, 1; 2 to 3, 3; 3 to 4, 6; 4 to 5, 8.
Squad 2: 0 to 1, 8; 1 to 2, 5; 2 to 3, 4; 3 to 4, 2; 4 to 5, 1.
Squad 3: 0 to 1, 1; 1 to 2, 5; 2 to 3, 7; 3 to 4, 5; 4 to 5, 2.

User SurenSaluka
by
3.1k points

1 Answer

22 votes

Answer:

Squad 3

Explanation:

The standard deviation of a population is the square root of the variance. The variance is the mean of the squares minus the square of the mean. A histogram does not give the individual data values, so we use a representative value (rep value) to stand in for all of the data values that fall in a given bin. Here the rep value is the bin midpoint: 0.5 for "0 to 1", 1.5 for "1 to 2", and so on.

__

The attached spreadsheet does the calculation just described. It shows that Squad 3 has the smallest standard deviation of x-ring hits.

__

example calculation

Here's an example of the calculation of the sum of squares. (Squad 1 data)

∑squares = 2(0.5^2) +1(1.5^2) +3(2.5^2) +6(3.5^2) +8(4.5^2)

= 0.50 +2.25 +18.75 +73.50 +162.00 = 257.00

Similarly, the sum of values is ...

∑values = 2(0.5) +1(1.5) +3(2.5) +6(3.5) +8(4.5)

= 1.0 +1.5 +7.5 +21.0 +36.0 = 67.0

Then the variance is ...

(257/20) -(67/20)^2 = 12.85 -11.2225 = 1.6275

Division by 20 gives the mean in each case, since there are a total of 20 values in each of the sets of data.

The standard deviation is ...

√1.6275 ≈ 1.2757 ≈ 1.28 . . . . . for Squad 1
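The same calculation can be repeated for all three squads. The following is a minimal Python sketch of it, not the attached spreadsheet. It assumes the bin midpoints as rep values, and the names `squads`, `rep_values`, and `population_sd` are made up for this illustration.

```python
import math

# Frequencies per bin (0-1, 1-2, 2-3, 3-4, 4-5), read off the three histograms.
squads = {
    "Squad 1": [2, 1, 3, 6, 8],
    "Squad 2": [8, 5, 4, 2, 1],
    "Squad 3": [1, 5, 7, 5, 2],
}

# Assumption: each bin is represented by its midpoint.
rep_values = [0.5, 1.5, 2.5, 3.5, 4.5]


def population_sd(freqs, reps):
    """Population SD from binned data: sqrt(mean square - square of mean)."""
    n = sum(freqs)
    mean = sum(f * x for f, x in zip(freqs, reps)) / n
    mean_square = sum(f * x * x for f, x in zip(freqs, reps)) / n
    return math.sqrt(mean_square - mean ** 2)


sds = {name: population_sd(freqs, rep_values) for name, freqs in squads.items()}
for name, sd in sds.items():
    print(f"{name}: {sd:.4f}")

smallest = min(sds, key=sds.get)
print("Smallest standard deviation:", smallest)
```

Under these assumptions the three values come out to roughly 1.28, 1.19, and 1.04, so Squad 3 is the smallest.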

_____

Additional comments

Note that we have computed the population standard deviation, assuming the values given represent the entire population of data values. For the purpose here, that distinction is irrelevant. The ratio of "population" standard deviation to "sample" standard deviation is √((n-1)/n), where the sample size is n. Since every squad has n = 20 values, the ratio is the same for all three, and the determination of "smallest" is not affected.
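As a quick check of that ratio, here is a small Python sketch using the standard library's `statistics.pstdev` (population) and `statistics.stdev` (sample) on the Squad 1 data. Expanding the histogram into a list of midpoint values is an assumption of this illustration, and the variable names are hypothetical.

```python
import math
import statistics

freqs = [2, 1, 3, 6, 8]            # Squad 1 frequencies per bin
reps = [0.5, 1.5, 2.5, 3.5, 4.5]   # assumed rep values (bin midpoints)

# Expand the histogram into one rep value per counted data point.
data = [x for f, x in zip(freqs, reps) for _ in range(f)]
n = len(data)

pop_sd = statistics.pstdev(data)     # divides by n
sample_sd = statistics.stdev(data)   # divides by n - 1

ratio = pop_sd / sample_sd
expected = math.sqrt((n - 1) / n)
print(n, ratio, expected)
```

The two printed ratios should agree up to floating-point rounding, and the same factor applies to any data set with 20 values.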

__

Ordinarily a histogram would be designed with bin limits that allow for unambiguous determination of the bin a data value should be assigned to. Here, we suspect that "x-ring hits" will be an integer from 0 to 5, so the identification of bins as "0 to 1" and "1 to 2" means the destination of data value 1 is ambiguous. Clearly, using a non-integer to represent the values in the bin means they are not represented exactly.

In defense of that rep value decision, adding or subtracting a constant to the values of a data set changes the mean, but not the standard deviation. In other words, the actual "rep value" is unimportant, as long as it has a consistent relationship to the data values in the bin.
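To illustrate that shift invariance, here is a hedged Python sketch that recomputes the Squad 3 standard deviation with three different rep value choices: the left edge, the midpoint, and the right edge of each bin. The helper `sd_with_offset` is hypothetical and exists only for this example.

```python
import math
import statistics

freqs = [1, 5, 7, 5, 2]  # Squad 3 frequencies for bins starting at 0, 1, 2, 3, 4


def sd_with_offset(offset):
    """Population SD when each bin k..k+1 is represented by k + offset."""
    data = [k + offset for k, f in enumerate(freqs) for _ in range(f)]
    return statistics.pstdev(data)


sd_left = sd_with_offset(0.0)    # left bin edge as rep value
sd_mid = sd_with_offset(0.5)     # bin midpoint as rep value
sd_right = sd_with_offset(1.0)   # right bin edge as rep value
print(sd_left, sd_mid, sd_right)
```

All three results should match up to rounding, since the offset only moves the mean.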

__

For small data sets with a relatively large ratio between largest and smallest values, the mode may not be a very good representation of the mean. However, we note that Squads 1 and 2 have distributions that are skewed left and right, respectively, in relation to the mode. On the other hand, Squad 3 has a fairly symmetrical distribution about the mode, which is centered in the range. This gives a qualitative suggestion that Squad 3 will have a smaller standard deviation than the others.

User Neri Barakat
by
3.0k points