You survey students in your class about the number of movies they watched last month. The results are shown below.

7,5,14,5,6,9,10,12,15,4,5,8,11,10,9,2

A new student joins the class who watched 21 movies last month. Is 21 an outlier? How does including this value affect the measures of center and the measures of variation? Explain. Round your answers to the nearest hundredth, if necessary.

User JosephA
by
7.9k points

2 Answers

3 votes

The data set

{7,5,14,5,6,9,10,12,15,4,5,8,11,10,9,2}

sorts to

{2,4,5,5,5,6,7,8,9,9,10,10,11,12,14,15}

There are n = 16 items in that set. Since n is even, the median lies midway between slot n/2 = 16/2 = 8 and slot 9.

The value in slot 8 is 8

The value in slot 9 is 9

The midpoint is (8+9)/2 = 17/2 = 8.5 which is the median.

The mean is found by adding up all of the values and dividing by n = 16.

(2+4+5+5+5+6+7+8+9+9+10+10+11+12+14+15)/16 = 132/16 = 8.25

We have

mean = 8.25

median = 8.5
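If you want to double-check the two measures of center, here is a minimal sketch using Python's standard `statistics` module. The variable names are just illustrative.

```python
from statistics import mean, median

# Original 16 survey responses (movies watched last month)
movies = [7, 5, 14, 5, 6, 9, 10, 12, 15, 4, 5, 8, 11, 10, 9, 2]

center_mean = mean(movies)      # sum of the values divided by n = 16
center_median = median(movies)  # midpoint of the two middle values, since n is even

print(center_mean, center_median)  # prints: 8.25 8.5
```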

mean < median suggests the data is slightly skewed left.

------------------------

Let's break the sorted set into two smaller halves

L = lower half

L = stuff smaller than the median

L = {2,4,5,5,5,6,7,8}

U = upper half

U = stuff larger than the median

U = {9,9,10,10,11,12,14,15}

The median of set L is 5, which is the value of Q1 aka first quartile.

The median of set U is 10.5 aka Q3.

The min and max are the smallest and largest items (2 and 15 currently).

Here's the five-number-summary of the original data set

min = 2

Q1 = 5

median = 8.5 (aka Q2)

Q3 = 10.5

max = 15

Let's compute the interquartile range

IQR = Q3 - Q1 = 10.5 - 5 = 5.5
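The split-into-halves steps above can be sketched in Python. The helper name `five_number_summary` is made up for this example, and it assumes the same convention used here (the median itself is left out of both halves when n is odd). Other quartile conventions can give slightly different Q1 and Q3 values.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]              # the lower half of the sorted values
    upper = data[len(data) - half:]  # the upper half of the sorted values
    return data[0], median(lower), median(data), median(upper), data[-1]

movies = [7, 5, 14, 5, 6, 9, 10, 12, 15, 4, 5, 8, 11, 10, 9, 2]
lo, q1, q2, q3, hi = five_number_summary(movies)
iqr = q3 - q1  # interquartile range

print(lo, q1, q2, q3, hi, iqr)
```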

Then,

LF = lower fence = Q1 - 1.5*IQR = 5 - 1.5*5.5 = -3.25

UF = upper fence = Q3 + 1.5*IQR = 10.5 + 1.5*5.5 = 18.75

Rule: If a value is between LF and UF, then it is not an outlier. Otherwise we have an outlier.

There aren't any values smaller than LF since 2 is the min. There aren't any values larger than UF either since 15 is the max.

Therefore, the original set has no outliers.
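The fence rule is easy to check numerically. This is only a sketch: `fences` is a hypothetical helper, and it assumes Q1 = 5 and Q3 = 10.5 from the five-number summary of the original set.

```python
def fences(q1, q3, k=1.5):
    """Return (lower fence, upper fence) for the k*IQR outlier rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

movies = [7, 5, 14, 5, 6, 9, 10, 12, 15, 4, 5, 8, 11, 10, 9, 2]

lower_fence, upper_fence = fences(5, 10.5)

# Anything strictly outside the fences counts as an outlier
outliers = [x for x in movies if x < lower_fence or x > upper_fence]

print(lower_fence, upper_fence, outliers)
```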

------------------------

What happens when we introduce 21 to the set?

Sorted set = {2,4,5,5,5,6,7,8,9,9,10,10,11,12,14,15,21}

Here's the five-number-summary

min = 2

Q1 = 5

median = 9 (aka Q2)

Q3 = 11.5

max = 21

The IQR this time is 11.5-5 = 6.5, and,

LF = lower fence = Q1 - 1.5*IQR = 5 - 1.5*6.5 = -4.75

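UF = upper fence = Q3 + 1.5*IQR = 11.5 + 1.5*6.5 = 21.25

In this case, 21 sits just barely to the left of UF, so the 1.5*IQR rule does not flag it as an outlier once it is part of the set. No other value falls outside the fences either. Note that if you instead measure 21 against the original set's upper fence, 10.5 + 1.5*5.5 = 18.75, it lands beyond the fence, so the verdict depends on whether the new value is included when the quartiles are computed.

Either way, the 21 pulls the measures of center to the right. Think of it like a pull of a magnet. The mean goes from 8.25 to 153/17 = 9 and the median goes from 8.5 to 9. Also, the 21 spreads the data out further so the measures of variation increase: the range goes from 15 - 2 = 13 to 21 - 2 = 19 and the IQR goes from 5.5 to 6.5.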
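To see the before and after side by side, here is a rough sketch that recomputes the measures of center and variation with and without the 21. The `summarize` helper is made up for this example and uses the same median-of-halves convention for the quartiles.

```python
from statistics import mean, median

def summarize(values):
    """Mean, median, range, and IQR (median-of-halves quartiles)."""
    data = sorted(values)
    half = len(data) // 2
    q1 = median(data[:half])              # median of the lower half
    q3 = median(data[len(data) - half:])  # median of the upper half
    return {
        "mean": mean(data),
        "median": median(data),
        "range": data[-1] - data[0],
        "iqr": q3 - q1,
    }

original = [7, 5, 14, 5, 6, 9, 10, 12, 15, 4, 5, 8, 11, 10, 9, 2]
before = summarize(original)
after = summarize(original + [21])  # the new student's value included

for name in before:
    print(name, before[name], "->", after[name])
```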

Side note: if you need to calculate the standard deviation, I recommend using a calculator that has that specific function.
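If a calculator isn't handy, Python's `statistics` module has both versions of the standard deviation built in. This sketch rounds to the nearest hundredth as the question asks. Whether your class wants the population version (`pstdev`) or the sample version (`stdev`) is an assumption you'd need to check against your textbook.

```python
from statistics import pstdev, stdev

original = [7, 5, 14, 5, 6, 9, 10, 12, 15, 4, 5, 8, 11, 10, 9, 2]
with_new = original + [21]  # the new student's value included

# Population standard deviation (divide by n)
pop_before = round(pstdev(original), 2)
pop_after = round(pstdev(with_new), 2)

# Sample standard deviation (divide by n - 1)
samp_before = round(stdev(original), 2)
samp_after = round(stdev(with_new), 2)

print("population:", pop_before, "->", pop_after)
print("sample:", samp_before, "->", samp_after)
```

Either version goes up once the 21 is included, which matches the increase in the range and the IQR.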

User Woodvi
by
8.3k points
