Answer: D) it increases
=========================================================
Step-by-step explanation:
Let's add up the values we're given to get
22 + 20 + 23 + 2 + 21 + 25 + 31 + 28 = 172
Then divide by 8, since there are 8 values in this set, leading to 172/8 = 21.5
The mean of the original set is 21.5
We'll come back to this later.
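As a quick check of the arithmetic above, here is a minimal Python sketch. The variable names are only illustrative and are not part of the original problem.

```python
# Original data set from the problem
values = [22, 20, 23, 2, 21, 25, 31, 28]

total = sum(values)            # sum of all eight values
count = len(values)            # how many values there are
original_mean = total / count  # mean = sum / count

print(total, count, original_mean)  # 172 8 21.5
```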
------------------
If you sort the values of {22, 20, 23, 2, 21, 25, 31, 28} you would get the list {2, 20, 21, 22, 23, 25, 28, 31}
Cross off the first and last number to reduce things to this smaller set {20, 21, 22, 23, 25, 28}
Cross off the first and last number again, and we now have {21, 22, 23, 25}
Do that one more time to be left with {22, 23}
The median is the value at the midpoint of 22 and 23. Add these up and divide by two: (22+23)/2 = 45/2 = 22.5
The median is 22.5
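The cross-off procedure can be sketched in Python as shown here. The function name median_by_crossing_off is made up for this illustration, and the sketch assumes the list is not empty.

```python
def median_by_crossing_off(data):
    """Find the median by repeatedly dropping the smallest and largest value."""
    remaining = sorted(data)
    # Stop once one value (odd count) or two values (even count) remain
    while len(remaining) > 2:
        remaining = remaining[1:-1]  # cross off the first and last number
    # Average what is left: one value stays as is, two values give the midpoint
    return sum(remaining) / len(remaining)

values = [22, 20, 23, 2, 21, 25, 31, 28]
median = median_by_crossing_off(values)
print(median)  # 22.5
```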
The items below the median form the lower set L = {2, 20, 21, 22}, while the items above the median form the upper set U = {23, 25, 28, 31}.
The median of set L is (20+21)/2 = 20.5 and the median of set U is (25+28)/2 = 26.5
Therefore Q1 and Q3 are 20.5 and 26.5 respectively
Q1 = 20.5
Q3 = 26.5
Use these two values to get the interquartile range (IQR)
IQR = Q3 - Q1
IQR = 26.5 - 20.5
IQR = 6
Then we can compute the lower fence
lower fence = Q1 - 1.5*IQR
lower fence = 20.5 - 1.5*6
lower fence = 11.5
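We can see the data value "2" is smaller than the lower fence 11.5; anything below the lower fence is considered an outlier.
For completeness, the upper fence is Q3 + 1.5*IQR = 26.5 + 1.5*6 = 35.5. No data value is larger than 35.5, so there are no outliers on the high end.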
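The quartile and fence steps can be checked with a short Python sketch. It assumes the same convention used above (Q1 and Q3 are the medians of the lower and upper halves). Library quartile functions may use other conventions and can give slightly different Q1 and Q3 values. The upper fence, Q3 + 1.5*IQR, is included as the standard counterpart to the lower fence.

```python
from statistics import median

values = sorted([22, 20, 23, 2, 21, 25, 31, 28])
half = len(values) // 2

# Even count: the list splits cleanly into a lower and an upper half.
# (For an odd count, this sketch would leave the middle value out of both halves.)
lower = values[:half]   # L = [2, 20, 21, 22]
upper = values[-half:]  # U = [23, 25, 28, 31]

q1 = median(lower)
q3 = median(upper)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Anything outside the fences is flagged as an outlier
outliers = [v for v in values if v < lower_fence or v > upper_fence]

print(q1, q3, iqr, lower_fence, upper_fence, outliers)
# 20.5 26.5 6.0 11.5 35.5 [2]
```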
------------------
All of the previous section was to find if we had an outlier, and identify what that outlier is. We found the single outlier is 2.
We could have probably done all of that by simply noticing that 2 is far from the main cluster of other values in the 20s; however, sometimes picking the outlier isn't that straightforward. So it's good practice to use the method described above.
Anyway, we'll remove the outlier "2", which leaves us with the set {22, 20, 23, 21, 25, 31, 28}
Add up those values to get
22 + 20 + 23 + 21 + 25 + 31 + 28 = 170
Then divide that by 7, since we tossed out one value, and we get
170/7 = 24.2857 approximately
This is the new mean.
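Here is a small Python sketch of this step, comparing the mean before and after dropping the outlier. The variable names are illustrative only, and the outlier value 2 is taken from the fence test earlier in this answer.

```python
values = [22, 20, 23, 2, 21, 25, 31, 28]
outlier = 2  # the value that fell below the lower fence

# Keep everything except the outlier
trimmed = [v for v in values if v != outlier]

old_mean = sum(values) / len(values)
new_mean = sum(trimmed) / len(trimmed)

print(sum(trimmed), len(trimmed), round(new_mean, 4))  # 170 7 24.2857
print(new_mean > old_mean)  # True
```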
------------------
To summarize:
- The mean of the original set is exactly 21.5
- The mean of the set after we remove the outlier is roughly 24.2857
We can see that the mean has increased. The very small outlier pulls the mean down, so removing it raises the mean. The mean acts as the center of the data, so once the far-away outlier is gone, the center moves toward the main cluster where the other values are located.