Answer:
Explanation:
Hello!
The committee hypothesizes that the average time students spend in the lab is greater than 54 min; symbolically: μ > 54
The study variable is X: time a student spends in the lab (min).
a.
Remember, an outlier is an observation that is significantly distant from the rest of the data set. Outliers usually represent experimental errors (such as a measurement error) or atypical observations. Some statistics, such as the sample mean, are severely affected by this type of value, and their presence tends to cause misleading results in a statistical analysis.
In the given sample, two values are distant from the rest: 7 and 137.
b.
Considering what I've said before, the statistical hypotheses to test are:
H₀: μ ≤ 54
H₁: μ > 54
No significance level is given for the test, so I'll use 5%, since it is one of the most common choices. Remember that the decision of the test may change at other levels.
α: 0.05
To use the Student's t statistic you need a variable with a normal distribution, so I ran a normality test on the sample. Normality is not rejected at the 5% level (p-value 0.1305), so it is reasonable to treat the study variable as normally distributed.
t = (X[bar] - μ) / (S/√n) ~ t₍ₙ₋₁₎
Using the complete sample, I obtained the following values:
n= 12
Sample mean X[bar]= 63.75 min
Sample standard deviation S= 29.26 min
t = (X[bar] - μ) / (S/√n) = (63.75 - 54) / (29.26/√12) = 1.15
The p-value is 0.1364
P(t₁₁≥ 1.15)= 1 - P(t₁₁< 1.15)= 1 - 0.8636= 0.1364
The p-value is greater than α, so the decision is to not reject the null hypothesis. This means that, with the complete sample, there is not enough evidence to conclude that the average time students spend in the lab is greater than 54 min.
Now I'll use the same sample but without the outliers. The statistical hypotheses and the level of significance don't change.
n= 10
Sample mean X[bar]= 62.10 min
Sample standard deviation S= 9.46 min
t = (X[bar] - μ) / (S/√n) = (62.10 - 54) / (9.46/√10) = 2.71
The p-value is 0.012
P(t₉≥ 2.71)= 1 - P(t₉< 2.71)= 1 - 0.9880= 0.012
Without the outliers, the p-value is less than the level of significance, so in this case the decision is to reject the null hypothesis. That means there is significant evidence, at the 5% level, that the population mean of the time students spend in the lab is greater than 54 min.
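If you want to check both tests yourself, here is a minimal Python sketch that recomputes the t statistic and the one-sided p-value from the summary values above (n, X[bar], S). The function names are my own, and the p-value is obtained by numerically integrating the t density with the standard library only, so the last decimals may differ slightly from a table or from statistical software.

```python
import math

def t_pdf(x, df):
    """Density of the Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T >= t) for T ~ t(df), using Simpson's rule on [0, |t|]."""
    b = abs(t)
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3          # P(0 <= T <= |t|)
    upper = 0.5 - area            # P(T >= |t|), by symmetry around 0
    return upper if t >= 0 else 1 - upper

def one_sample_t_upper(mean, sd, n, mu0):
    """Right-tailed one-sample t test (H1: mu > mu0) from summary statistics."""
    t = (mean - mu0) / (sd / math.sqrt(n))
    return t, t_upper_tail(t, n - 1)

alpha = 0.05
cases = {
    "with outliers": (63.75, 29.26, 12),
    "without outliers": (62.10, 9.46, 10),
}
for label, (mean, sd, n) in cases.items():
    t, p = one_sample_t_upper(mean, sd, n, mu0=54)
    decision = "reject H0" if p < alpha else "do not reject H0"
    print(f"{label}: t = {t:.2f}, p-value = {p:.4f} -> {decision}")
```

The decisions printed should match the ones reached above: H₀ is not rejected with the complete sample and is rejected once the two outliers are removed.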
             ; With outliers ; Without outliers
n            ; 12            ; 10
X[bar] (min) ; 63.75         ; 62.10
S (min)      ; 29.26         ; 9.46
As you can see when comparing the values, the outliers do not only affect the mean; they also greatly affect the standard deviation, giving the false idea that the data set has a greater dispersion than it does, and leading us to not reject a null hypothesis that would otherwise be rejected.
I hope it helps!