A survey of over 25,000 Americans aged between 18 and 24 years revealed the following: 88.1% of the 12,678 females and 84.9% of the 12,460 males had high school diplomas.

a) Do the data suggest that females are more likely to graduate from high school than males? Test at a significance level of 5%.
b) Set up a 95% confidence interval for the difference in the graduation rates between females and males.
c) State the assumptions and conditions necessary for the above inferences to hold.

1 Answer

Answer:

(a) Yes, the data suggest that females are more likely to graduate from high school than males.

(b) A 95% confidence interval for the difference in the graduation rates between females and males is [0.024, 0.040].

Explanation:

We are given that a survey of over 25,000 Americans aged between 18 and 24 years revealed the following: 88.1% of the 12,678 females and 84.9% of the 12,460 males had high school diplomas.

Let
p_1 = population proportion of females who had high school diplomas.


p_2 = population proportion of males who had high school diplomas.

(a) So, Null Hypothesis, H_0 : p_1\leq p_2 {means that females are no more likely than males to graduate from high school}

Alternate Hypothesis, H_A : p_1 > p_2 {means that females are more likely to graduate from high school than males}

The test statistic used here is the two-sample z-test statistic for proportions;

T.S. = \frac{(\hat p_1-\hat p_2)-(p_1-p_2)}{\sqrt{(\hat p_1(1-\hat p_1))/(n_1)+(\hat p_2(1-\hat p_2))/(n_2)} } ~ N(0,1)

where,
\hat p_1 = sample proportion of females having high school diplomas = 88.1%


\hat p_2 = sample proportion of males having high school diplomas = 84.9%


n_1 = sample size of females = 12,678

n_2 = sample size of males = 12,460

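So, taking p_1-p_2 = 0 at the boundary of the null hypothesis, the test statistic = \frac{(0.881-0.849)-0}{\sqrt{(0.881(1-0.881))/(12,678)+(0.849(1-0.849))/(12,460)} }

= 7.428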

The value of the standardized z-test statistic is 7.428.

Now, at a 5% level of significance, the z table gives a critical value of 1.645 for the right-tailed test.

Since the value of our test statistic is greater than the critical value (7.428 > 1.645), it falls in the rejection region, so we have sufficient evidence to reject the null hypothesis.

Therefore, we conclude that females are more likely to graduate from high school than males.
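The calculation in part (a) can be checked with a short Python sketch. It uses only the standard library and assumes the unpooled standard error, which is the form that reproduces the value 7.428; the function name `two_proportion_z` is just an illustrative choice.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(p1_hat, n1, p2_hat, n2):
    """z statistic for H0: p1 - p2 = 0, using the unpooled standard error."""
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    return (p1_hat - p2_hat) / se


# Survey figures from the question.
z = two_proportion_z(0.881, 12678, 0.849, 12460)

# Right-tailed test at the 5% level: reject H0 when z exceeds this value.
critical = NormalDist().inv_cdf(0.95)

# One-sided p-value, P(Z > z).
p_value = 1 - NormalDist().cdf(z)

print(round(z, 3), round(critical, 3), z > critical)
```

A pooled standard error is also commonly used for this test; with samples this large it should give almost the same statistic and the same conclusion.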

(b) Firstly, the pivotal quantity for finding the confidence interval for the difference in population proportions is given by;

P.Q. = \frac{(\hat p_1-\hat p_2)-(p_1-p_2)}{\sqrt{(\hat p_1(1-\hat p_1))/(n_1)+(\hat p_2(1-\hat p_2))/(n_2)} } ~ N(0,1)

where,
\hat p_1 = sample proportion of females having high school diplomas = 88.1%


\hat p_2 = sample proportion of males having high school diplomas = 84.9%


n_1 = sample size of females = 12,678

n_2 = sample size of males = 12,460

Here, the 95% confidence interval is constructed from the two-sample z statistic for proportions.

So, the 95% confidence interval for the difference in population proportions, (p_1-p_2), is;

P(-1.96 < N(0,1) < 1.96) = 0.95 {As the critical values of z that leave 2.5% in each tail are -1.96 & 1.96}

P(-1.96 < \frac{(\hat p_1-\hat p_2)-(p_1-p_2)}{\sqrt{(\hat p_1(1-\hat p_1))/(n_1)+(\hat p_2(1-\hat p_2))/(n_2)} } < 1.96) = 0.95

P( -1.96 * {\sqrt{(\hat p_1(1-\hat p_1))/(n_1)+(\hat p_2(1-\hat p_2))/(n_2)} } < {(\hat p_1-\hat p_2)-(p_1-p_2)} < 1.96 * {\sqrt{(\hat p_1(1-\hat p_1))/(n_1)+(\hat p_2(1-\hat p_2))/(n_2)} } ) = 0.95

P( (\hat p_1-\hat p_2)-1.96 * {\sqrt{(\hat p_1(1-\hat p_1))/(n_1)+(\hat p_2(1-\hat p_2))/(n_2)} } < (p_1-p_2) < (\hat p_1-\hat p_2)+1.96 * {\sqrt{(\hat p_1(1-\hat p_1))/(n_1)+(\hat p_2(1-\hat p_2))/(n_2)} } ) = 0.95

95% confidence interval for (p_1-p_2) = [ (\hat p_1-\hat p_2)-1.96 * {\sqrt{(\hat p_1(1-\hat p_1))/(n_1)+(\hat p_2(1-\hat p_2))/(n_2)} } , (\hat p_1-\hat p_2)+1.96 * {\sqrt{(\hat p_1(1-\hat p_1))/(n_1)+(\hat p_2(1-\hat p_2))/(n_2)} } ]

= [ (0.881-0.849)-1.96 * {\sqrt{(0.881(1-0.881))/(12,678)+(0.849(1-0.849))/(12,460)} } , (0.881-0.849)+1.96 * {\sqrt{(0.881(1-0.881))/(12,678)+(0.849(1-0.849))/(12,460)} } ]

= [ 0.032-1.96 * 0.00431 , 0.032+1.96 * 0.00431 ]

= [0.024, 0.040]

Therefore, a 95% confidence interval for the difference in the graduation rates between females and males is [0.024, 0.040]. Since the whole interval lies above 0, it agrees with the conclusion in part (a).
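As a rough check on the arithmetic in part (b), the interval can be recomputed with the Python standard library. This is only a sketch: the helper name `diff_proportion_ci` is hypothetical, and it uses the exact normal quantile (about 1.96) rather than the rounded table value. The margin of error comes out near 0.0084, so the endpoints are about 0.032 - 0.0084 and 0.032 + 0.0084.

```python
from math import sqrt
from statistics import NormalDist


def diff_proportion_ci(p1_hat, n1, p2_hat, n2, level=0.95):
    """Normal-approximation confidence interval for p1 - p2 (unpooled SE)."""
    # Two-sided critical value, e.g. about 1.96 for a 95% level.
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    return diff - z_star * se, diff + z_star * se


# Survey figures from the question.
lower, upper = diff_proportion_ci(0.881, 12678, 0.849, 12460)
print(f"[{lower:.3f}, {upper:.3f}]")  # prints [0.024, 0.040]
```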

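(c) The assumptions and conditions necessary for the above inferences to hold are;

  • Each sample must be a random sample that represents its population (females and males aged 18 to 24).
  • The two samples must be independent of each other, and the individuals within each sample must be independent (each sample should be less than 10% of its population).
  • Both samples must be large enough for the sampling distribution of \hat p_1-\hat p_2 to be approximately normal: each group should contain at least 10 people with a diploma and at least 10 without. Here the smallest such count is about 1,509 (females without diplomas), so the condition holds.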
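For proportions, one common way to judge whether the normal approximation is reasonable is the success/failure rule of thumb: each group should have at least 10 expected "successes" and 10 expected "failures". The sketch below applies that check to the survey figures; the threshold of 10 and the helper name `success_failure_ok` are assumptions for illustration, not part of the original answer.

```python
def success_failure_ok(p_hat, n, minimum=10):
    """Return True if both expected counts reach the rule-of-thumb minimum."""
    successes = p_hat * n        # people with a diploma
    failures = (1 - p_hat) * n   # people without a diploma
    return successes >= minimum and failures >= minimum


# Survey figures from the question.
females_ok = success_failure_ok(0.881, 12678)
males_ok = success_failure_ok(0.849, 12460)
print(females_ok, males_ok)
```

This only covers the sample-size condition; random sampling and independence have to be judged from how the survey was carried out.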
User Sachin Kumar
by
4.7k points