2. A random sample of 29 employees of a large company has their systolic blood pressure checked. Summary statistics are provided in the table below. Assume that the systolic blood pressure of all U.S. adults follows a normal distribution with a mean of 122 mm Hg and a standard deviation of 20 mm Hg.

(a) Approximately what percent of all U.S. adults have systolic blood pressure greater than 142 mm Hg?

(b) Describe the distribution of systolic blood pressure for the 29 employees of this company that was sampled.

(c) The company CEO wants to know if the mean systolic blood pressure of employees at her company is higher than the national average. State the hypotheses for testing this concern.

(d) The conditions for the hypothesis test in part (c) were satisfied. The hypothesis test resulted in a t-score of 5.495 and a p-value of 3.591 × 10⁻⁶. Interpret the p-value in the context of this hypothesis test. What would this p-value lead you to conclude?

(e) Explain in context what it would mean to make a type II error for the hypothesis test in part (c).


In addition to recording the systolic blood pressure of the 29 employees at the company, their ages were also recorded. A linear regression model was fit to these data. Graphical and numerical summaries of this analysis are given below. Use this information to answer the questions that follow. (Image is attached below)


(f) Interpret the slope of the regression line in this context.

(g) Comment on the strength, direction, and form of the relationship between age and systolic blood pressure.

[Attached image: summary statistics and regression output for the 29 employees]
User Prelite
by
4.9k points

1 Answer

7 votes

Answer:

Explanation:

Hello!

The variable of interest is

X: Systolic blood pressure of a U.S. adult. (mmHg)

X~N(μ;σ²)

μ= 122 mmHg

σ= 20 mmHg

For all calculations you have to work under the standard normal distribution, because it is tabulated. It is easier and faster to standardize or "translate" all values of X into values of Z and look for the corresponding probabilities in the table than to manually calculate them using the density function of the normal distribution.

The standard normal distribution is derived from the normal distribution. Considering a random variable X with normal distribution, mean μ and variance σ², the variable Z =(X-μ)/σ ~N(0;1) is determined.

Any value of a random variable X with normal distribution can be "converted" by subtracting the mean from the value and dividing the result by the standard deviation.

Since the distribution is centered at zero, the table has two parts: the left part gives the cumulative probabilities for negative values of Z, P(Z≤-z)=α, and the right part gives the cumulative probabilities for positive values of Z, P(Z≤z)= 1 - α
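The symmetry between the two parts of the table can be checked with a small sketch. It uses `statistics.NormalDist` from the Python standard library in place of the printed table, and z = 1 is only an illustrative value:

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)  # Z ~ N(0;1)

z = 1.0  # illustrative value, not taken from the table
alpha = std_normal.cdf(-z)            # left part of the table: P(Z <= -z)
one_minus_alpha = std_normal.cdf(z)   # right part of the table: P(Z <= z)

# By symmetry around zero, the two cumulative probabilities add up to 1.
print(alpha, one_minus_alpha, alpha + one_minus_alpha)
```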

a)

You need to calculate the percentage/ proportion of U.S. adults that have a systolic pressure greater than 142 mmHg, symbolically:

P(X>142) = 1 - P(X≤142)

To standardize this value of the variable you have to do the following calculation:

Z =(X-μ)/σ= (142-122)/20= 1

Now you look in the right part of the table for the cumulative probability up to z= 1.00

(Remember, the first column of the table shows you the integer part and first decimal digit, the first row of the table shows you the second decimal digit of the z value)

P(Z≤1)= 0.84134

Now you calculate the asked value:

P(X>142) = 1 - P(X≤142)= 1 - P(Z≤1)= 1 - 0.84134= 0.15866, so approximately 15.9% of U.S. adults have a systolic blood pressure greater than 142 mmHg.
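If you want to double check the table lookup, here is a minimal sketch of the same calculation. It assumes Python's standard-library `statistics.NormalDist` as a stand-in for the table; the variable names are only illustrative:

```python
from statistics import NormalDist

mu, sigma = 122, 20   # national mean and standard deviation (mmHg)
x = 142               # threshold from part (a)

z = (x - mu) / sigma                 # standardized value
p_below = NormalDist().cdf(z)        # P(Z <= z)
p_above = 1 - p_below                # P(X > 142)

print(f"z = {z:.2f}, P(X > {x}) = {p_above:.5f}")
```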

b)

If the employees' blood pressure follows the national distribution, the sample mean is derived from a random variable with a normal distribution, so it shares that distribution, with the exception that its variance is divided by the sample size:

X[bar]~N(μ;σ²/n)

μ= 122 mmHg

σ/√n= 20/√29= 3.71 mmHg
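The standard deviation of the sample mean can be reproduced with a short sketch (standard library only; the names are illustrative):

```python
import math

sigma = 20   # population standard deviation (mmHg)
n = 29       # sample size

standard_error = sigma / math.sqrt(n)  # standard deviation of the sample mean
print(f"{standard_error:.2f} mmHg")
```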

c)

The claim is that the mean systolic pressure of the employees is higher than the national average, symbolically μ> 122

The statistical hypotheses are:

H₀: μ≤ 122

H₁: μ> 122

d)

t= 5.495 p-value: 0.000003591

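In this example the test statistic is based on the sample mean, and the p-value is 3.591*10⁻⁶. This means that if the true mean were 122 mmHg, only about 0.0003591% of the samples of size n=29 would produce a sample mean giving evidence as strong as (or stronger than) the current sample that μ is greater than 122 mmHg. Since the p-value is far below any usual significance level (for example 0.05), you reject H₀: there is strong evidence that the mean systolic blood pressure of the company's employees is higher than the national average.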
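As a rough check of the reported p-value, the upper tail of a Student's t distribution can be integrated numerically. This sketch assumes a one-sample t-test with n - 1 = 28 degrees of freedom (the question does not state the degrees of freedom) and uses only the standard library; the helper names, the integration limit and the 0.05 level are illustrative choices:

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def t_upper_tail(t_obs, df, upper=100.0, steps=100_000):
    """Approximate P(T > t_obs) with composite Simpson's rule on [t_obs, upper].

    steps must be even; the tail beyond `upper` is negligible here.
    """
    h = (upper - t_obs) / steps
    total = t_pdf(t_obs, df) + t_pdf(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(t_obs + i * h, df)
    return total * h / 3

t_score = 5.495
df = 29 - 1                      # assumption: one-sample t-test
p_value = t_upper_tail(t_score, df)

alpha = 0.05                     # illustrative significance level
print(f"p-value is about {p_value:.3e}; reject H0: {p_value < alpha}")
```

The result should land close to the reported 3.591*10⁻⁶, which supports rejecting H₀ at any usual significance level.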

e)

A type II error occurs when you fail to reject the null hypothesis even though it is false. In this case, it means failing to reject that the mean systolic pressure of the company's employees is at most the national average (122 mmHg) when in fact it is higher than the national average.

f)

The variable "X: Age of an employee" was recorded, and a linear regression of the systolic blood pressure as a function of the employee's age was estimated.

Ŷᵢ= 97.0771 + 0.9493Xᵢ

The slope, 0.9493 mmHg per year, is the estimated change in the average systolic blood pressure of the company's employees for each additional year of age.
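To see the slope at work, here is a small sketch that evaluates the estimated line at two ages one year apart. The ages 40 and 41 are hypothetical, chosen only for illustration, and `predicted_sbp` is an illustrative helper name:

```python
INTERCEPT = 97.0771  # estimated intercept (mmHg)
SLOPE = 0.9493       # estimated slope (mmHg per year of age)

def predicted_sbp(age_years):
    """Estimated mean systolic blood pressure (mmHg) for a given age."""
    return INTERCEPT + SLOPE * age_years

# Hypothetical ages, one year apart
at_40 = predicted_sbp(40)
at_41 = predicted_sbp(41)
difference = at_41 - at_40  # equals the slope, up to floating-point rounding

print(f"{at_40:.4f} {at_41:.4f} {difference:.4f}")
```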

g)

To determine the direction of the relationship between the two variables, you have to analyze the slope of the equation:

As you can see in the graphic, the slope of the regression is positive, which means there is a positive association between the systolic pressure and the age of the employees. I.e. as the age of the employees increases, their systolic pressure tends to increase as well, on average.

To determine the strength of the regression between these two variables you have to analyze the coefficient of determination R²:

The coefficient of determination gives you an idea of how much of the variability of the dependent variable (Y) is explained by the explanatory variables under the estimated regression. It takes values between 0 and 1, or 0 and 100% if expressed as a percentage. The closer the coefficient is to zero, the weaker the relationship between these two variables.

The closer the coefficient is to 100%, the stronger the relationship between these two variables.

R²= 0.712

71.2% of the variability of the systolic pressure is explained by the age of the employees under this estimated model Ŷᵢ= 97.0771 + 0.9493Xᵢ

An R² this close to 100% indicates a fairly strong linear relationship, so the regression is worth using.
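In a simple linear regression (one explanatory variable), the correlation coefficient r is the square root of R² with the sign of the slope. A quick sketch using the values above, standard library only:

```python
import math

r_squared = 0.712   # coefficient of determination from the output
slope = 0.9493      # estimated slope, used only for its sign

# r takes the sign of the slope in simple linear regression
r = math.copysign(math.sqrt(r_squared), slope)
print(f"r is about {r:.2f}")
```

A correlation of roughly 0.84 is consistent with a strong, positive relationship.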

I hope this helps!

User Sreekanth Pothanis
by
4.6k points