2 votes
Define the function `diabetes_test_statistic`, which should return exactly one simulated statistic of the absolute distance between the observed prevalence and the true population prevalence under the null hypothesis. Make sure that your simulated sample is the same size as your original sample. Hint: The array `diabetes_proportions` contains the proportions of the population without and with diabetes, respectively.

by User Chemark (7.8k points)

1 Answer

3 votes

Here's an example definition of the function diabetes_test_statistic:

```
import numpy as np

def diabetes_test_statistic(sample, proportions):
    # proportions = [share without diabetes, share with diabetes] under the null

    # Simulate a sample of the same size as the original under the null hypothesis
    null_sample = np.random.choice([0, 1], size=len(sample), p=proportions)

    # Prevalence of diabetes (share of 1s) in the simulated sample
    null_prevalence = np.mean(null_sample)

    # Absolute distance between the simulated prevalence and the population prevalence
    test_statistic = np.abs(null_prevalence - proportions[1])

    return test_statistic
```

This function takes two arguments: `sample`, the original sample of patients coded as 0 (no diabetes) or 1 (diabetes), and `proportions`, an array holding the true population proportions of people without and with diabetes, in that order. It simulates a new sample under the null hypothesis by drawing `len(sample)` values from `[0, 1]` with those probabilities, so the simulated sample is the same size as the original. It then computes the prevalence of diabetes in the simulated sample as the mean of the draws and returns the absolute distance between that prevalence and the population prevalence, `proportions[1]`. The original sample is used only for its size; the observed statistic to compare against is computed the same way from the real data, as `np.abs(np.mean(sample) - proportions[1])`.
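A single simulated statistic is usually one draw out of many. As a rough sketch of how such a function tends to be used, the code below repeats the simulation and checks how often the simulated distance is at least as large as the observed one. The names `simulate_statistic` and `empirical_p_value`, the toy sample, the seed, and the 10,000 repetitions are assumptions made for illustration and are not part of the question. It also swaps in NumPy's `Generator.binomial`, which draws the number of simulated cases directly instead of drawing each patient with `np.random.choice`.

```python
import numpy as np

def simulate_statistic(n, proportions, rng):
    """Return one simulated |sample prevalence - population prevalence| under the null."""
    p_diabetes = proportions[1]            # second entry: share with diabetes
    cases = rng.binomial(n, p_diabetes)    # number of simulated patients with diabetes
    return abs(cases / n - p_diabetes)

def empirical_p_value(sample, proportions, repetitions=10_000, seed=0):
    """Return the observed statistic and the share of simulations at least as extreme."""
    rng = np.random.default_rng(seed)      # fixed seed (assumption) for repeatable runs
    n = len(sample)                        # simulated samples match the original size
    observed = abs(np.mean(sample) - proportions[1])
    simulated = np.array(
        [simulate_statistic(n, proportions, rng) for _ in range(repetitions)]
    )
    return observed, np.mean(simulated >= observed)

# Hypothetical toy data: 500 patients, 30 of them with diabetes (coded 1)
toy_sample = np.array([1] * 30 + [0] * 470)
toy_proportions = np.array([0.9, 0.1])     # [without, with] under the null

observed, p_value = empirical_p_value(toy_sample, toy_proportions)
print("observed statistic:", observed)
print("empirical p-value:", p_value)
```

Passing the sample size and a random generator explicitly keeps each simulated draw independent of the real data, which is what the null hypothesis requires.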

by User Jack Chorley (8.1k points)