Jan 27

Sampling Distribution

The sampling distribution is the distribution of a sample statistic (here, the sample proportion, p hat) over all possible samples of a given size drawn from a population.
For this case, it is given that the population proportion is 40% (p = 0.4).

The formula for standard deviation of the statistic is:
- Standard Deviation = √(pq/n)
Where:
- p = population proportion (0.4 in this example)
- q = 1 - p (0.6 in this example)
- n = sample size
In this instance, the standard deviation calculated is about 15.5%, assuming n = 10, since √(0.4 * 0.6 / 10) ≈ 0.155. (With n = 100 it would be about 4.9%.)
If the sample size (n) is increased to 20, the standard deviation decreases to about 11.0%.
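
As a quick check of the arithmetic, the formula √(pq/n) can be evaluated for a few sample sizes. This is a minimal sketch; the function name `sd_of_proportion` and the list of sample sizes are chosen here for illustration and are not part of the notes.

```python
import math

def sd_of_proportion(p, n):
    """Standard deviation of the sample proportion: sqrt(p * q / n)."""
    q = 1 - p
    return math.sqrt(p * q / n)

p = 0.4  # population proportion from the example
for n in (10, 20, 100):  # illustrative sample sizes
    print(f"n = {n}: SD = {sd_of_proportion(p, n):.3f}")
```

The values come out to roughly 0.155, 0.110, and 0.049, which shows the standard deviation shrinking as n grows.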

The maximum standard deviation possible occurs when p = 0.5:
- Maximum Standard Deviation = √(0.5 * 0.5 / n) = 0.5 / √n
The product pq peaks at 0.25 when p = 0.5, so √(pq) has a maximum of 0.5. Any other value of p, or any larger n, can only bring the standard deviation down from this point.
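
One way to convince yourself that p = 0.5 gives the largest value is to scan a grid of p values and see where √(pq/n) peaks. The sketch below assumes n = 1 so that the result is just √(pq); the grid of candidates is an illustrative choice.

```python
import math

n = 1  # with n = 1 the standard deviation reduces to sqrt(p * q)

# Candidate proportions 0.01, 0.02, ..., 0.99
candidates = [i / 100 for i in range(1, 100)]

# Pick the p that maximizes sqrt(p * q / n)
best_p = max(candidates, key=lambda p: math.sqrt(p * (1 - p) / n))
max_sd = math.sqrt(best_p * (1 - best_p) / n)

print(best_p, max_sd)
```

For any other n, the maximum is the same 0.5 divided by √n.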

The parameter (p for a proportion, or a mean) is the population quantity being described:
- Examples: The average height of men aged 18-24 in the U.S. or the proportion of girls with blue eyes.
It's important to define which mean or proportion you are working with.

Important assumptions must be met prior to performing statistical tests:
- Independence: Individuals in the sample must be independent of each other.
  - Check: The sample should be a simple random sample and should not exceed 10% of the population.
- Sample Size: Should be large enough to assume normality using the Central Limit Theorem.
  - For proportions, check that np ≥ 10 and nq ≥ 10.
If the sample size is too small, the sampling distribution cannot be treated as approximately normal.
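
The numeric conditions above can be collected into a small helper. This is only a sketch: the function name `check_conditions` and the optional `population_size` argument are hypothetical, and whether the sample is truly a simple random sample has to be judged from the study design, not from a calculation.

```python
def check_conditions(n, p, population_size=None):
    """Return the numeric conditions for a one-proportion z procedure."""
    q = 1 - p
    results = {
        "np >= 10": n * p >= 10,
        "nq >= 10": n * q >= 10,
    }
    # The 10% condition can only be checked if the population size is known.
    if population_size is not None:
        results["n <= 10% of population"] = n <= 0.10 * population_size
    return results

small = check_conditions(n=10, p=0.4)
large = check_conditions(n=100, p=0.4, population_size=5000)
print(small)
print(large)
```

With n = 10 and p = 0.4, np is only 4, so the normal approximation is not justified; with n = 100 from a population of 5,000, all three checks pass.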

The test statistic often involves calculating a z-score:
- Formula:
  - Z = (p hat - p) / √(pq/n)
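Obtain the p-value: the probability, assuming the hypothesized value of p is true, of observing a sample statistic at least as extreme as the one actually observed.
- Depending on the alternative hypothesis, this is the probability that the test statistic is greater than the observed value, less than it, or at least as far from zero in either direction.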
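
The z-score and p-value can be computed with only the standard library, using the error function for the normal curve. This is a minimal sketch: the function names, the `tail` labels, and the sample values (p hat = 0.5, p = 0.4, n = 100) are assumptions made for illustration.

```python
import math

def normal_cdf(z):
    """P(Z <= z) for a standard normal variable."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def one_proportion_z(p_hat, p, n):
    """Z = (p_hat - p) / sqrt(p * q / n)."""
    return (p_hat - p) / math.sqrt(p * (1 - p) / n)

def p_value(z, tail):
    """p-value for a 'greater', 'less', or 'two-sided' alternative."""
    if tail == "greater":
        return 1 - normal_cdf(z)
    if tail == "less":
        return normal_cdf(z)
    if tail == "two-sided":
        return 2 * (1 - normal_cdf(abs(z)))
    raise ValueError("tail must be 'greater', 'less', or 'two-sided'")

# Hypothetical sample: 50 successes out of 100 when p = 0.4 is hypothesized
z = one_proportion_z(p_hat=0.5, p=0.4, n=100)
print(round(z, 2), round(p_value(z, "greater"), 4))
```

Here z is about 2.04, and the right-tailed p-value is about 0.02; the two-sided p-value is twice that.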

The study assesses whether the anti-cholesterol drug Gimfimrazole reduces heart attacks. Sample included:
- 2,000 men receiving the drug and 2,000 men receiving a placebo.
- Probability of heart attack in participants = 4% (0.04).
Calculation: What is the probability of at least 75 heart attacks in the treatment group?
- First check the conditions: the sample should be a simple random sample, and the individuals should be independent.
- Since it's a proportion problem, no standard deviation is given; it must be calculated from √(pq/n).
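
The calculation can be worked through as follows. This sketch assumes the 4% rate applies to the treatment group (that is, the drug has no effect) and uses the normal approximation without a continuity correction; the variable names are chosen for illustration.

```python
import math

def normal_cdf(z):
    """P(Z <= z) for a standard normal variable."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

n = 2000   # men in the treatment group
p = 0.04   # assumed probability of a heart attack
q = 1 - p

# Sample size conditions: np and nq should both be at least 10
conditions_ok = n * p >= 10 and n * q >= 10

p_hat = 75 / n                   # 75 heart attacks out of 2,000 men
sd = math.sqrt(p * q / n)        # standard deviation of the sample proportion
z = (p_hat - p) / sd             # z-score for 75 heart attacks
prob_at_least_75 = 1 - normal_cdf(z)

print(conditions_ok, round(sd, 4), round(z, 2), round(prob_at_least_75, 2))
```

Because 75 is below the expected count of np = 80, the z-score is negative (about -0.57) and the probability of at least 75 heart attacks is large, roughly 0.72.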

As we proceed through statistical tests and theories, it’s crucial to continuously validate assumptions and recalculate as necessary to ensure accurate results.
Lastly, keeping an organized approach in assessments, with set reminders and a list of familiar formulas, can simplify the testing process.