Inference is the process of using sample data to draw conclusions about a population.
The relationship between the population and the sample is defined as follows:
- Population Truth: Represented by the parameter p (the true population proportion).
- Sampling: Data is collected from the population using random sampling methods.
- Sample Statistic: The result of the sample is represented by p^.
- Sample Size: Represented by n.
There are two primary methods to infer the truth about a population:
- Hypothesis Test: Testing a claim about a population parameter (H0: p = a claimed value) to find a P-value.
- Confidence Interval: If the null hypothesis (H0) is rejected, we may try to estimate the true proportion (p) using a confidence interval.
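As a rough illustration of the hypothesis-test route, the sketch below computes a z statistic and a P-value for H0: p = p0. It assumes a two-sided alternative and uses the hypothesized value p0 for the standard deviation; the function name and the example counts (60 successes out of 100) are made up for illustration and do not come from these notes.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0):
    """Return (z, two-sided P-value) for H0: p = p0 (illustrative helper)."""
    p_hat = successes / n
    # Under H0 the standard deviation is built from the claimed value p0.
    sd_null = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / sd_null
    # Assumption: two-sided alternative, so both tails are counted.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical data: 60 successes in a sample of 100, testing H0: p = 0.5.
z, p_value = one_proportion_z_test(successes=60, n=100, p0=0.5)
print(f"z = {z:.2f}, P-value = {p_value:.4f}")
```

A small P-value suggests the sample would be unusual if H0 were true, which is the situation in which estimating p with a confidence interval becomes the natural next step.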
The One-Proportion Z-Interval
Definition: A confidence interval estimates the true population proportion (the parameter) by creating a numeric interval and attaching a percentage to it, known as the confidence level.
Applicability: This method is used when specific conditions (randomness, independence, and sample size) are met to find the confidence interval for the population proportion, p.
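The conditions above can be partly checked from the numbers alone. The sketch below is one possible check, not a definitive rule: it assumes the common textbook thresholds of at least 10 successes and 10 failures for sample size, and a sample no larger than 10% of the population for independence. Both thresholds, the function name, and the example numbers are assumptions, not values stated in these notes. Randomness depends on how the data were collected and cannot be verified in code.

```python
def check_interval_conditions(successes, n, population_size=None, min_count=10):
    """Return a dict of condition checks (illustrative helper, assumed thresholds)."""
    failures = n - successes
    checks = {
        # Sample size: enough successes and failures for the normal model.
        "sample_size": successes >= min_count and failures >= min_count,
    }
    if population_size is not None:
        # Independence: the sample is a small fraction of the population.
        checks["independence"] = n <= 0.10 * population_size
    # Randomness is a property of the sampling design, so it is not checked here.
    return checks


# Hypothetical survey: 45 successes out of 120, from a population of 5000.
print(check_interval_conditions(45, 120, population_size=5000))
```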
General Formula: The structure of the confidence interval is modeled as:
CI=p^±z∗×SE(p^)
Components of the Formula:
- Center of the Interval (Statistic): The sample proportion p^.
- Critical Value (z∗): A value picked based on the desired level of confidence.
- Standard Error (SE(p^)): The estimated standard deviation of the sample proportion, calculated from the sample itself.
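The three components can be put together in a few lines. The following is a minimal sketch, taking SE(p^) as the square root of p^q^/n with q^ = 1 − p^; the function name and the example counts (120 successes out of 400) are hypothetical.

```python
from math import sqrt


def one_proportion_z_interval(successes, n, z_star=1.96):
    """Return (lower, upper) bounds of p_hat ± z* × SE(p_hat)."""
    p_hat = successes / n          # center of the interval
    q_hat = 1 - p_hat              # complement of the sample proportion
    se = sqrt(p_hat * q_hat / n)   # standard error of p_hat
    me = z_star * se               # margin of error
    return p_hat - me, p_hat + me


# Hypothetical sample: 120 successes out of 400, at 95% confidence.
low, high = one_proportion_z_interval(successes=120, n=400, z_star=1.96)
print(f"95% CI: ({low:.3f}, {high:.3f})")
```

Passing z_star as a parameter keeps the same function usable for any confidence level.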
Standard Error (SE) and Margin of Error (ME)
Standard Error (SE(p^)): Because we do not know the true population parameter, we use the sample variation to estimate the standard deviation. The formula is:
SE(p^)=√(p^q^/n)
- Note: q^ is the complement of the sample proportion (1−p^).
Margin of Error (ME): This is the distance the interval extends above and below the sample statistic. It accounts for the variation expected from one random sample to the next (it is an expected variation, not a mistake). The formula is:
ME=z∗×√(p^q^/n)
Visual Representation of the Interval:
p^ − ME  ←  p^  →  p^ + ME
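Because n sits under a square root in the margin of error, a larger sample narrows the interval, but slowly. The sketch below illustrates this with made-up inputs (p^ = 0.5, z∗ = 1.96, and three sample sizes chosen for illustration); the function name is hypothetical.

```python
from math import sqrt


def margin_of_error(p_hat, n, z_star=1.96):
    """Return ME = z* × sqrt(p_hat × q_hat / n)."""
    q_hat = 1 - p_hat
    return z_star * sqrt(p_hat * q_hat / n)


# Each step quadruples the sample size, which cuts the margin of error in half.
for n in (100, 400, 1600):
    me = margin_of_error(p_hat=0.5, n=n)
    print(f"n = {n:4d}: ME = {me:.4f}")
```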
Critical Values (z∗) for Common Confidence Levels
The critical value (z∗) is determined by the specific level of confidence chosen for the interval. The most commonly used values are:
- 90% Confidence: z∗=1.645
- 95% Confidence: z∗=1.96
- 95% Confidence: z∗=1.96
- 98% Confidence: z∗=2.326
- 99% Confidence: z∗=2.576
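These values need not be memorized; they can be recovered from the standard normal distribution. The sketch below is one way to do so with Python's standard library, assuming the confidence level is the central area, so that half of the leftover area lies in each tail. The function name is hypothetical.

```python
from statistics import NormalDist


def critical_value(confidence):
    """Return z* such that the central `confidence` area lies within ±z*."""
    tail = (1 - confidence) / 2            # area left over in each tail
    return NormalDist().inv_cdf(1 - tail)  # quantile of the standard normal


for level in (0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%} confidence: z* = {critical_value(level):.3f}")
```

The same function handles confidence levels that are not in the list, such as 80%.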