Lecture 7(1)
In this chapter, we integrate concepts from previous chapters:
Population and sample statistics (Chapters 1-3)
Probability distributions (Chapters 4-6)
Goal: Make inferences about populations using samples.
Element: Entity from which data is collected.
Population: Entire collection of elements of interest.
Sample: Subset of the population.
Sampled Population: The population from which a sample is drawn.
Frame: List of elements for selecting a sample.
Samples are selected to gather data answering research questions about a population.
Sample results provide only estimates of population characteristics; with proper sampling methods, these estimates can be good approximations.
Finite populations are often defined by lists such as membership rosters or inventory numbers.
Simple Random Sample: Each possible sample of size n has the same probability of selection.
With Replacement: Each element can be selected more than once.
Without Replacement: More common; each element can only appear once in the sample.
In large sampling projects, computer-generated random numbers are often used to select the sample.
Example: selecting a sample from the 900 applicants for admission to St. Andrew's College:
Assign a random number to each applicant.
Select the 30 applicants with the smallest random numbers as the sample.
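The two selection steps above can be sketched in Python. This is a minimal illustration, not the procedure used by the college: the applicant IDs 1 to 900, the seed, and the function name select_simple_random_sample are assumptions made for the example.

```python
import random

def select_simple_random_sample(population_ids, n, seed=None):
    """Assign a random number to each element and keep the n smallest."""
    rng = random.Random(seed)
    # Step 1: pair every element with its own random number in [0, 1).
    tagged = [(rng.random(), element) for element in population_ids]
    # Step 2: sort by the random number and keep the first n elements.
    # Each element appears once, so this samples without replacement.
    tagged.sort(key=lambda pair: pair[0])
    return [element for _, element in tagged[:n]]

# Hypothetical applicant IDs 1..900 (the notes give only the count).
applicants = list(range(1, 901))
sample = select_simple_random_sample(applicants, n=30, seed=7)
print(sorted(sample))
```

Because every applicant receives an independent random number, each possible set of 30 applicants is equally likely to end up with the 30 smallest values, which matches the definition of a simple random sample.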
Infinite populations: Situations where obtaining a full list of the population's elements is not feasible.
Examples: ongoing manufacturing processes, bank transactions.
A random sample from such a population must satisfy two conditions:
Each element selected comes from the population of interest.
Each element is selected independently.
Statistical Inference: Inferring population characteristics from sample data.
Point Estimation: Inferring population parameters using sample statistics.
Point estimates are single values (e.g., (\bar{x}) for population mean (\mu)).
When does a sample statistic accurately estimate population parameters?
When does it produce precise estimates?
What methods can facilitate these assessments?
Example: Estimating the mean SAT score and the proportion wanting on-campus housing among the 900 applicants, using a sample of 30.
Calculated point estimates:
(\bar{x} = 1684) - Sample mean.
(s = 85.2) - Sample standard deviation.
(\bar{p} = 0.67) - Sample proportion wanting on-campus housing.
Population mean SAT score: (\mu = 1697)
Population standard deviation: (\sigma = 87.4)
Proportion wanting on-campus housing: (p = 0.72)
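To see how far each point estimate lands from its parameter, the values above can be compared directly. The sketch below uses only the numbers given in these notes; the dictionary keys and the function name estimation_errors are illustrative choices.

```python
# Point estimates from the sample of 30 applicants (values from the notes).
point_estimates = {
    "mean SAT": 1684.0,
    "std dev SAT": 85.2,
    "proportion on-campus": 0.67,
}
# Corresponding population parameters for all 900 applicants.
population_parameters = {
    "mean SAT": 1697.0,
    "std dev SAT": 87.4,
    "proportion on-campus": 0.72,
}

def estimation_errors(estimates, parameters):
    """Return |point estimate - population parameter| for each quantity."""
    return {name: abs(estimates[name] - parameters[name]) for name in estimates}

errors = estimation_errors(point_estimates, population_parameters)
for name, err in errors.items():
    print(f"{name}: |estimate - parameter| = {err:.2f}")
```

In practice the population parameters are unknown, so these differences cannot be computed; the comparison is possible here only because the example supplies both sets of values.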
Sampling distribution of (\bar{x}): Probability distribution of all possible sample means from repeated sampling.
Key features:
Expected Value: (E[\bar{x}] = \mu)
A point estimator is unbiased if its expected value equals the population parameter, so (\bar{x}) is an unbiased estimator of (\mu).
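Central limit theorem: For large sample sizes, the sampling distribution of (\bar{x}) is approximately normal.
This holds regardless of the shape of the population's distribution, provided its variance is finite.
For most applications, a sample size of (n \geq 30) is large enough for the normal approximation to be adequate.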
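A small simulation can make the large-sample behavior concrete. The sketch below assumes a hypothetical, strongly skewed population (exponential with mean 1 and standard deviation 1) that does not appear in these notes, draws repeated samples of size 30, and compares the mean and spread of the sample means with (\mu) and (\sigma/\sqrt{n}).

```python
import random
from math import sqrt
from statistics import mean, stdev

def simulate_sample_means(n, repetitions, seed=0):
    """Draw repeated samples from a skewed population and return their means."""
    rng = random.Random(seed)
    # Hypothetical population: exponential with mean 1 and std dev 1.
    return [
        mean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(repetitions)
    ]

sample_means = simulate_sample_means(n=30, repetitions=2000)
# The average of the sample means should sit near the population mean,
# and their spread should sit near sigma / sqrt(n).
print(f"mean of sample means:    {mean(sample_means):.3f} (population mean 1.000)")
print(f"std dev of sample means: {stdev(sample_means):.3f} (sigma/sqrt(n) = {1 / sqrt(30):.3f})")
```

Plotting a histogram of sample_means would show a roughly bell-shaped curve even though the underlying population is far from normal.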
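Problem: Calculate the probability that the sample mean from a simple random sample of 30 applicants falls within +/- 10 of the population mean SAT score of 1697, i.e., that (\bar{x}) lies between 1687 and 1707.
Solution: Compute the standard error of the mean, (\sigma_{\bar{x}} = \sigma/\sqrt{n} = 87.4/\sqrt{30} \approx 15.96), convert the interval endpoints to z-scores, (z = \pm 10/15.96 \approx \pm 0.63), and read the probability from the standard normal distribution: (P(1687 \leq \bar{x} \leq 1707) \approx 0.47).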
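The same calculation can be checked numerically. This is a minimal sketch using Python's standard library; the function name prob_sample_mean_within is an illustrative choice, and the finite population correction is ignored on the assumption that sampling 30 of 900 applicants (about 3%) makes it negligible.

```python
from math import sqrt
from statistics import NormalDist

def prob_sample_mean_within(mu, sigma, n, margin):
    """Return (P(|x_bar - mu| <= margin), standard error) under a normal model."""
    standard_error = sigma / sqrt(n)
    # Sampling distribution of the sample mean: normal(mu, sigma / sqrt(n)).
    sampling_dist = NormalDist(mu=mu, sigma=standard_error)
    probability = sampling_dist.cdf(mu + margin) - sampling_dist.cdf(mu - margin)
    return probability, standard_error

probability, standard_error = prob_sample_mean_within(mu=1697, sigma=87.4, n=30, margin=10)
print(f"standard error: {standard_error:.2f}")
print(f"P(1687 <= x_bar <= 1707): {probability:.3f}")
```

Calling the function with a larger n shows how the probability of landing within 10 points of (\mu) rises as the standard error shrinks.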
Sampling distribution of (\bar{p}): Probability distribution for the sample proportion.
Expected value: (E[\bar{p}] = p)
Problem: For St. Andrew's College, calculate the probability that the sample proportion from a sample of 30 applicants falls within +/- 0.05 of the population proportion wanting on-campus housing ((p = 0.72)), i.e., that (\bar{p}) lies between 0.67 and 0.77.
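One way to work this problem is with the normal approximation to the sampling distribution of (\bar{p}), using the standard error (\sqrt{p(1-p)/n}). The sketch below is an assumption-laden illustration: the function name is hypothetical, and the check that (np \geq 5) and (n(1-p) \geq 5) is a common rule of thumb for the approximation that these notes do not state.

```python
from math import sqrt
from statistics import NormalDist

def prob_sample_proportion_within(p, n, margin):
    """Normal-approximation probability that p_bar is within margin of p."""
    # Rule-of-thumb check (an assumption, not from the notes).
    large_enough = n * p >= 5 and n * (1 - p) >= 5
    standard_error = sqrt(p * (1 - p) / n)
    sampling_dist = NormalDist(mu=p, sigma=standard_error)
    probability = sampling_dist.cdf(p + margin) - sampling_dist.cdf(p - margin)
    return probability, standard_error, large_enough

probability, standard_error, large_enough = prob_sample_proportion_within(
    p=0.72, n=30, margin=0.05
)
print(f"normal approximation reasonable: {large_enough}")
print(f"standard error: {standard_error:.4f}")
print(f"P(0.67 <= p_bar <= 0.77): {probability:.3f}")
```

With a sample of only 30, the chance of landing within 0.05 of the population proportion is below one half, which is a reason to consider a larger sample when more precision is needed.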
Unbiasedness: (E[\hat{\theta}] = \theta) (the expected value of the estimator equals the population parameter).
Consistency: As the sample size increases, the values of the estimator tend to get closer to the true parameter value.
Efficiency: Among unbiased estimators of the same parameter, the one with the smallest variance is preferred.
Compare candidate point estimators on unbiasedness, consistency, and efficiency to choose the best estimator for a given parameter.
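As an illustration of such a comparison, the sketch below simulates two candidate estimators of a population mean: the sample mean and the sample median. It assumes a normally distributed population with (\mu = 1697) and (\sigma = 87.4); the normality assumption, the number of repetitions, and the function name are choices made for this example and are not part of the notes.

```python
import random
from statistics import mean, median, pvariance

def compare_estimators(mu, sigma, n, repetitions, seed=0):
    """Simulate the sample mean and sample median over repeated samples."""
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(repetitions):
        # Assumed normal population; real SAT scores need not be normal.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(mean(sample))
        medians.append(median(sample))
    # For each estimator: (average value, variance across repetitions).
    return {
        "sample mean": (mean(means), pvariance(means)),
        "sample median": (mean(medians), pvariance(medians)),
    }

results = compare_estimators(mu=1697, sigma=87.4, n=30, repetitions=3000)
for name, (average, variance) in results.items():
    print(f"{name}: average = {average:.1f}, variance = {variance:.1f}")
```

Under this assumed population both estimators average out close to (\mu), which is consistent with unbiasedness, while the sample mean shows the smaller variance and is therefore the more efficient of the two.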