Sample Size Determination & Standard Error
## Page 1 – Deriving the Required Sample Size

### Key Given Quantities

- Population mean: $\mu$
- Population standard deviation: $\sigma$ (the numerical value is not stated explicitly; it is recovered below from the reported result)
- Desired standard error of the mean (SEM): $SE = 5$
- Unknown to determine: sample size $n$

### Fundamental Formula

The standard error of the mean is defined as

$$SE = \frac{\sigma}{\sqrt{n}}$$

### Algebraic Solution for n

1. Impose the target SEM:

   $$5 = \frac{\sigma}{\sqrt{n}}$$

2. Clear the denominator (multiply both sides by $\sqrt{n}$):

   $$5\sqrt{n} = \sigma$$

3. Isolate the square-root term (divide both sides by 5):

   $$\sqrt{n} = \frac{\sigma}{5}$$

4. Square both sides to solve for $n$:

   $$n = \left(\frac{\sigma}{5}\right)^2$$

### Numerical Result

Because the final notes report $n = 36$, the implicit population standard deviation must have been

$$\sigma = 30$$

(since $5 \times 6 = 30$ and $\sqrt{36} = 6$).
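The algebra above can be checked numerically. The following is a minimal Python sketch, not part of the original notes; the function names `standard_error`, `required_sample_size`, and `implied_sigma` are hypothetical helpers chosen for illustration. Rounding up to a whole observation is an added assumption, since the formula itself can give a non-integer $n$.

```python
import math


def standard_error(sigma: float, n: int) -> float:
    """Standard error of the mean: sigma / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return sigma / math.sqrt(n)


def required_sample_size(sigma: float, target_se: float) -> int:
    """Smallest whole n for which sigma / sqrt(n) <= target_se.

    Implements n = (sigma / target_se)^2 and rounds up. A tiny tolerance
    is subtracted first so floating-point noise does not push an exact
    answer (such as 36.0) up to the next integer.
    """
    if sigma < 0 or target_se <= 0:
        raise ValueError("sigma must be >= 0 and target_se must be > 0")
    exact_n = (sigma / target_se) ** 2
    return max(1, math.ceil(exact_n - 1e-12))


def implied_sigma(n: int, se: float) -> float:
    """Recover sigma from a reported n and SE: sigma = SE * sqrt(n)."""
    return se * math.sqrt(n)


if __name__ == "__main__":
    print(implied_sigma(36, 5))         # 30.0
    print(required_sample_size(30, 5))  # 36
    print(standard_error(30, 36))       # 5.0
```

Running the script reproduces the values in the notes: a reported $n = 36$ with $SE = 5$ implies $\sigma = 30$, and feeding $\sigma = 30$ back in returns $n = 36$.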
Therefore, a sample of 36 observations is required so that the sampling distribution of the mean has a standard error of exactly 5.

### Connections & Implications

- The derivation hinges on the inverse-square-root relationship between SEM and sample size: doubling the sample size does not halve the SEM. Because SEM shrinks only at the rate $1/\sqrt{n}$, halving it requires quadrupling $n$.
- In practical study design, knowing $\sigma$ (from pilot studies or historical data) lets researchers budget an adequate sample size before collecting data.
- A larger $\sigma$ inflates the required $n$ quadratically; conversely, tolerating a larger SEM (looser precision) shrinks the necessary $n$ by the same rule, so doubling the tolerated SEM cuts $n$ to a quarter.

## Page 2 – Plain-Language Explanation & Intuition

### Everyday Analogy

Imagine a city’s residents. You already know their overall average age $\mu$ and how much ages typically vary, $\sigma$. To save effort, you will not interview everyone; you will select a smaller group (a sample) and compute its average age.

- Population – the entire city.
- Sample – a manageable subset of residents.
- Sample mean – the average age you find in that subset.
- Standard error – how far that sample mean is likely to drift from the true city-wide mean just by random chance.

### Why Controlling SEM Matters

A smaller SEM means greater precision: your sample mean is likely to land close to $\mu$. Policymakers, quality-control engineers, and scientists all care about this precision to make confident decisions.

### Relationship Re-stated

The quantitative law
$$SE = \frac{\sigma}{\sqrt{n}}$$
shows:

- Bigger $\sigma$ → bigger uncertainty (harder to estimate the mean precisely).
- Bigger $n$ → smaller uncertainty (more information per average).

### Step-by-Step Arithmetic Recap

1. Start with $5 = \dfrac{\sigma}{\sqrt{n}}$.
2. Rearrange to $\sqrt{n} = \dfrac{\sigma}{5}$.
3. Square: $n = \left(\dfrac{\sigma}{5}\right)^2$.
4. Substitute $\sigma = 30$ → $n = \left(\dfrac{30}{5}\right)^2 = 6^2 = 36$.

### Practical Takeaway

With a sample of 36 people, the sample mean age has a standard error of 5 years: it typically lands within about $\pm 5$ years of the true population mean age (roughly two times in three, when the sampling distribution is approximately normal).

- If you could only survey 9 people, the SEM would jump to $\dfrac{30}{\sqrt{9}} = 10$, doubling your typical error.
- Survey 144 people, and the SEM would fall to $\dfrac{30}{\sqrt{144}} = \dfrac{30}{12} = 2.5$, halving the error.

### Broader Relevance

- Experimental Design – Clinical trials, manufacturing audits, and opinion polls all face the same balancing act: larger $n$ improves precision but costs more time and money.
- Ethical Aspect – Oversampling wastes resources; undersampling risks misleading conclusions. Correct calculation aligns ethics with efficiency.
- Statistical Foundation – This exercise exemplifies the Central Limit Theorem, which guarantees that