Sampling Techniques and Sample Size Determination

Fundamentals of Sampling and Population

Key Definitions

Sample: A specific set of elements or individuals selected from a larger population for the purpose of a study.
Population: The complete set of elements or people from which the sample is drawn and to which the results are intended to generalize.
Sampling: The formal process of drawing elements from a population to construct a sample.
Representative Sample: A sample that accurately reflects or resembles the characteristics of the population from which it was drawn.
Equal Probability of Selection Method (EPSEM): A sampling procedure in which every individual element in the population has an equal probability of being selected into the sample.

Statistics vs. Parameters

Statistic: A numerical characteristic derived from sample data. Common examples include the sample mean ( $\bar{x}$ ) and the sample standard deviation ( $s$ ).
Parameter: A numerical characteristic representing the entire population. Common examples include the population mean ( $\mu$ ) and the population standard deviation ( $\sigma$ ).
Sampling Error: The numerical difference between the value of a sample statistic and the actual value of the population parameter.
Sampling Frame: A comprehensive list of every element within a population from which the sample is drawn.
Response Rate: The percentage of individuals originally selected for the sample who actually complete the study or participate in the research.
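To make the statistic/parameter distinction concrete, the following is a minimal sketch that computes both on simulated data. The population of 1,000 scores, the seed, and the sample size of 50 are illustrative assumptions, not values from these notes.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 1,000 simulated test scores (illustrative only).
population = [rng.gauss(100, 15) for _ in range(1000)]

# Parameters describe the entire population.
mu = statistics.mean(population)
sigma = statistics.pstdev(population)  # population SD (divides by N)

# Statistics describe one sample drawn from that population.
sample = rng.sample(population, 50)
x_bar = statistics.mean(sample)
s = statistics.stdev(sample)  # sample SD (divides by n - 1)

# Sampling error: sample statistic minus population parameter.
sampling_error = x_bar - mu
print(f"mu = {mu:.2f}, x_bar = {x_bar:.2f}, sampling error = {sampling_error:+.2f}")
```

Changing the seed draws a different sample and therefore a different sampling error, while the parameters of this fixed population stay the same.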

Random Sampling Techniques

Simple Random Sampling

This technique involves choosing a sample such that every member of the population has an equal chance of being selected (adhering to the EPSEM principle).
Random number generators simplify the selection process.
Online resources for this purpose include www.randomizer.org and www.random.org.
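As an alternative to the online tools, simple random sampling can be sketched with Python's standard library. The sampling frame of 200 labelled people and the helper name `simple_random_sample` are hypothetical.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n elements without replacement; every element is equally likely (EPSEM)."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Hypothetical sampling frame listing every member of the population.
frame = [f"person_{i:03d}" for i in range(1, 201)]

sample = simple_random_sample(frame, 20, seed=1)
print(sample)
```

Passing a seed makes the draw reproducible, which helps when the selection needs to be documented or audited.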

Stratified Random Sampling

This method involves drawing random samples from different specific groups or sub-populations, known as strata.
Group Requirements: Strata must be mutually exclusive (each individual belongs to only one group) and, taken together, should cover the entire population.
Strata Types: Groups can be based on categorical data (nominal or ordinal) or quantitative data (interval or ratio).
Proportional Stratified Sampling: This specific form ensures that the proportion of each subgroup in the sample matches the proportion of that subgroup in the total population.

Example of Proportional Stratified Sampling

Strata: Gender (Males/Females).
Population: Presidents of the American Psychological Association (APA), where $N = 122$ .
Population Distribution: 14 female presidents (about $11\%$ ) and 108 male presidents (about $89\%$ ).
Desired Sample Size ( $n$ ): 100 individuals.
Resulting Sample Selection: 11 female presidents and 89 male presidents are drawn randomly, so the sample keeps approximately the same $11\%$ and $89\%$ distribution as the population.
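The allocation in this example can be reproduced with a short sketch. It rounds with a largest-remainder rule so the stratum counts always sum to $n$; that rounding rule, the function name, and the placeholder member IDs are assumptions made for illustration.

```python
import random

def proportional_allocation(stratum_sizes, n):
    """Split n across strata in proportion to stratum size (largest-remainder rounding)."""
    total = sum(stratum_sizes.values())
    alloc = {name: (n * size) // total for name, size in stratum_sizes.items()}
    remainders = {name: (n * size) % total for name, size in stratum_sizes.items()}
    leftover = n - sum(alloc.values())
    # Hand any leftover seats to the strata with the largest remainders.
    for name in sorted(remainders, key=remainders.get, reverse=True)[:leftover]:
        alloc[name] += 1
    return alloc

# Placeholder IDs standing in for the 14 female and 108 male APA presidents.
strata = {
    "female": [f"F{i}" for i in range(1, 15)],
    "male": [f"M{i}" for i in range(1, 109)],
}

allocation = proportional_allocation({k: len(v) for k, v in strata.items()}, 100)
print(allocation)  # {'female': 11, 'male': 89}

# Draw a simple random sample of the allocated size within each stratum.
rng = random.Random(0)
sample = {name: rng.sample(members, allocation[name]) for name, members in strata.items()}
```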

Cluster Random Sampling

This involves the random selection of established groups (clusters) rather than individuals.
Clusters: Collective units containing multiple elements, such as neighborhoods, families, schools, or classrooms.
One-stage Cluster Sampling: The researcher randomly selects clusters and includes every individual within those selected clusters in the study (e.g., randomly selecting 15 psychology classrooms and testing every student in them).
Two-stage Cluster Sampling: The researcher first randomly selects clusters and then randomly selects individual participants within those chosen clusters (e.g., randomly selecting 30 psychology classrooms and then randomly selecting 10 students from each of those classrooms).
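The difference between the two variants can be sketched as follows. The 40 classrooms of 25 students each are hypothetical data, and the function names are illustrative.

```python
import random

def one_stage_cluster_sample(clusters, n_clusters, rng):
    """Randomly pick clusters, then keep every member of each chosen cluster."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, rng):
    """Randomly pick clusters, then randomly pick members within each chosen cluster."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [m for name in chosen for m in rng.sample(clusters[name], n_per_cluster)]

# Hypothetical population: 40 classrooms with 25 students each.
clusters = {
    f"class_{c:02d}": [f"class_{c:02d}_student_{s:02d}" for s in range(1, 26)]
    for c in range(1, 41)
}

rng = random.Random(7)
one_stage = one_stage_cluster_sample(clusters, 15, rng)      # 15 classrooms, all students
two_stage = two_stage_cluster_sample(clusters, 30, 10, rng)  # 30 classrooms, 10 students each
print(len(one_stage), len(two_stage))  # 375 300
```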

Systematic Sampling

This technique follows three steps:
1. Determine the sampling interval ( $k$ ) by dividing the population size ( $N$ ) by the desired sample size ( $n$ ): $k = \frac{N}{n}$ .
2. Randomly select a starting number between 1 and $k$ . This person is the first participant.
3. Include every $k^{th}$ element following that starting number in the sample.
Periodicity: A potential (though uncommon) problem where a cyclical pattern exists in the sampling frame that happens to coincide with the interval $k$ , leading to a biased sample.

Example of Systematic Sampling

Population ( $N$ ): 100
Sample ( $n$ ): 10
Interval ( $k$ ): 10
Selection: If the randomly selected starting number between 1 and 10 is 5, the sample will include the $5^{th}$ person, plus the $15^{th}$ , $25^{th}$ , $35^{th}$ , $45^{th}$ , $55^{th}$ , $65^{th}$ , $75^{th}$ , $85^{th}$ , and $95^{th}$ .
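The three steps can be sketched in a few lines. The function name is illustrative, and the sketch assumes the sampling frame is numbered 1 to $N$; when $N$ is not an exact multiple of $n$, it rounds $k$ down and keeps the first $n$ positions.

```python
import random

def systematic_positions(N, n, start=None, seed=None):
    """Return the 1-based frame positions selected by systematic sampling."""
    k = N // n  # step 1: sampling interval
    if start is None:
        start = random.Random(seed).randint(1, k)  # step 2: random start in 1..k
    return list(range(start, N + 1, k))[:n]  # step 3: every k-th element

print(systematic_positions(100, 10, start=5))
# [5, 15, 25, 35, 45, 55, 65, 75, 85, 95]
```

Passing `start=5` reproduces the worked example; leaving `start` unset draws the starting number at random.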

Nonrandom Sampling Techniques

Nonrandom techniques generally produce biased and non-representative samples.

Convenience Sampling: Using participants who are readily available to the researcher (e.g., surveying college students on campus).
Quota Sampling: The researcher identifies specific quotas for individual groups (e.g., 25 males and 25 females; or 15 freshmen, 15 sophomores, 15 juniors, and 15 seniors) but uses convenience sampling to fill those quotas.
Purposive Sampling: Identifying and selecting individuals who possess specific characteristics required for the study (e.g., recruiting only college freshmen diagnosed with ADHD).
Snowball Sampling: Existing research participants help identify and recruit other potential participants. This is especially useful for hard-to-reach populations, such as Spanish-speaking ESL students or parents of children with autism.

Random Selection vs. Random Assignment

Random Selection: The process of choosing participants from the population to be in the sample. Its primary purpose is to ensure the sample is representative of the population.
Random Assignment: The process of assigning the chosen participants to different experimental conditions or groups (e.g., treatment vs. control). Its primary purpose is to create equivalent groups to allow for the investigation of causality.
- Example: 20 students sign up for a study and are subsequently randomly assigned to either the experimental treatment group or the control group.
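Random assignment in the example above can be sketched as a shuffle followed by an even split. The participant labels and the function name are hypothetical.

```python
import random

def randomly_assign(participants, seed=None):
    """Shuffle participants and split them into two groups of (near) equal size."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}

# The 20 students who signed up (placeholder labels).
volunteers = [f"student_{i:02d}" for i in range(1, 21)]

groups = randomly_assign(volunteers, seed=3)
print(len(groups["treatment"]), len(groups["control"]))  # 10 10
```

Note that the volunteers here were not randomly selected from a population; only their assignment to conditions is random.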

Determining Sample Size

General Rules of Thumb

If the population size is less than 100, use the entire population.
Larger sample sizes increase the likelihood of detecting an effect or relationship.
Conduct a literature review to see sample sizes used in similar research.
Use a sample size calculator such as G*Power.

When to Increase Sample Size

Heterogeneity: When the population is composed of widely different types of people.
Subcategory Analysis: When you intend to break the sample down into multiple categories (e.g., comparing males and females separately).
Precision: When you need a narrow or more precise confidence interval.
Effect Size: When the expected effect or relationship is small or weak.
Sampling Efficiency: When using less efficient methods, such as cluster sampling.
Statistical Requirements: Certain high-level statistical techniques require larger $n$.
Attrition: When a low response rate or participant dropout is anticipated.
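One common way to plan for a low response rate is to divide the required number of completed cases by the expected response rate and round up. The sketch below assumes that approach; the response rates of 60% and 75% are hypothetical inputs, not figures from these notes.

```python
def invitations_needed(n_required, response_rate_pct):
    """Number of people to invite so that about n_required complete the study."""
    if not 0 < response_rate_pct <= 100:
        raise ValueError("response_rate_pct must be in (0, 100]")
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-n_required * 100 // response_rate_pct)

print(invitations_needed(102, 60))  # 170
print(invitations_needed(128, 75))  # 171
```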

Statistical Power and Error Types

Type I Error (Alpha, $\alpha$ ): A "false positive" error. This occurs when the researcher rejects a null hypothesis that is actually true in the population. The acceptable rate is conventionally set at $\alpha = 0.05$ .
Type II Error (Beta, $\beta$ ): A "false negative" error. This occurs when the researcher fails to reject a null hypothesis that is actually false in the population. The acceptable rate is conventionally set at $\beta = 0.20$ .
Statistical Power ( $1 - \beta$ ): The probability of correctly rejecting a false null hypothesis, that is, of finding an effect if one truly exists. A power level of at least $0.80$ is generally desired.

Key Determinants of Power

Decision Criterion ( $\alpha$ ): Usually set at $0.05$ .
Sample Size ( $n$ ): As sample size increases, power increases, though this also increases cost and time.
Effect Size: The magnitude of the relationship between variables. It can be estimated as small, medium, or large based on literature.
Desired Power ( $1 - \beta$ ): Typically targeted at $0.80$ .

Note: If three of these elements are fixed, the fourth can be mathematically derived.
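The note above can be illustrated with a rough sketch for comparing two independent group means. It uses a normal (z) approximation rather than the noncentral t distribution, so a t-based calculator such as G*Power will typically report a slightly larger $n$. The function names are illustrative, and the two-tailed power ignores the negligible probability in the opposite tail.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def approx_power(n_per_group, d, alpha=0.05, tails=1):
    """Approximate power, given n per group, effect size d, and alpha."""
    z_crit = Z.inv_cdf(1 - alpha / tails)
    return Z.cdf(d * sqrt(n_per_group / 2) - z_crit)

def approx_n_per_group(d, alpha=0.05, power=0.80, tails=1):
    """Approximate n per group, given effect size d, alpha, and desired power."""
    z_crit = Z.inv_cdf(1 - alpha / tails)
    z_power = Z.inv_cdf(power)
    return ceil(2 * ((z_crit + z_power) / d) ** 2)

# Fix alpha, power, and effect size; solve for sample size.
print(approx_n_per_group(d=0.5))  # 50 per group under the z approximation

# Fix alpha, sample size, and effect size; solve for power.
for n in (20, 50, 100):
    print(n, round(approx_power(n, d=0.5), 3))
```

The loop shows power rising with sample size when the effect size and $\alpha$ are held constant.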

Impact of Power Levels

High Power Implications

Fewer false negatives (Type II errors).
Lower overall errors of inference.

Low Power Implications

HARKing ("Hypothesizing After the Results are Known"): Underpowered studies encourage hypotheses that are constructed after the data are in, which contributes to a body of conflicting evidence in the research literature.
Publication Bias: The literature accumulates many false positives, because low-powered studies may be published only if they happen to find a significant result by chance.

Practical Power Analysis

Software Tools

G*Power: A comprehensive tool for estimating required sample sizes given effect size and alpha.
jamovi: Includes the jpower module for power analysis, though it currently supports mainly t-tests.

Common Errors and Problems

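Post Hoc Power: Power calculated from the observed effect after the results are known is uninformative and can mislead, because it is a direct function of the observed p-value. A sensitivity analysis (solving for the smallest effect size detectable with the sample size actually obtained) is preferred.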
Estimating Effect Size (ES): Planning a study requires the population's expected ES, which is often unknown. Researchers must rely on meta-analyses, previous studies, or educated hunches.

G*Power Calculation Examples

Example 1: Independent Samples t-test

Goal: Compare reaction times between a Cold Shower Group and a Control Group.
Hypothesis: Cold showers lead to faster reaction times (One-tailed).
Settings:
- Test Family: t tests
- Statistical Test: Means: difference between two independent means
- Effect Size ( $d$ ): $0.5$ (Medium)
- Alpha ( $\alpha$ ): $0.05$
- Power ( $1 - \beta$ ): $0.80$
Result: Total sample size required is 102 participants (51 per group).

Example 2: One-way ANOVA

Goal: Compare sustained attention across three video game genres (Action, Puzzle, Simulation).
Settings:
- Test Family: F tests
- Statistical Test: ANOVA: Fixed effects, omnibus, one-way
- Number of groups: $3$
- Effect Size ( $f$ ): $0.25$ (Medium)
- Alpha ( $\alpha$ ): $0.05$
- Power ( $1 - \beta$ ): $0.80$
Result: Total sample size required is 159 participants (53 per group).

Example 3: Paired Samples t-test (Matched Pairs)

Goal: Compare PSQI sleep quality scores before and after a week of deep breathing exercises.
Hypothesis: Lower scores (better sleep) after the intervention (One-tailed).
Settings:
- Test Family: t tests
- Statistical Test: Means: Difference between two dependent means
- Effect Size ( $d_z$ ): $0.5$ (Medium)
- Alpha ( $\alpha$ ): $0.05$
- Power ( $1 - \beta$ ): $0.80$
Result: Minimum of 27 participants required.

Example 4: 2x2 Between-Subjects ANOVA

Goal: Explore effects of Video Content Type (Educational vs. Entertainment) and Device Type (Smartphone vs. Laptop).
Factors: 2 Factors with 2 levels each = 4 groups total.
Degrees of Freedom ( $df$ ):
- Factor A: $2 - 1 = 1$
- Factor B: $2 - 1 = 1$
- Interaction ( $A \times B$ ): $(2 - 1) \times (2 - 1) = 1$
Settings:
- Test Family: F tests
- Statistical Test: ANOVA: Fixed effects, special, main effects and interactions
- Number of groups: $4$
- Numerator $df$ : $1$
- Effect Size ( $f$ ): $0.25$ (Medium)
- Alpha ( $\alpha$ ): $0.05$
- Power ( $1 - \beta$ ): $0.80$
Result: Total sample size required is 128 participants (32 per group).
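The degrees-of-freedom arithmetic in Example 4 generalizes to any two-factor between-subjects design. The helper below is an illustrative sketch (the function name is hypothetical) for finding the "Number of groups" and "Numerator df" values to enter.

```python
def factorial_dfs(levels_a, levels_b):
    """Group count and numerator df for each effect in an A x B between-subjects design."""
    df_a = levels_a - 1
    df_b = levels_b - 1
    return {
        "groups": levels_a * levels_b,
        "df_A": df_a,
        "df_B": df_b,
        "df_AxB": df_a * df_b,
    }

print(factorial_dfs(2, 2))  # {'groups': 4, 'df_A': 1, 'df_B': 1, 'df_AxB': 1}
print(factorial_dfs(3, 2))  # {'groups': 6, 'df_A': 2, 'df_B': 1, 'df_AxB': 2}
```

In designs where the effects have different numerator df (such as the hypothetical 3x2 case), the required sample size can differ by effect, so the analysis would be run for each effect of interest.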