Hypothesis Testing

Inference from Sample to Populations

  • The lecture focuses on statistical inference, which involves drawing conclusions about a population based on data from a sample.

Hypothesis Testing

  • Definition: Hypothesis testing is a method used to make decisions or inferences about a population based on sample data.
  • Null Hypothesis: A statement of no effect or no difference. It's the hypothesis that researchers try to disprove.
  • Alternative Hypothesis: A statement that contradicts the null hypothesis, suggesting there is a difference or effect.
  • Type-I Error: Rejecting the null hypothesis when it is actually true (false positive). Its probability is denoted by $\alpha$.
  • Type-II Error: Failing to reject the null hypothesis when it is actually false (false negative). Its probability is denoted by $\beta$.
  • P-values: The probability of obtaining results as extreme as, or more extreme than, the observed results, assuming the null hypothesis is true. A small p-value suggests evidence against the null hypothesis.
  • Confidence Intervals: A range of values within which the true population parameter is likely to fall.
  • Single Mean Test: A statistical test used to determine whether a sample mean is significantly different from a known or hypothesized population mean.
  • Single Proportion Test: A statistical test used to determine whether a sample proportion is significantly different from a known or hypothesized population proportion.

STATA Demonstration

  • Demonstration of STATA commands for data analysis, including:
    • Upload “HospAdmNeu.dta” data in STATA.
    • summarize Ordinary1213 (Summary measures of ordinary admission).
    • summarize Ordinary1213, detail (Detailed summary of ordinary admission).
    • graph box Ordinary1213 (Draw a boxplot to check for outliers; STATA commands are case-sensitive and written in lower case).
    • scatter Ordinary1213 Daycase1213 (Scatter plot between two variables).
    • hist Ordinary1213 (Draw a histogram for ordinary admission).
    • hist Ordinary1213, bin(20) (Change the bin for better representation).
    • hist Ordinary1213, bin(20) normal (Check normality with a normal curve).
    • Homework: Repeat all commands for day case hospital admissions (Daycase1213).

Statistical Inference: The Big Picture

  • Real World: Data is collected from the real world.
  • Theoretical World: Scientific and statistical models are used to represent the data.
  • Sample: A subset of the population from which data is collected.
  • Population: The entire group of individuals or objects of interest.
  • Conclusion: Inferences are made from the sample to the population.

Populations and Samples

  • A population is a collection of objects, people, or events of interest.
  • Data collection on the entire population is often impractical.
  • A sample is a subset of the population used to infer information about the population.
  • Samples are not of interest in their own right but for what they reveal about the population.

Example: Babies Birth Weight and Mum’s Smoking

  • Study in Newham, London, investigating causes of low birth weight in babies born in 2016.
  • Data collected from 1000 babies in Newham hospital database.
  • Dataset includes variables like baby ID, birth weight (bwt), gestational age (gest), mother’s age (mat_age), and number of cigarettes smoked per week (cigs).

Statistical Inference

  • Summary statistics (mean, percentiles, SD) are calculated from the sample.
  • These statistics are used to infer characteristics about the total population.
  • The goal is to understand what sample statistics tell us about the theoretical distributions of the population.

Random Samples

  • Research Question: Average weight of babies and reasons for underweight babies.
  • Theoretical Population: Defined before data collection to understand the generalizability of inferences.
  • Examples: all babies (current and future), babies born in the UK between 1990 and 2000.
  • Average Baby Weight: Around 7 pounds (3.2 kg) for females and 7 pounds 5 ounces (3.3 kg) for males.
  • Reasons for Underweight Babies: Premature birth, Intrauterine Growth Restriction (IUGR), infections during pregnancy, inadequate weight gain, smoking, alcohol or drug use, and maternal age.

Random Samples (contd.)

  • The sample is a subset of the population and needs to be representative.
  • Random Sampling: Each individual in the population has an equal chance of being included, and the inclusion of one individual does not affect the inclusion of another.
  • Opportunistic Sampling: Also known as convenience sampling, involves selecting participants based on availability (e.g., recruiting participants from a local support group for a rare neurological disorder like Stiff Person Syndrome).

Stratified Random Sampling

  • Example: Global Adult Tobacco Survey (GATS) in Bangladesh.
  • Overall prevalence of tobacco smoking:
    • 2009: 23.00% (95% CI 22.98 to 23.00)
    • 2017: 16.44% (95% CI 16.43 to 16.45)
  • Methodology: Two-stage stratified sampling.
    • First stage: Eight administrative divisions were created, stratified by urban and rural Enumeration Areas (EAs).
    • Second stage: 30 households were systematically selected from each sampled PSU (EA).
    • One participant was randomly picked from all eligible men and women in a participating household.
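
The two-stage idea can be sketched in a few lines of Python (not the STATA workflow of this lecture). Everything in the sampling frame below is made up for illustration: the EA identifiers, the household counts, the number of EAs drawn per stratum and the household sizes are assumptions, and the first stage uses simple random sampling, which may differ from the actual GATS design.

```python
import random

random.seed(5)  # fixed seed so the sketch is reproducible

# Hypothetical sampling frame: EA id -> number of households (all values assumed)
frame = {
    "urban": {f"U{i}": random.randint(80, 150) for i in range(20)},
    "rural": {f"R{i}": random.randint(80, 150) for i in range(30)},
}
EAS_PER_STRATUM = 4       # assumed, not taken from the survey
HOUSEHOLDS_PER_EA = 30    # as in the second stage described above


def systematic_sample(n_units, k):
    """Pick k of n_units by systematic sampling: random start, fixed interval."""
    step = n_units / k
    start = random.random() * step
    return [int(start + j * step) for j in range(k)]


sample = []
for stratum, eas in frame.items():
    # Stage 1: simple random sample of EAs within each stratum
    for ea in random.sample(sorted(eas), EAS_PER_STRATUM):
        # Stage 2: systematic selection of households within the EA
        for hh in systematic_sample(eas[ea], HOUSEHOLDS_PER_EA):
            # One eligible adult picked at random (household size is assumed)
            person = random.randrange(random.randint(1, 5))
            sample.append((stratum, ea, hh, person))

print(len(sample))  # 2 strata x 4 EAs x 30 households
```

Stratifying first guarantees that both urban and rural areas appear in the sample, which a single simple random sample of households would not.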

Statistical Inference: From Sample to Populations

  • Assessing the accuracy of sample statistics in estimating population parameters.
  • Repeatedly choosing samples from the same population results in different values for a statistic (e.g., the mean).
  • Uncertainty associated with the estimate needs to be assessed.

Sampling Variability and Standard Errors

  • The standard deviation of the sampling distribution of the mean ($s / \sqrt{n}$) measures the typical error between the sample mean and the population mean.
  • $s / \sqrt{n}$ quantifies the accuracy of the sample mean as an estimate of the population mean and is known as the standard error (SE) of the mean.
  • Since the theoretical standard deviation (s) is unknown, the sample standard deviation (SD) is used in its place.
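
A small simulation can make the standard error concrete. The Python sketch below (not STATA) assumes a hypothetical Normal population with mean 3300 and SD 500, draws many samples of size 100, and compares the spread of the resulting sample means with the formula. The population values, sample size and number of repeats are all illustrative assumptions.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

POP_MEAN, POP_SD = 3300.0, 500.0  # hypothetical population values (assumed)
N = 100                           # size of each sample
REPEATS = 2000                    # number of repeated samples


def sample_mean(n):
    """Mean of one random sample of size n from the assumed Normal population."""
    return statistics.fmean(random.gauss(POP_MEAN, POP_SD) for _ in range(n))


sample_means = [sample_mean(N) for _ in range(REPEATS)]

empirical_se = statistics.stdev(sample_means)  # spread of the sample means
theoretical_se = POP_SD / N ** 0.5             # SD / sqrt(n)
print(round(empirical_se, 1), theoretical_se)
```

The two numbers should be close: the standard error is the standard deviation of the sample mean across repeated samples.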

Standard Deviation (SD) vs Standard Error (SE)

  • Standard Deviation (SD): Measures how a typical observation in the sample differs from the sample mean.
  • Standard Error (SE): Quantifies the typical error between the mean measured in a sample and the theoretical mean in the population.
  • SD measures variability in the population or sample.
  • SE ($= SD / \sqrt{n}$) measures variability in the sample means.

Importance of Normal Distribution in Medical Research

  • Central Tendency and Variability: Helps in understanding the mean and standard deviation of medical data.
  • Statistical Inference: Many tests and confidence intervals assume normality (e.g., t-tests and ANOVA).
  • Predictive Modeling: Used in risk assessment and predictive modeling.
  • Quality Control and Standardization: Used to monitor and maintain consistency of medical tests and procedures.

Sampling Distributions

  • The frequency distribution of the sample means is called the sampling distribution of the mean.
  • If the population distribution is Normal, the distribution of the sample mean over repeated samples is also Normal.
  • The variation in the sample means depends on the variance of the population ($s^2$) and the sample size n.
  • For large samples (n>30), the distribution of the sample mean is approximately normal, regardless of the population distribution.
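
The last point can be illustrated with a rough simulation. This Python sketch (not STATA) assumes a strongly right-skewed population, an exponential distribution with mean 1, and checks that means of samples of size 50 are far more symmetric than the raw observations. The choice of population, the sample size and the skewness helper are illustrative assumptions.

```python
import random
import statistics

random.seed(2)  # fixed seed so the sketch is reproducible


def skewness(values):
    """Sample skewness: mean cubed z-score (0 for a symmetric distribution)."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean(((v - m) / sd) ** 3 for v in values)


# Hypothetical right-skewed population: exponential with mean 1 (assumed)
population_draws = [random.expovariate(1.0) for _ in range(20000)]

# Means of 5000 repeated samples, each of size n = 50
means = [
    statistics.fmean(random.expovariate(1.0) for _ in range(50))
    for _ in range(5000)
]

print(round(skewness(population_draws), 2))  # clearly skewed (theory: 2)
print(round(skewness(means), 2))             # much closer to 0 (theory: about 0.28)
```

The sample means are centred on the population mean and nearly symmetric even though the individual observations are not.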

Hypothesis Testing

  • Null and alternative hypotheses depend on the type of investigation.
    • To see if there is a difference between two procedures:
      • Null Hypothesis: No difference.
      • Alternative Hypothesis: There is a difference.
    • To find out if a bold claim is true:
      • Null Hypothesis: There is no difference.
      • Alternative Hypothesis: The claim is true (drug A is better/worse than drug B).

Task-1: State Null Hypothesis

  • In a criminal court, the accused is assumed innocent unless proven guilty.
  • Null Hypothesis: The accused is innocent.

Assumptions for Hypothesis Testing

  • My sample(s):
    1. Is representative
    2. Is independent
    3. Has homogeneous variance
    4. Is normal
  • Assumptions 1 and 2 are usually considered automatically met.
  • Assumptions 3 and 4 need to be tested using appropriate tools & techniques.

Types of Data

  • Quantitative:
    • Continuous (e.g., blood pressure, age).
    • Discrete (e.g., number of children, number of cigarettes per day).
  • Categorical:
    • Ordinal (ordered categories, e.g., grade of breast cancer, disease severity).
    • Nominal (unordered categories, e.g., sex, ethnicity).

Statistical Tests for Continuous Data

  1. If sample size >= 30 and assumptions are met, use normal distribution.
  2. If sample size < 30 and assumptions are met, use t-distribution.
  3. If assumptions are not met, transform variables and repeat steps 1 or 2.
  4. If the assumptions are still not met after transformation, use non-parametric tests.
  • Parametric tests are generally more powerful.

Appropriate Tests for Continuous Data

  • One-Sample:
    • t-test
    • Sign test
  • Paired (2 groups):
    • Paired t-test
    • Wilcoxon signed rank test
    • Sign test
  • Independent (2 groups):
    • Unpaired t-test
    • Wilcoxon rank sum test
  • Independent (>2 groups):
    • One-way ANOVA
    • Kruskal-Wallis ANOVA

Categorical Data

  • Categorical covariate data are often called factors
  • Categorical data that take on only two distinct values are said to be dichotomous or binary
  • Categorical data are often coded using numerical values (e.g. 0 = No, 1 = Yes); statistical packages usually treat numeric data as quantitative unless you explicitly declare them to be categorical
  • Limiting factor for any continuous observation is the accuracy of the measurement instrument

Appropriate Tests for Categorical Data

  • 1 group:
    • z test for a proportion
    • Sign test
  • Paired (2 categories):
    • McNemar's test
  • Independent (2 groups):
    • Chi-squared test
    • Fisher's exact test
  • Independent (>2 groups):
    • Chi-squared test
    • Chi-squared trend test
  • >2 Categories:
    • Chi-squared test

Type I and Type II Errors

Decision           | H0 is true         | H0 is false
Do not reject H0   | Correct decision   | Type II error (β)
Reject H0          | Type I error (α)   | Correct decision (1-β)
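
The two error rates can be estimated by simulation. The Python sketch below (not STATA) runs a two-sided z-test many times, first with the null hypothesis true and then with it false. The null mean of 0, the known SD of 1, the sample size of 40 and the alternative mean of 0.5 are all hypothetical choices made for illustration.

```python
import random
import statistics

random.seed(3)  # fixed seed so the sketch is reproducible

ALPHA = 0.05
Z_CRIT = statistics.NormalDist().inv_cdf(1 - ALPHA / 2)  # about 1.96

MU0, SIGMA, N = 0.0, 1.0, 40  # hypothetical null mean, known SD, sample size
TRIALS = 4000


def rejects_h0(true_mean):
    """Two-sided z-test of H0: mean = MU0 on one simulated sample."""
    xs = [random.gauss(true_mean, SIGMA) for _ in range(N)]
    z = (statistics.fmean(xs) - MU0) / (SIGMA / N ** 0.5)
    return abs(z) > Z_CRIT


# H0 true: every rejection is a Type I error
type1_rate = sum(rejects_h0(MU0) for _ in range(TRIALS)) / TRIALS
# H0 false (true mean 0.5, assumed): every non-rejection is a Type II error
type2_rate = sum(not rejects_h0(0.5) for _ in range(TRIALS)) / TRIALS

print(type1_rate, type2_rate)
```

The Type I error rate should sit near the chosen α of 0.05, while the Type II error rate depends on how far the true mean is from the null value and on the sample size.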

P-value and Confidence Interval (CI)

  • P-value: Measures the strength of evidence against the null hypothesis.
    • P > 0.10: No evidence against null hypothesis.
    • 0.05 < P < 0.10: Weak evidence against null hypothesis.
    • 0.01 < P < 0.05: Moderate evidence against null hypothesis.
    • 0.001 < P < 0.01: Strong evidence against null hypothesis.
    • P < 0.001: Very strong evidence against null hypothesis.
  • Confidence Interval (CI): Range of values within which the true population value is likely to be found.
    • If the 95% CI for the difference in mean scores excludes ‘0’, there is evidence against the null hypothesis of no difference (equivalent to P < 0.05).
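
As a minimal Python sketch (not STATA), the evidence scale above can be written as a lookup applied to a two-sided p-value from a z statistic. The function names are hypothetical. Because the scale uses strict inequalities, assigning a p-value that falls exactly on a boundary to the weaker category is an assumption made here.

```python
from statistics import NormalDist


def two_sided_p(z):
    """Two-sided p-value for a z statistic under the standard Normal."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


def evidence_label(p):
    """Map a p-value to the descriptive evidence scale listed above."""
    if p < 0.001:
        return "very strong"
    if p < 0.01:
        return "strong"
    if p < 0.05:
        return "moderate"
    if p < 0.10:
        return "weak"
    return "none"


for z in (1.0, 1.8, 2.2, 3.0, 4.0):
    p = two_sided_p(z)
    print(z, round(p, 4), evidence_label(p))
```

These labels are a guide to interpretation rather than hard decision rules.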

Confidence Interval for a Single Population Mean

  • A range of likely values for the parameter.
  • To construct a confidence interval for the population mean, make use of the following pieces of information:
    • Sample mean
    • Standard Error of the mean
  • 95% of sample means lie within 1.96 SE above or below the population mean

Confidence Interval (contd.)

  • In practice, 1.96 is often approximated by 2.
  • Interpretation: if the study were repeated many times, about 95% of the intervals constructed this way would contain the unknown but true value of the population mean.
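
One way to read the 95% figure is as a long-run coverage rate, which a simulation can check. The Python sketch below (not STATA) assumes a hypothetical Normal population with mean 3300 and SD 500, builds a 95% interval from each of 2000 samples of size 200, and counts how often the interval contains the true mean. All of these numbers are illustrative assumptions.

```python
import random
import statistics

random.seed(4)  # fixed seed so the sketch is reproducible

TRUE_MEAN, TRUE_SD = 3300.0, 500.0  # hypothetical population (assumed)
N, REPEATS = 200, 2000

covered = 0
for _ in range(REPEATS):
    xs = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    mean = statistics.fmean(xs)
    se = statistics.stdev(xs) / N ** 0.5          # sample SD / sqrt(n)
    lower, upper = mean - 1.96 * se, mean + 1.96 * se
    covered += lower <= TRUE_MEAN <= upper        # True counts as 1

coverage = covered / REPEATS
print(coverage)
```

The printed proportion should be close to 0.95. Any single interval either contains the true mean or does not.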

Example: Finding Single Mean and 95% CI

  • Baby weight study with 1000 babies in Newham hospital.
  • Random sample of 1000 babies chosen, 997 considered after deleting unusual weights.
  • Mean weight ($\bar{x}$) = 3305 g; SD = 505 g.
  • SE (mean) = $505 / \sqrt{997} = 16$ g.
  • 95% CI for mean = $\bar{x} - z(5\%) \times SE(\bar{x})$ to $\bar{x} + z(5\%) \times SE(\bar{x})$
    • = 3305 - 2 × 16 to 3305 + 2 × 16 (taking z(5%) = 1.96 ≈ 2)
    • = 3273 to 3337 g
  • We are 95% confident that the true mean birth weight is between 3273 and 3337 g.
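
The arithmetic of this example can be reproduced with a short Python sketch (not STATA). The helper name `mean_ci` is hypothetical. It uses the Normal approximation with the sample SD standing in for the population SD, as in the example.

```python
import math


def mean_ci(mean, sd, n, z=1.96):
    """Return (SE, lower, upper) for a Normal-approximation CI of a mean."""
    se = sd / math.sqrt(n)
    return se, mean - z * se, mean + z * se


# Birth-weight example: mean 3305, SD 505, n = 997 (weights in grams)
se, lower, upper = mean_ci(mean=3305, sd=505, n=997)
print(round(se, 1), round(lower), round(upper))

# Same calculation with the rounded multiplier z = 2 used in the worked example
_, lower2, upper2 = mean_ci(mean=3305, sd=505, n=997, z=2)
print(round(lower2), round(upper2))
```

With z = 2 the interval matches the worked example (3273 to 3337). With z = 1.96 it is about one gram narrower at each end, which makes no practical difference here.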

Single Proportion

  • If d of the n individuals in a sample experience some event, the sample proportion is:
    $p = \frac{d}{n}$

Example (vaccination):

  • In a trial of a new vaccine, 20 out of 1000 children vaccinated showed signs of adverse reaction:

$p = \frac{20}{1000} = 0.020 = 2.0\%$, so parents can be advised that the vaccine is associated with an estimated 2% risk of adverse reaction.

Single Proportion and its 95% CI – Vaccination Example

  • Step 1: Calculate proportion: $p = \frac{20}{1000} = 0.020$
  • Step 2: Calculate standard error:
    $se(p) = \sqrt{\frac{0.02(1-0.02)}{1000}} = 0.004$
  • Step 3: Calculate 95% confidence interval:
    • $0.02 - 1.96 \times 0.004$ to $0.02 + 1.96 \times 0.004$
    • = 0.012 to 0.028 = 1.2% to 2.8%
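
The three steps can be wrapped in a small Python helper (not STATA). The function name `proportion_ci` is hypothetical. The interval is the Normal-approximation (Wald) interval used in the steps above. The worked example rounds the standard error to 0.004 before forming the interval, so its bounds can differ from the unrounded ones in the third decimal place.

```python
import math


def proportion_ci(d, n, z=1.96):
    """Return (p, SE, lower, upper) using the Normal (Wald) approximation."""
    p = d / n
    se = math.sqrt(p * (1 - p) / n)
    return p, se, p - z * se, p + z * se


# Vaccination example: 20 adverse reactions among 1000 vaccinated children
p, se, lower, upper = proportion_ci(d=20, n=1000)
print(f"p = {p:.3f}, SE = {se:.4f}, 95% CI = {lower:.3f} to {upper:.3f}")
```

The same function can be applied to any count d out of n, provided the sample is large enough for the Normal approximation to be reasonable.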

Practice Example: Single Proportion and its 95% CI

  • Smoking habits survey in Birmingham, UK, among 1000 teenagers aged 15-16 in 2001.
  • 123 reported being current smokers.
  • Find the proportion of teenagers who smoked and the 95% confidence interval.

Practice Example: Single Proportion and its 95% CI (Birmingham teenage smoking, solution of class exercise)

  • Step 1: Calculate the proportion: $p = \frac{123}{1000} = 0.123$
  • Step 2: Calculate the standard error of the proportion: $se(p) = \sqrt{\frac{0.123(1-0.123)}{1000}} = 0.0104$
  • Step 3: Calculate the 95% CI using $p - 1.96 \times se(p)$ to $p + 1.96 \times se(p)$
  • This gives $0.123 - 1.96 \times 0.0104$ to $0.123 + 1.96 \times 0.0104$ = 0.103 to 0.143 = 10.3% to 14.3%

Homework

  • Upload babies data from Moodle using STATA.
  • Using STATA commands, work out the mean and SD of the babies' weights.
  • Using STATA commands, work out the 95% CI for mean baby weight and comment on your findings.

Recommended Reading

  1. Practical Statistics for Medical Research by Douglas Altman: Chapter 10, pages 232-234.
  2. Medical Statistics by B. Kirkwood & J. Sterne: Chapter 6.
  3. Statistics Notes: The normal distribution. BMJ 1995; 310. doi: https://doi.org/10.1136/bmj.310.6975.298
  4. Statistics Notes: Standard deviations and standard errors. BMJ 2005 Oct 15; 331(7521): 903. doi: 10.1136/bmj.331.7521.903
  5. Comparison between t and normal distribution: separate file uploaded in Moodle (3 slides only).