9. Inferring Population Means

Probability Distribution

Discrete and Continuous Outcomes
  • Discrete Outcomes: Outcomes that can take specific values, often counts (e.g., number of students).

  • Continuous Outcomes: Outcomes that can take any value within a given range (e.g., height, weight).

Probabilities for Random Variables
  • Discrete Random Variables: Use probability mass functions (PMF).

  • Continuous Random Variables: Use probability density functions (PDF).

Normal Model
  • Defined by mean (µ) and standard deviation (σ).

  • Standard Normal Model: A normal distribution with mean of 0 and standard deviation of 1.

R Functions
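  • To find probabilities from a normal distribution: pnorm(), which returns the area to the left of a value, P(X ≤ x).

  • To find percentiles: qnorm(), which returns the value below which a given proportion of the distribution falls.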
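
As a rough illustration of these two functions, the sketch below uses a made-up normal model (mean 170, standard deviation 10; these numbers are assumptions for the example, not values from the notes). It shows left-tail, right-tail, and between-values probabilities with pnorm(), and a percentile with qnorm().

```r
# Hypothetical normal model: mean 170, standard deviation 10 (illustrative values)
mu <- 170
sigma <- 10

# P(X <= 180): area to the left of 180
p_below <- pnorm(180, mean = mu, sd = sigma)

# P(X > 180): area to the right, requested with lower.tail = FALSE
p_above <- pnorm(180, mean = mu, sd = sigma, lower.tail = FALSE)

# P(160 <= X <= 180): difference of two left-tail areas
p_between <- pnorm(180, mean = mu, sd = sigma) - pnorm(160, mean = mu, sd = sigma)

# 90th percentile: the value with 90% of the distribution below it
x_90 <- qnorm(0.90, mean = mu, sd = sigma)

# Standard normal model: mean = 0 and sd = 1 are the defaults
z_975 <- qnorm(0.975)  # roughly 1.96, the usual 95% critical value
```

Because qnorm() is the inverse of pnorm(), pnorm(x_90, mu, sigma) returns 0.90 again.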

Characteristics of Binomial Models
  • Defined by number of trials (n) and probability of success (p).

  • Model Symbols: P(X = k) indicates the probability of k successes in n trials.

  • In R, dbinom() gives the exact probability P(X = k), and pbinom() gives the cumulative probability P(X ≤ k).
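
A short sketch of binomial probabilities in R, assuming a hypothetical model with n = 10 trials and p = 0.3 (values chosen only for illustration). Note that pbinom() is cumulative, P(X ≤ k); base R's dbinom() is the function that returns the single-value probability P(X = k).

```r
n <- 10   # number of trials (hypothetical)
p <- 0.3  # probability of success on each trial (hypothetical)

# P(X = 3): probability of exactly 3 successes
p_exact <- dbinom(3, size = n, prob = p)

# P(X <= 3): probability of at most 3 successes
p_at_most <- pbinom(3, size = n, prob = p)

# P(X >= 4): complement of P(X <= 3), requested with lower.tail = FALSE
p_at_least <- pbinom(3, size = n, prob = p, lower.tail = FALSE)
```

The cumulative value is just the sum of the exact probabilities for k = 0, 1, 2, 3, which is a quick way to check which function is needed.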

Mean and Standard Deviation of Binomial Distribution
  • Mean: µ = np

  • Standard Deviation: σ = √(np(1 - p))
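
The two formulas can be checked numerically. The sketch below assumes a hypothetical binomial model (n = 50, p = 0.2) and compares µ = np and σ = √(np(1 - p)) with the mean and standard deviation computed directly from the probability mass function.

```r
n <- 50   # number of trials (hypothetical)
p <- 0.2  # probability of success (hypothetical)

# Shortcut formulas from the notes
mu <- n * p
sigma <- sqrt(n * p * (1 - p))

# Direct computation from the PMF over all possible outcomes 0..n
k <- 0:n
pmf <- dbinom(k, size = n, prob = p)
mu_check <- sum(k * pmf)                          # expected value
sigma_check <- sqrt(sum((k - mu_check)^2 * pmf))  # square root of the variance
```

Both routes should agree up to floating-point rounding.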

Population Parameter vs. Sample Statistics
  • Population Parameter: A value describing a characteristic of a population (e.g., population mean µ).

  • Sample Statistic: A value describing a characteristic of a sample (e.g., sample mean x̄).

  • Common symbols: Population mean (µ), Sample mean (x̄), Population proportion (p), Sample proportion (p̂).

Central Limit Theorem (CLT) for Sample Proportions
  • Conditions: Random sampling, independent observations, large sample size (np ≥ 10 and n(1 - p) ≥ 10).

  • Distribution: As sample size increases, the distribution of sample proportions approaches a normal distribution with mean p and standard deviation √(p(1 - p)/n).
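
A small simulation can make the theorem concrete. This is only a sketch under assumed values (true proportion p = 0.4, sample size n = 100, 5000 simulated samples): it draws many sample proportions and compares their center and spread with the normal model's mean p and standard deviation √(p(1 - p)/n).

```r
set.seed(1)   # fixed seed so the simulation is reproducible
p <- 0.4      # true population proportion (hypothetical)
n <- 100      # size of each sample (hypothetical)
reps <- 5000  # number of simulated samples

# Check the large-sample condition before relying on the normal model
condition_ok <- (n * p >= 10) && (n * (1 - p) >= 10)

# Each rbinom() draw is a count of successes in one sample; divide by n for p-hat
p_hat <- rbinom(reps, size = n, prob = p) / n

sim_mean <- mean(p_hat)                # should be close to p
sim_sd <- sd(p_hat)                    # should be close to the theoretical value
theory_sd <- sqrt(p * (1 - p) / n)
```

Calling hist(p_hat) on the result should show a roughly bell-shaped histogram centered near p.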

Confidence Interval of Population Proportions
  • Interpretation: A range of plausible values for the true population proportion. The confidence level (e.g., 95%) is the long-run share of such intervals, across repeated samples, that capture the true proportion.
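
The notes do not state a formula, so the sketch below assumes one common textbook form, p̂ ± z* × √(p̂(1 - p̂)/n), and applies it to made-up data (56 successes in 100 trials).

```r
x <- 56             # number of successes (hypothetical data)
n <- 100            # sample size (hypothetical data)
conf_level <- 0.95

p_hat <- x / n                             # sample proportion
z_star <- qnorm(1 - (1 - conf_level) / 2)  # critical value, about 1.96 for 95%
se <- sqrt(p_hat * (1 - p_hat) / n)        # estimated standard error

ci <- c(lower = p_hat - z_star * se, upper = p_hat + z_star * se)
```

With these numbers the interval runs from about 0.463 to about 0.657. Raising the confidence level makes z_star larger and the interval wider.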

Hypothesis Testing for Population Proportions
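  • Null Hypothesis (H0): Assumes no effect or no difference (e.g., the population proportion equals a stated value).

  • Alternative Hypothesis (H1): Assumes there is an effect or a difference (e.g., the population proportion differs from, exceeds, or falls below that value).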

  • Types: One-sided and two-sided tests.

  • Significance Level (α): The probability of rejecting the null hypothesis when it is true.

  • Interpreting Test Statistics: Compare test statistic to critical value or use p-value approach.

R Functions for Confidence Intervals and Hypothesis Testing
  • For calculating confidence intervals: prop.test() for proportions.

  • For hypothesis testing, also use: prop.test().
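
A minimal sketch of a single prop.test() call that gives both a test and an interval, using made-up data (56 successes in 100 trials) and a hypothesized proportion of 0.5. The argument names x, n, p, alternative, and conf.level are those of base R's prop.test(); the data values are assumptions.

```r
# H0: p = 0.5 versus H1: p != 0.5, with a 95% confidence interval
res <- prop.test(x = 56, n = 100, p = 0.5,
                 alternative = "two.sided", conf.level = 0.95)

res$estimate  # sample proportion p-hat
res$p.value   # p-value for the test of H0
res$conf.int  # confidence interval for the population proportion

# A one-sided test (H1: p > 0.5) only changes the alternative argument
res_greater <- prop.test(x = 56, n = 100, p = 0.5, alternative = "greater")
```

By default prop.test() applies a continuity correction and builds its interval from the score test, so its interval usually differs slightly from a hand-computed p̂ ± z* × SE interval.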

Type I Error and Type II Error
  • Type I Error: Rejecting H0 when it is true.

  • Type II Error: Failing to reject H0 when H1 is true.
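
Both error types can be estimated by simulation. The sketch below is an illustration under assumed values (n = 100, H0: p = 0.5, a true proportion of 0.6 for the Type II case, α = 0.05) and uses a hand-written two-sided z-test; rejects_h0() is a hypothetical helper defined only for this example.

```r
set.seed(2024)  # fixed seed so the simulation is reproducible
alpha <- 0.05
n <- 100        # sample size of each simulated study (hypothetical)
p0 <- 0.5       # proportion stated in H0
reps <- 20000   # number of simulated studies

# Hypothetical helper: two-sided z-test for one proportion; TRUE means "reject H0"
rejects_h0 <- function(x) {
  z <- (x / n - p0) / sqrt(p0 * (1 - p0) / n)
  abs(z) > qnorm(1 - alpha / 2)
}

# Type I error: H0 is true (p = 0.5) but the test rejects it
type1_rate <- mean(rejects_h0(rbinom(reps, size = n, prob = p0)))

# Type II error: H0 is false (true p = 0.6) but the test fails to reject it
type2_rate <- mean(!rejects_h0(rbinom(reps, size = n, prob = 0.6)))
```

The simulated Type I rate should land near α (not exactly on it, because counts are discrete). The Type II rate depends on how far the true proportion is from p0 and on the sample size.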

Comparing Proportions from Two Populations
  • Use a two-sample test for proportions (e.g., prop.test() with two counts) to test whether two population proportions differ.
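
One way to run such a comparison in R is to pass two success counts and two sample sizes to prop.test(). The counts below are invented for illustration.

```r
# Hypothetical data: 55 of 100 successes in group 1, 35 of 100 in group 2
successes <- c(55, 35)
trials <- c(100, 100)

# H0: the two population proportions are equal
res <- prop.test(x = successes, n = trials)

res$estimate  # the two sample proportions
res$p.value   # p-value for H0: p1 = p2
res$conf.int  # confidence interval for the difference p1 - p2
```

If the interval for p1 - p2 excludes 0, that points the same way as a small p-value: the data are hard to reconcile with equal proportions.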

Central Limit Theorem for Sample Means
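  • Conditions: Random sampling, independence, large sample size (n ≥ 30 as a rule of thumb).

  • Distribution: The distribution of sample means approaches a normal distribution with mean µ and standard deviation σ/√n as sample size increases.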
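
The sketch below illustrates this with a deliberately skewed population. The exponential population (mean 1, standard deviation 1), the sample size n = 40, and the number of replications are all assumptions chosen for the example.

```r
set.seed(123)  # fixed seed so the simulation is reproducible
n <- 40        # size of each sample (hypothetical)
reps <- 5000   # number of simulated samples

# Population: exponential with rate 1, so mu = 1 and sigma = 1 (right-skewed)
sample_means <- replicate(reps, mean(rexp(n, rate = 1)))

sim_mean <- mean(sample_means)  # should be close to mu = 1
sim_sd <- sd(sample_means)      # should be close to sigma / sqrt(n)
theory_sd <- 1 / sqrt(n)
```

Even though individual observations are strongly skewed, hist(sample_means) should look close to a bell curve.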

t-Statistic and t-Test
  • Used for hypothesis testing for means when the population standard deviation is unknown and is estimated by the sample standard deviation.

  • Shapiro-Wilk Test: Assesses normality (H0: data are normally distributed); shapiro.test() in R.

  • Variance Test: Tests for equality of two variances (H0: variances are equal); var.test() in R.
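
Both checks are available in base R as shapiro.test() and var.test(). The sketch below runs them on simulated data; the group names, sizes, means, and standard deviations are invented for illustration.

```r
set.seed(7)  # fixed seed so the simulated data are reproducible
group_a <- rnorm(25, mean = 50, sd = 5)  # hypothetical measurements, group A
group_b <- rnorm(25, mean = 53, sd = 5)  # hypothetical measurements, group B

# Shapiro-Wilk: H0 is that the data come from a normal distribution
sw <- shapiro.test(group_a)

# F test: H0 is that the two population variances are equal
vt <- var.test(group_a, group_b)

sw$p.value  # a small value is evidence against normality
vt$p.value  # a small value is evidence against equal variances
```

In both tests a large p-value means only that H0 was not rejected. It does not prove normality or equal variances.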

Independent vs Paired Samples
  • Independent Samples: Observations in one sample are unrelated to observations in the other (e.g., two separate groups).

  • Paired Samples: Two measurements taken on the same subjects or on matched pairs (e.g., pre-test and post-test).

R Functions for t-Test
  • For one-sample t-test: t.test(x = , mu = ).

  • For two-sample t-test: t.test(x = , y = ); add paired = TRUE for paired samples.
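
A sketch of the main t.test() variants on simulated data. All data values and variable names here are hypothetical; the arguments mu, var.equal, and paired are part of base R's t.test().

```r
set.seed(10)  # fixed seed so the simulated data are reproducible

# One-sample t-test: H0 is that the population mean equals 70
scores <- rnorm(30, mean = 72, sd = 8)
one_sample <- t.test(x = scores, mu = 70)

# Two independent samples: the default is Welch's test (variances not assumed equal)
group_a <- rnorm(25, mean = 50, sd = 5)
group_b <- rnorm(30, mean = 53, sd = 5)
welch <- t.test(x = group_a, y = group_b)

# Pooled-variance version, reasonable only if equal variances are plausible
pooled <- t.test(x = group_a, y = group_b, var.equal = TRUE)

# Paired samples: the same 20 subjects measured before and after
pre <- rnorm(20, mean = 60, sd = 10)
post <- pre + rnorm(20, mean = 3, sd = 4)
paired <- t.test(x = post, y = pre, paired = TRUE)
```

A paired t-test is equivalent to a one-sample t-test on the differences, so t.test(post - pre, mu = 0) gives the same t statistic.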

Interpretation of t-Test Results
  • Compare the p-value to the significance level α: reject H0 if the p-value is below α; otherwise, fail to reject H0.
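
The decision rule can be sketched as follows, assuming α = 0.05 and a small made-up data set. The example also shows the equivalent critical-value approach for a two-sided test.

```r
alpha <- 0.05
x <- c(5.1, 4.9, 5.6, 5.8, 6.0, 5.4, 5.3, 5.7)  # made-up measurements

# H0: the population mean equals 5 (two-sided alternative)
res <- t.test(x, mu = 5)

# p-value approach: reject H0 when the p-value is below alpha
reject_by_p <- res$p.value < alpha

# Critical-value approach: reject H0 when |t| exceeds the t critical value
t_stat <- unname(res$statistic)
df <- unname(res$parameter)
t_crit <- qt(1 - alpha / 2, df = df)
reject_by_crit <- abs(t_stat) > t_crit

# The t statistic is (sample mean - hypothesized mean) / standard error
t_by_hand <- (mean(x) - 5) / (sd(x) / sqrt(length(x)))
```

The two approaches always lead to the same decision. For these data, both reject H0 at the 5% level.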