Chapter 6: Comparing two means (Independent Samples)


21 Terms

1

What are we typically comparing in two-sample quantitative problems?

The means of two independent populations or groups.

2

What is a percentile?

For any set of n measurements, the pth percentile is a number such that p% of the measurements fall below it and (100 − p)% of the measurements fall above it.

3

What are the five-number summary components?

Minimum, Q1 (first quartile), Median (Q2), Q3 (third quartile), Maximum. This summary gives an overview of the distribution’s spread and center.

4

What are quartiles, and how are they defined?

Q1 (25th percentile): Median of the lower half of the data

Q2 (50th percentile): The median of the data set

Q3 (75th percentile): Median of the upper half of the data

They divide the data into four equal parts.

5

How do you calculate quartiles when the number of observations is odd vs. even?

Odd: Omit the middle value (median) when calculating Q1 and Q3

Even: Include all values when locating the first and third quartiles

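The median-of-halves rule described in the quartile cards can be sketched in a few lines of Python. This is a minimal illustration, not the only quartile convention in use (software packages often interpolate instead); the function name `five_number_summary` and the sample data are hypothetical, and the sketch assumes at least two observations.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]                      # values below the median position
    # Odd n: skip the middle value itself. Even n: the upper half uses all remaining values.
    upper = data[half + 1:] if n % 2 else data[half:]
    return data[0], median(lower), median(data), median(upper), data[-1]

# Hypothetical data sets, one with an odd and one with an even number of observations
odd_summary = five_number_summary([1, 3, 5, 7, 9, 11, 13])
even_summary = five_number_summary([1, 3, 5, 7, 9, 11])
print(odd_summary)   # (1, 3, 7, 11, 13)
print(even_summary)  # (1, 3, 6.0, 9, 11)
```

With seven values the median (7) is left out of both halves; with six values every observation belongs to one half.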
6

What is the Interquartile Range (IQR) and why is it important?

IQR = Q3 − Q1. It measures the spread of the middle 50% of the data and is a resistant measure, meaning it is not strongly affected by outliers.

7

How are outliers detected using the IQR method?

Lower Fence: Q1 − 1.5 × IQR

Upper Fence: Q3 + 1.5 × IQR

Values outside these fences are considered potential outliers.

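As a rough sketch, the fence rule can be applied as follows. The helper names `iqr_fences` and `potential_outliers`, the data, and the quartile values are all hypothetical; the quartiles are assumed to have been computed already.

```python
def iqr_fences(q1, q3):
    """Return (lower fence, upper fence) from the first and third quartiles."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def potential_outliers(values, q1, q3):
    """List the values that fall outside the fences."""
    low, high = iqr_fences(q1, q3)
    return [v for v in values if v < low or v > high]

# Hypothetical data with Q1 = 4 and Q3 = 8 (medians of the lower and upper halves)
data = [2, 4, 5, 6, 7, 8, 30]
fences = iqr_fences(4, 8)                  # IQR = 4
flagged = potential_outliers(data, 4, 8)
print(fences)   # (-2.0, 14.0)
print(flagged)  # [30]
```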
8

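What is a boxplot and what does it visualize?

A graph of the five-number summary: a box from Q1 to Q3 with a line at the median, and whiskers extending toward the minimum and maximum.

• A noticeably longer whisker on one side suggests skewness in that direction

• Dots outside the whiskers are potential outliers

• Symmetry or skewness can be visually assessed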

9

What are the limitations of boxplots?

They do not show frequency, multimodality, or clustering. Best used with dotplots or histograms for a fuller picture.

10

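How do you assess normality using standard deviation intervals?

• About 68% of data within x̄ ± s

• About 95% within x̄ ± 2s

• About 99.7% (nearly all) within x̄ ± 3s

If the observed percentages are close to these, the data are consistent with an approximately normal distribution.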
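One way to run this check is to count the share of observations inside each interval and compare it with the percentages expected for normal data. The sketch below is illustrative only; `share_within` is a hypothetical helper and the data are made up.

```python
from statistics import mean, stdev

def share_within(values, k):
    """Fraction of observations within k sample standard deviations of the mean."""
    xbar, s = mean(values), stdev(values)
    inside = [v for v in values if abs(v - xbar) <= k * s]
    return len(inside) / len(values)

# Hypothetical, evenly spread data (not bell-shaped)
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
within_1s = share_within(data, 1)
within_2s = share_within(data, 2)
print(round(within_1s, 3), within_2s)
```

For this evenly spread sample only about 56% of the values lie within one standard deviation, noticeably short of the 68% expected for normal data.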

11

What is a normal probability plot (Q-Q plot) and how do you interpret it?

A scatterplot comparing sorted data values with expected normal values. If the points fall close to a straight line, the data are plausibly normal.

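The points on such a plot can be computed with the standard library. This is a sketch under one assumption in particular: the plotting position (i − 0.5)/n is only one of several conventions for the "expected normal values". The function name `normal_scores` and the data are hypothetical.

```python
from statistics import NormalDist

def normal_scores(values):
    """Pair each sorted value with the standard normal quantile expected at its rank."""
    data = sorted(values)
    n = len(data)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Plotting position (i - 0.5) / n is an assumed convention, not the only one.
    return [(x, std_normal.inv_cdf((i - 0.5) / n))
            for i, x in enumerate(data, start=1)]

# Hypothetical sample; each pair is (observed value, expected normal score)
pairs = normal_scores([5.9, 4.1, 5.0])
for observed, expected in pairs:
    print(observed, round(expected, 3))
```

Plotting the pairs and checking how closely they follow a straight line gives the visual assessment described on the card.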
12

What is the simulation-based approach for comparing two means?

Randomly shuffle the group labels, recompute the difference in means, and repeat many times. This simulates the null hypothesis that there is no difference between groups.

13

What are the hypotheses used in a simulation comparison?

H₀: μ₁ = μ₂ (no difference)

Hₐ: μ₁ ≠ μ₂ or μ₁ > μ₂ or μ₁ < μ₂ (depending on the scenario)

14

How is the p-value interpreted in simulation?

It represents the proportion of simulated differences (generated assuming H₀ is true) that are as extreme or more extreme than the observed difference.
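The shuffling procedure and the resulting p-value can be sketched as below. This is a minimal two-sided version under stated assumptions: the function name `permutation_test`, the number of repetitions, the seed, and the data are all hypothetical, and a one-sided alternative would count only differences in one direction.

```python
import random
from statistics import mean

def permutation_test(group1, group2, reps=2000, seed=0):
    """Two-sided simulation-based test for a difference in means."""
    rng = random.Random(seed)          # fixed seed so the sketch is reproducible
    observed = mean(group1) - mean(group2)
    pooled = list(group1) + list(group2)
    n1 = len(group1)
    extreme = 0
    for _ in range(reps):
        rng.shuffle(pooled)            # reassign group labels at random
        diff = mean(pooled[:n1]) - mean(pooled[n1:])
        if abs(diff) >= abs(observed): # as extreme or more extreme, either direction
            extreme += 1
    return observed, extreme / reps

# Hypothetical data: two small, clearly separated groups
observed, p_value = permutation_test([10, 12, 14], [1, 2, 3])
print(observed, p_value)
```

With these six values only 2 of the 20 possible label assignments are as extreme as the observed one, so the simulated p-value should land near 0.10.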

15

What is the 2SD method for confidence intervals?

Interval = observed statistic ± 2 × (SD of the simulated null distribution)

E.g., 0.714 ± 2(0.302) → (0.110, 1.318)
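The arithmetic is easy to verify in code: the margin is 2 × 0.302 = 0.604, so the endpoints are 0.714 − 0.604 and 0.714 + 0.604. The helper name `two_sd_interval` is hypothetical.

```python
def two_sd_interval(statistic, sd_sim):
    """Approximate 95% interval: statistic plus or minus two simulation SDs."""
    margin = 2 * sd_sim
    return statistic - margin, statistic + margin

low, high = two_sd_interval(0.714, 0.302)
print(round(low, 3), round(high, 3))  # 0.11 1.318
```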

16

Comparing two means: Theoretical approach

What are the three validity conditions for two-sample t-procedures?

1. n ≥ 30 in both samples (CLT applies)

2. Populations are approximately normal

3. Robust condition: n ≥ 20 and not strongly skewed

17

When variances are equal, what procedure is used?

Use the pooled t-test with pooled variance and degrees of freedom: df = n₁ + n₂ − 2
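A minimal sketch of the pooled calculation, assuming the usual textbook formulas: the pooled variance weights each sample variance by its degrees of freedom, and the standard error uses that single variance for both groups. The function name `pooled_t` and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(sample1, sample2):
    """Return (t statistic, degrees of freedom) for the pooled two-sample t-test."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    # Pooled variance: sample variances weighted by their degrees of freedom
    sp2 = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / df
    se = sqrt(sp2 * (1 / n1 + 1 / n2))
    t = (mean(sample1) - mean(sample2)) / se
    return t, df

# Hypothetical samples with variances 4 and 1
t_pooled, df_pooled = pooled_t([2, 4, 6], [1, 2, 3])
print(round(t_pooled, 3), df_pooled)  # 1.549 4
```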

18

What is Welch’s t-test and when is it used?

Used when equal variances cannot be assumed. It keeps the two sample variances separate and uses the Satterthwaite approximation for degrees of freedom.
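The Satterthwaite approximation can be sketched as follows, assuming the standard Welch formulas. The function name `welch_t` and the data are hypothetical; note that the resulting degrees of freedom are usually not a whole number.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample1, sample2):
    """Return (t statistic, Satterthwaite degrees of freedom) for Welch's t-test."""
    n1, n2 = len(sample1), len(sample2)
    v1 = variance(sample1) / n1   # squared standard error contributed by group 1
    v2 = variance(sample2) / n2   # squared standard error contributed by group 2
    t = (mean(sample1) - mean(sample2)) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Hypothetical samples with unequal variances (4 and 1)
t_welch, df_welch = welch_t([2, 4, 6], [1, 2, 3])
print(round(t_welch, 3), round(df_welch, 3))  # 1.549 2.941
```

For these samples the approximate degrees of freedom (50/17 ≈ 2.94) fall below the pooled value n₁ + n₂ − 2 = 4, which makes the test more conservative.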

19

What does it mean if 0 is not in the confidence interval for μ₁ − μ₂?

There’s significant evidence of a difference between the two means → Reject H₀.

20

What non-parametric tests can be used instead of t-tests?

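Wilcoxon Rank Sum Test: compares the two groups using the ranks of the pooled observations (sensitive to a shift in location, such as a difference in medians)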

Kolmogorov–Smirnov Test (KS): compares entire distributions
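To make the ranking idea behind the Wilcoxon test concrete, the sketch below computes only the rank-sum statistic for the first group, giving tied values the average of their ranks. It does not compute a p-value; the function name `rank_sum` and the data are hypothetical.

```python
def rank_sum(sample1, sample2):
    """Sum of the pooled-sample ranks belonging to sample1 (ties get average ranks)."""
    pooled = sorted(list(sample1) + list(sample2))
    ranks = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Positions i+1 .. j share one value, so each gets the mean of those ranks
        ranks[pooled[i]] = (i + 1 + j) / 2
        i = j
    return sum(ranks[v] for v in sample1)

# Hypothetical samples: fully separated groups, then groups with tied values
w_separated = rank_sum([1, 2, 3], [4, 5, 6])
w_tied = rank_sum([1, 3, 3], [3, 5])
print(w_separated, w_tied)  # 6.0 7.0
```

A rank sum far from what random labeling would produce suggests that one group tends to have larger values than the other.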

21

Connection Between CI & Hypothesis Testing

knowt flashcard image