Lecture Notes: Hypothesis Testing, Data Cleaning, and Visualization

Vocabulary flashcards covering key terms from the lecture notes on hypothesis testing, data prep, and plotting.

21 Terms

1

p-value

The probability of observing a test statistic at least as extreme as the one actually observed, assuming the null hypothesis is true; a measure of the strength of evidence against H0, not a hard cutoff.

2

null hypothesis (H0)

The default assumption of no effect or difference (e.g., μA = μB).

3

alternative hypothesis (H1)

The claim that there is an effect or difference (e.g., μA ≠ μB).

4

randomization test (permutation test)

A hypothesis test that builds its null distribution by repeatedly shuffling treatment labels and recalculating the test statistic.
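
The shuffling procedure can be sketched in base R as follows. This is a minimal illustration, not the lecture's own code: the data values, the variable names (time, group, diff_means), and the number of shuffles are all assumptions made for the example.

```r
# Hypothetical data: times for two treatments (made-up numbers).
set.seed(1)
time  <- c(12.1, 10.4, 11.8, 13.0, 9.7, 14.2, 15.1, 13.9, 16.0, 14.8)
group <- rep(c("A", "B"), each = 5)

# Test statistic: difference in sample means (A minus B).
diff_means <- function(values, labels) {
  mean(values[labels == "A"]) - mean(values[labels == "B"])
}

observed <- diff_means(time, group)

# Null distribution: shuffle the labels and recompute the statistic each time.
n_shuffles <- 5000
null_stats <- replicate(n_shuffles, diff_means(time, sample(group)))

# Two-sided p-value: share of shuffles at least as extreme as the observed value.
p_value <- mean(abs(null_stats) >= abs(observed))
```

Calling sample(group) without further arguments returns a random permutation of the labels, so each shuffle keeps the group sizes fixed while breaking any link between label and outcome.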

5

test statistic

A numeric summary computed from the data (e.g., difference in means) that is compared against its distribution under the null to decide whether to reject H0.

6

difference in means

The difference between the two groups' sample means (x̄A − x̄B), which estimates μA − μB; the test statistic used to compare two treatments.

7

sampling distribution under the null

The distribution of the test statistic that would occur if the null hypothesis were true, estimated via resampling under the null.

8

label reshuffling

Randomly reassigning treatment labels to observations to mimic the null hypothesis of no treatment effect.

9

NA

Not Available; R's marker for a missing value, which must be handled or removed during data cleaning before analysis.

10

data cleaning

The process of fixing typos, handling or removing missing values, and preparing data for analysis.

11

negation operator in R

The symbol '!', which negates a logical condition (logical NOT), e.g., !is.na(x) is TRUE where x is not missing; note that '!' does not mean factorial in R.

12

not equal operator in R

Operator '!=' used to test whether two values are not equal.
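
A short base R sketch of how '!' and '!=' are typically used when cleaning data. The data frame and its column names (runs, treatment, time) are hypothetical and chosen only for illustration.

```r
# Hypothetical data frame; the column names and values are made up.
runs <- data.frame(
  treatment = c("A", "B", "A", "B", "A"),
  time      = c(12.1, NA, 11.8, 14.2, NA)
)

# is.na() flags missing values; '!' flips TRUE and FALSE.
has_time <- !is.na(runs$time)

# Keep only the rows with a recorded time (base R subsetting).
clean <- runs[has_time, ]

# '!=' compares values: keep the rows whose treatment is not "B".
not_b <- clean[clean$treatment != "B", ]
```

Here '!' acts on a whole logical vector at once, while '!=' compares each element of a column against a value.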

13

tidyverse

A collection of R packages for data manipulation and visualization (e.g., dplyr, ggplot2).

14

base R syntax

Traditional R commands, without the tidyverse; used for data processing such as subsetting, NA handling, etc.

15

violin plot

A plot showing the distribution of a continuous variable across groups; each group is drawn as a mirrored kernel density estimate, often with box plot-style summaries overlaid.

16

block design

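An experimental design in which observations are grouped into blocks of similar conditions (e.g., morning vs afternoon) and treatments are compared within each block, so that the blocking variable does not confound the treatment effect.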

17

facet_wrap

A ggplot2 function to split a plot into multiple panels by a factor.
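
As an illustration, a violin plot can be split into one panel per block with facet_wrap. This is a sketch on simulated data: the data frame and the column names (treatment, block, time) are assumptions, not the contents of the lecture's data file.

```r
library(ggplot2)

# Hypothetical data: treatment, block, and time are assumed column names.
set.seed(2)
runs <- data.frame(
  treatment = rep(c("A", "B"), times = 40),
  block     = rep(c("morning", "afternoon"), each = 40),
  time      = rnorm(80, mean = 12, sd = 2)
)

# One violin per treatment, one panel per level of block.
p <- ggplot(runs, aes(x = treatment, y = time)) +
  geom_violin() +
  facet_wrap(~ block)

# Typing p at the console (or calling print(p)) draws the plot.
```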

18

treatment

A condition or group in an experiment (e.g., Treatment A vs Treatment B).

19

mean (mu)

The population mean; often denoted by the Greek letter mu (μ).

20

suites.csv

The CSV data file used in the hands-on example, containing treatment A/B and time data.

21

alpha (significance level)

The threshold below which a p-value is declared statistically significant (commonly 0.05); it is the probability of a Type I error (rejecting a true H0) that the analyst is willing to accept.