Vocabulary flashcards covering key terms from the lecture notes on hypothesis testing, data prep, and plotting.
p-value
The probability of observing a test statistic at least as extreme as the one observed, assuming the null hypothesis is true. It measures the strength of evidence against H0 and should not be treated as a hard cutoff.
null hypothesis (H0)
The default assumption of no effect or difference (e.g., μA = μB).
alternative hypothesis (H1)
The claim that there is an effect or difference (e.g., μA ≠ μB).
randomization test (permutation test)
A hypothesis test that builds its null distribution by repeatedly shuffling treatment labels and recalculating the test statistic.
test statistic
A numeric value derived from the data used to decide whether to reject H0 (e.g., difference in means).
difference in means
The difference between the sample means of two groups (x̄A − x̄B), which estimates μA − μB; the test statistic used to compare two treatments.
sampling distribution under the null
The distribution of the test statistic that would occur if the null hypothesis were true, estimated via resampling under the null.
label reshuffling
Randomly reassigning treatment labels to observations to mimic the null hypothesis of no treatment effect.
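The cards above on the randomization test, test statistic, null distribution, and label reshuffling can be tied together in one short sketch. It is written in Python with only the standard library, although the lecture itself works in R. The function names, the data values, the number of shuffles, and the add-one correction in the p-value are all assumptions made for illustration; none of them come from the lecture notes or from suites.csv.

```python
import random
from statistics import mean


def diff_in_means(values, labels):
    """Test statistic: mean of group 'A' minus mean of group 'B'."""
    a = [v for v, g in zip(values, labels) if g == "A"]
    b = [v for v, g in zip(values, labels) if g == "B"]
    return mean(a) - mean(b)


def permutation_test(values, labels, n_shuffles=5000, seed=0):
    """Return (observed statistic, two-sided p-value) via label reshuffling."""
    rng = random.Random(seed)
    observed = diff_in_means(values, labels)
    shuffled = list(labels)  # copy, so the caller's labels are left untouched
    extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # reassign treatment labels at random (mimics H0)
        if abs(diff_in_means(values, shuffled)) >= abs(observed):
            extreme += 1
    # Assumed convention: the add-one correction counts the observed labelling
    # as one possible arrangement, so the estimated p-value is never exactly 0.
    p_value = (extreme + 1) / (n_shuffles + 1)
    return observed, p_value


# Made-up example data (hypothetical, not taken from suites.csv)
times = [12.1, 11.4, 13.0, 12.7, 15.2, 14.8, 16.1, 15.5]
groups = ["A"] * 4 + ["B"] * 4

observed, p_value = permutation_test(times, groups)
print(f"observed difference = {observed:.2f}, p-value = {p_value:.4f}")
```

Each shuffle produces one draw from the sampling distribution under the null; the p-value is the share of draws at least as extreme as the observed difference.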
NA
Not Available; missing values in data that require cleaning or handling before analysis.
data cleaning
The process of fixing typos, handling or removing missing values, and preparing data for analysis.
negation operator in R
The symbol '!' negates a logical condition (logical NOT); e.g., !is.na(x) is TRUE wherever x is not missing. Note that '!' does not mean factorial in R; use factorial(n) for that.
not equal operator in R
Operator '!=' used to test whether two values are not equal.
tidyverse
A collection of R packages for data manipulation and visualization (e.g., dplyr, ggplot2).
base R syntax
Traditional R commands, without the tidyverse; used for data processing such as subsetting, NA handling, etc.
violin plot
A plot showing the distribution of a continuous variable across groups by drawing a mirrored kernel density for each group, often with a box plot or summary markers overlaid.
block design
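An experimental design in which observations are grouped into blocks of similar units (e.g., morning vs afternoon) and treatments are compared within each block, so that block-to-block variation does not confound the treatment effect.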
facet_wrap
A ggplot2 function that splits a plot into multiple panels, one per level of a faceting variable.
treatment
A condition or group in an experiment (e.g., Treatment A vs Treatment B).
mean (mu)
The population mean; often denoted by the Greek letter mu (μ).
suites.csv
The CSV data file used in the hands-on example, containing treatment A/B and time data.
alpha (significance level)
The p-value threshold below which H0 is rejected (commonly 0.05); it equals the probability of a Type I error (rejecting a true H0) that the analyst is willing to accept.