Based on Introduction to Modern Statistics (2e), Chapters 1-
data
observations collected from field notes, surveys, and experiments
statistics
the study of how best to collect, analyze, and draw conclusions from data
treatment group
group that gets the treatment
control group
group that does not receive the treatment
summary statistics
a single number summarizing data from a sample.
(unit of) observation / case
a row of a data frame; each row represents one case (observational unit)
variable
a characteristic of a case/observation represented in a column
data frame
a convenient and common way to organize data, especially if collecting data in a spreadsheet
tidy data
data in which each row is a unique case (observational unit), each column is a variable, and each cell is a single value
numerical variable
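can take a wide range of numerical values, and it is sensible to add, subtract, or take averages with those values
Discrete
a numerical variable that can only take values with jumps between them (e.g., counts). Compare to continuous, which can take any value in a range.
Categorical
a variable whose possible values are categories; the possible values are called the variable's levels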
ordinal
a categorical variable, but the levels have a natural ordering
nominal
a categorical variable whose levels have no natural ordering
sample statistic
a number calculated on a sample of data
Population Parameter
a number calculated, or considered for calculation, on the entire population
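A minimal sketch of the difference between a sample statistic and a population parameter. The population here is simulated, made-up data (an assumption for illustration only), and the names `population_mean` and `sample_mean` are hypothetical.

```python
import random
from statistics import mean

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 simulated values (not real data)
population = [rng.gauss(50, 10) for _ in range(10_000)]

# Population parameter: the mean computed on every case in the population
population_mean = mean(population)

# Sample statistic: the same calculation, but on a sample of 100 cases
sample = rng.sample(population, 100)
sample_mean = mean(sample)

print(f"parameter: {population_mean:.2f}, statistic: {sample_mean:.2f}")
```

The statistic will usually be close to the parameter but not equal to it, and it changes from sample to sample.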
bias
introduced when cases are picked by hand rather than at random: the sample may overrepresent the interests of the person choosing, which may be entirely unintentional
Stratified sampling
The population is divided into groups called strata. The strata are chosen so that similar cases are grouped together, then a second sampling method, usually simple random sampling, is employed within each stratum.
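One way stratified sampling might look in code, assuming a simple random sample of equal size is drawn within each stratum. The function `stratified_sample`, the `stratum_of` argument, and the region labels are hypothetical names chosen for this sketch.

```python
import random
from collections import defaultdict

def stratified_sample(cases, stratum_of, n_per_stratum, seed=None):
    """Group cases into strata, then draw a simple random sample in each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for case in cases:
        strata[stratum_of(case)].append(case)
    sample = []
    for members in strata.values():
        # Second sampling method: simple random sampling within the stratum
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

# Hypothetical population of (id, region) pairs, 30 cases per region
regions = ("north", "south", "east")
population = [(i, regions[i % 3]) for i in range(90)]
strat_sample = stratified_sample(population, lambda c: c[1], 5, seed=0)
```

Every stratum is guaranteed to appear in the sample, which a simple random sample of the whole population does not promise.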
Simple random sampling
each case in the population has an equal chance of being included in the final sample and knowing that a case is included in a sample does not provide useful information about which other cases are included.
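A small sketch of simple random sampling using Python's standard `random.sample`, which draws without replacement. The case IDs and the helper name `simple_random_sample` are assumptions made for illustration.

```python
import random

def simple_random_sample(cases, n, seed=None):
    """Draw n cases without replacement; every case is equally likely."""
    rng = random.Random(seed)
    return rng.sample(cases, n)

case_ids = list(range(1, 101))  # hypothetical case IDs 1..100
srs = simple_random_sample(case_ids, 10, seed=42)
print(srs)
```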
cluster sample
break up the population into many groups, called clusters. Then we sample a fixed number of clusters and include all observations from each of those clusters in the sample.
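A sketch of cluster sampling under the definition above: pick a fixed number of clusters at random, then keep every observation in the chosen clusters. The village names and the function `cluster_sample` are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Sample whole clusters and include all observations from each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [obs for name in chosen for obs in clusters[name]]

# Hypothetical population: 8 villages with 20 people each
villages = {
    f"village{k}": [f"village{k}_person{j}" for j in range(20)]
    for k in range(8)
}
clus_sample = cluster_sample(villages, 3, seed=7)
```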
multistage sample
like a cluster sample, but rather than keeping all observations in each cluster, we would collect a random sample within each selected cluster
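A sketch of a two-stage version of multistage sampling: first sample clusters, then take a random sample within each selected cluster. The data and the name `multistage_sample` are made up for illustration.

```python
import random

def multistage_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: sample clusters. Stage 2: sample cases within each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for name in chosen:
        sample.extend(rng.sample(clusters[name], n_per_cluster))
    return sample

# Hypothetical population: 8 villages with 20 people each
village_members = {
    f"village{k}": [f"village{k}_person{j}" for j in range(20)]
    for k in range(8)
}
multi_sample = multistage_sample(village_members, 3, 5, seed=7)
```

Compared with a cluster sample of the same three villages, this keeps 15 observations rather than all 60.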
Randomized Experiment
Studies where the researchers assign treatments to cases are called experiments. When this assignment includes randomization, e.g., using a coin flip to decide which treatment a patient receives, it is called a randomized experiment.
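The coin-flip assignment described above could be sketched as follows. The patient labels and the function `assign_by_coin_flip` are hypothetical; a coin flip per patient does not guarantee equal group sizes.

```python
import random

def assign_by_coin_flip(patients, seed=None):
    """Flip a fair coin for each patient to choose their group."""
    rng = random.Random(seed)
    return {
        p: ("treatment" if rng.random() < 0.5 else "control")
        for p in patients
    }

patient_ids = [f"patient{i}" for i in range(12)]  # made-up patient labels
groups = assign_by_coin_flip(patient_ids, seed=3)
```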
confounding variable
a variable that is associated with both the explanatory and response variables. Randomizing patients into the treatment or control group helps even out such differences
replication
The more cases researchers observe, the more accurately they can estimate the effect of the explanatory variable on the response. In a single study, we replicate by collecting a sufficiently large sample.
replication crisis
refers to the ongoing methodological crisis in which past findings from scientific studies in several disciplines have failed to be replicated.
Pseudoreplication
occurs when individual observations under different treatments are heavily dependent on each other.
blocking
Researchers sometimes know or suspect that variables, other than the treatment, influence the response. Under these circumstances, they may first group individuals based on this variable into blocks and then randomize cases within each block to the treatment groups. This strategy is often referred to as blocking.
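A minimal sketch of blocking, assuming patients are first grouped by a hypothetical risk level and then split at random, half to each group, within every block. The names `blocked_assignment` and `block_of` are assumptions for this example.

```python
import random
from collections import defaultdict

def blocked_assignment(cases, block_of, seed=None):
    """Group cases into blocks, then randomize to groups within each block."""
    rng = random.Random(seed)
    blocks = defaultdict(list)
    for case in cases:
        blocks[block_of(case)].append(case)
    assignment = {}
    for members in blocks.values():
        shuffled = members[:]
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        for case in shuffled[:half]:
            assignment[case] = "treatment"
        for case in shuffled[half:]:
            assignment[case] = "control"
    return assignment

# Hypothetical patients: 10 high-risk and 10 low-risk
risk_patients = [(f"p{i}", "high" if i < 10 else "low") for i in range(20)]
blocked = blocked_assignment(risk_patients, lambda c: c[1], seed=5)
```

Because the split happens inside each block, both groups end up with the same number of high-risk and low-risk patients.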
treatment group
the group that receives the treatment
control group
does not get the treatment
blind
When researchers keep the patients uninformed about their treatment, the study is said to be blind.
double-blind
a study in which the doctors or researchers who interact with patients are, just like the patients, unaware of who is or is not receiving the treatment
placebo
a fake treatment given to patients in the control group
placebo effect
Oftentimes, a placebo results in a slight but real improvement in patients. This effect has been dubbed the placebo effect.
observational studies
Studies where no treatment has been explicitly applied (or explicitly withheld)
Making causal conclusions based on experiments is often reasonable, since we can randomly assign the explanatory variable(s), i.e., the treatments. However, making the same causal conclusions based on observational data can be treacherous and is not recommended.
prospective study
identifies individuals and collects information as events unfold.
retrospective study
collects data after events have taken place