Individual
Objects described by a set of data (people, animals, things, etc.)
Variable
Any characteristic of an individual; can take different values.
Categorical variable
Places an individual into a group or category.
Quantitative variable
Takes numerical values for which arithmetic operations make sense.
Distribution
Tells what values a variable takes and how often.
Frequency table
Displays counts for each category.
Relative frequency table
Displays proportions or percentages for each category.
Two-way table
Describes two categorical variables.
Marginal distribution
Distribution of one variable among all individuals.
Conditional distribution
Distribution of one variable given a specific value of another variable.
Association
When knowing one variable helps predict another.
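As a rough illustration of the two-way table terms above, the sketch below computes a marginal and a conditional distribution from made-up counts. The table, its category names, and the variable names are all hypothetical and chosen only for this example.

```python
# Hypothetical two-way table: rows are class year, columns are preferred snack.
table = {
    "junior": {"chips": 20, "fruit": 30},
    "senior": {"chips": 35, "fruit": 15},
}

# Grand total of individuals in the table.
total = sum(sum(row.values()) for row in table.values())

# Marginal distribution of class year: each row total divided by the grand total.
marginal_year = {year: sum(row.values()) / total for year, row in table.items()}

# Conditional distribution of snack given class year:
# each count divided by its own row total.
conditional_snack = {
    year: {snack: count / sum(row.values()) for snack, count in row.items()}
    for year, row in table.items()
}

print(marginal_year)
print(conditional_snack)
```

In this made-up table the conditional distributions of snack differ by class year (40% chips for juniors, 70% for seniors), which is the kind of pattern that suggests an association.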
Dotplot
Graph showing each data value as a dot above a number line.
Stemplot
Displays data to show shape and distribution while retaining actual values.
Histogram
Displays distribution of a quantitative variable using bars.
Shape
Describes symmetry, skewness, peaks, and gaps.
Center
Describes typical value (mean, median).
Spread
Describes variability (range, IQR, standard deviation).
Outlier
Value that falls outside the overall pattern.
Resistant measure
Statistic not strongly affected by extreme values (median, IQR).
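A small sketch of what "resistant" means in practice, using an invented data set: adding one extreme value moves the mean a lot and the median only slightly. The numbers are hypothetical.

```python
import statistics

data = [2, 3, 3, 4, 5]          # hypothetical values
with_outlier = data + [50]      # same values plus one extreme value

mean_before = statistics.mean(data)
mean_after = statistics.mean(with_outlier)
median_before = statistics.median(data)
median_after = statistics.median(with_outlier)

# The mean shifts by several units; the median shifts by half a unit.
print(mean_before, mean_after)
print(median_before, median_after)
```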
Density curve
Curve on or above the horizontal axis with total area 1 underneath; models a distribution.
Median of a density curve
Divides area into two equal halves.
Mean of a density curve
Balance point of the curve.
Normal distribution
Symmetric, bell-shaped curve defined by mean (μ) and SD (σ).
68–95–99.7 Rule
Describes data within 1, 2, and 3 SDs of the mean.
Standard normal distribution
Normal distribution with mean 0 and SD 1.
z-score
Standardized value showing distance from mean in SDs: z = (x−μ)/σ.
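The z-score formula and the 68–95–99.7 rule can be checked numerically. This is a minimal sketch using Python's standard-library `NormalDist`; the helper names `z_score` and `area_within` and the example values (x = 130, μ = 100, σ = 15) are assumptions made up for illustration.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Standardize x: how many SDs it lies from the mean."""
    return (x - mu) / sigma

standard_normal = NormalDist(mu=0, sigma=1)

def area_within(k):
    """Area under the standard Normal curve between -k and +k SDs."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

z = z_score(130, 100, 15)   # hypothetical value: 2 SDs above the mean
print(z)
for k in (1, 2, 3):
    print(k, round(area_within(k), 4))
```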
Normal probability plot
Graph to assess Normality of data.
Scatterplot
Graph showing relationship between two quantitative variables.
Explanatory variable
Helps explain or predict changes in the response variable.
Response variable
Measures the outcome of a study.
Form
Overall pattern (linear, curved).
Direction
Indicates positive or negative association.
Strength
Describes how closely points follow a pattern.
Correlation (r)
Measures direction and strength of linear relationship.
Least-squares regression line (LSRL)
Line minimizing the sum of squared residuals: ŷ = a + bx.
Slope (b)
Change in predicted y for each 1-unit increase in x.
y-intercept (a)
Predicted value when x = 0.
Residual
Observed − predicted value (y − ŷ).
Coefficient of determination (r²)
Proportion of variation in y explained by x.
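The regression terms above (slope, y-intercept, residual, r, r²) can be tied together with a short hand computation. The sketch below uses a small invented data set and the standard least-squares formulas b = Sxy/Sxx and a = ȳ − b·x̄; all variable names are chosen for this example only.

```python
import statistics

# Hypothetical paired data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

x_bar, y_bar = statistics.mean(x), statistics.mean(y)

# Sums of squares and cross-products about the means.
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)

b = sxy / sxx                    # slope
a = y_bar - b * x_bar            # y-intercept
r = sxy / (sxx * syy) ** 0.5     # correlation

predicted = [a + b * xi for xi in x]
residuals = [yi - y_hat for yi, y_hat in zip(y, predicted)]  # observed - predicted

# Proportion of variation in y explained by the linear relationship with x.
r_squared = 1 - sum(e ** 2 for e in residuals) / syy

print(a, b, r, r_squared)
```

For a least-squares line the residuals sum to zero (up to rounding), and `r_squared` equals `r ** 2`.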
Residual plot
Graph of residuals versus x; checks fit of regression.
Influential point
Point that greatly changes correlation or slope if removed.
Population
Entire group we want to study or describe.
Sample
Subset of individuals from the population.
Census
Collects data from every individual in the population.
Sample survey
Collects data from a sample to generalize to the population.
Bias
Systematic error producing unrepresentative samples.
Voluntary response sample
People choose to participate; often biased.
Convenience sample
Chooses individuals easiest to reach; biased.
Simple random sample (SRS)
Every group of n individuals has an equal chance of selection.
Stratified random sample
Divides population into strata; SRS taken from each.
Cluster sample
Divides population into clusters; randomly selects clusters.
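The three random sampling methods above differ mainly in what gets randomly chosen. The following is a sketch on an invented population of 12 students; the strata (grade level), the clusters (groups of three), and the sample sizes are all assumptions for illustration, and the seed is there only to make the run repeatable.

```python
import random

rng = random.Random(0)  # seeded only so the sketch is repeatable

# Hypothetical population: 12 students, each tagged with a grade level.
population = [(f"student{i}", "junior" if i < 6 else "senior") for i in range(12)]

# Simple random sample: every group of 4 is equally likely.
srs = rng.sample(population, 4)

# Stratified random sample: split by grade, then take an SRS of 2 from each stratum.
strata = {}
for person in population:
    strata.setdefault(person[1], []).append(person)
stratified = [p for group in strata.values() for p in rng.sample(group, 2)]

# Cluster sample: form clusters of 3, randomly pick 2 clusters, keep everyone in them.
clusters = [population[i:i + 3] for i in range(0, len(population), 3)]
cluster_sample = [p for cluster in rng.sample(clusters, 2) for p in cluster]

print(srs)
print(stratified)
print(cluster_sample)
```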
Undercoverage
Some groups left out of the sampling frame.
Nonresponse
Selected individuals can’t be contacted or refuse participation.
Response bias
Pattern of inaccurate answers due to wording or interviewer.
Observational study
Observes individuals without imposing treatment.
Experiment
Deliberately imposes treatment to measure response.
Explanatory variable (factor)
Variable manipulated in an experiment.
Treatment
Specific condition applied to subjects.
Experimental units (subjects)
Individuals on which experiment is done.
Control group
Used for comparison; may receive placebo.
Random assignment
Uses chance to assign treatments; roughly balances the effects of other variables across groups.
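One common way to carry out random assignment is to shuffle the subjects and split the shuffled list. This sketch assumes ten hypothetical subjects and two equal-sized groups; the names and group sizes are made up.

```python
import random

rng = random.Random(1)  # seeded only so the sketch is repeatable

subjects = [f"subject{i}" for i in range(1, 11)]  # hypothetical subjects

# Shuffle a copy so chance alone decides who lands in which group.
shuffled = subjects[:]
rng.shuffle(shuffled)

treatment_group = shuffled[:5]
control_group = shuffled[5:]

print(treatment_group)
print(control_group)
```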
Replication
Using enough subjects to reduce chance variation.
Double-blind experiment
Neither subjects nor those interacting know treatments.
Statistically significant
Observed effect too large to plausibly be due to chance alone.
Block design
Subjects grouped by similarity; treatments randomly assigned within blocks.
Matched pairs design
Compares two treatments using similar or same subjects.
Standard Deviation
The context typically varies by SD from the mean of mean.
Percentile:
percentile % of context are less than or equal to value.
z-score:
Specific value with context is z-score standard deviations above/below the mean.
Describe a distribution:
Be sure to address shape, center, variability, and outliers (in context).
Correlation (r):
The linear association between x-context and y-context is weak/moderate/strong
(strength) and positive/negative (direction).
Residual:
The actual y-context was residual above/below the predicted value when x-context = #.
y-intercept:
The predicted y-context when x = 0 context is y-intercept.
Slope:
The predicted y-context increases/decreases by slope for each additional x-context.
Standard Deviation of Residuals (s):
The actual y-context is typically about s away from the value
predicted by the LSRL.
Coefficient of Determination (r²):
About r²% of the variation in y-context can be explained by the
linear relationship with x-context.
Describe the relationship:
Be sure to address strength, direction, form and unusual features (in context).
Probability P(A):
After many, many repetitions of context, the proportion of times that context A will occur is about P(A).
Conditional Probability P(A|B):
Given context B, there is a P(A|B) probability of context A.
Expected Value (Mean, μ):
If the random process of context is repeated a very large number of times, the average number of x-context we can expect is expected value (decimals OK).
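The "long-run average" reading of expected value can be illustrated by comparing the weighted sum Σ x·P(x) with a simulated average. The random variable here (number of heads in two fair coin flips) and the number of simulated repetitions are assumptions chosen for this sketch.

```python
import random
import statistics

# Hypothetical random variable: number of heads in two fair coin flips.
values = [0, 1, 2]
probabilities = [0.25, 0.50, 0.25]

# Expected value: each value weighted by its probability.
expected_value = sum(x * p for x, p in zip(values, probabilities))

# Simulate many repetitions; the average should settle near the expected value.
rng = random.Random(7)  # seeded only so the sketch is repeatable
draws = rng.choices(values, weights=probabilities, k=10_000)
simulated_average = statistics.mean(draws)

print(expected_value, simulated_average)
```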
Binomial Mean (μX):
After many, many trials, the average # of success context out of n is μX.
Binomial Standard Deviation (σX):
The number of success context out of n typically varies by σX
from the mean of μX.
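The binomial mean and standard deviation interpreted above come from the standard formulas μ = np and σ = √(np(1 − p)). A minimal sketch, with n = 20 and p = 0.3 as made-up values:

```python
import math

n, p = 20, 0.3  # hypothetical number of trials and success probability

mu_x = n * p                           # binomial mean
sigma_x = math.sqrt(n * p * (1 - p))   # binomial standard deviation

print(mu_x, round(sigma_x, 3))
```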
Standard Deviation of Sample Proportions (σp̂):
The sample proportion of success context typically varies by σp̂ from the true proportion of p.
Standard Deviation of Sample Means (σx̄):
The sample mean amount of x-context typically varies
by σx̄ from the true mean of μ.
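The typical variation of a sample proportion and of a sample mean is usually computed with σ = √(p(1 − p)/n) for proportions and σ/√n for means, assuming independent observations. The sketch below applies both formulas to invented values; the variable names are chosen for this example only.

```python
import math

# Sample proportion: hypothetical true proportion and sample size.
p, n = 0.4, 100
sigma_p_hat = math.sqrt(p * (1 - p) / n)

# Sample mean: hypothetical population SD and sample size.
sigma, n_mean = 15, 25
sigma_x_bar = sigma / math.sqrt(n_mean)

print(round(sigma_p_hat, 4), sigma_x_bar)
```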
Confidence Interval (A, B):
We are ___% confident that the interval from A to B captures the true
parameter context.
Confidence Level:
If we take many, many samples of the same size and calculate a confidence interval for each, about confidence level % of them will capture the true parameter in context.
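The interpretation of a confidence level can be checked by simulation: build many intervals from repeated samples and count how often they capture the true parameter. This sketch uses a one-sample z interval for a proportion as one possible interval; the function name `one_prop_z_interval`, the true proportion 0.5, the sample size 200, and the 2000 repetitions are all assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist

def one_prop_z_interval(successes, n, confidence=0.95):
    """Return (low, high) for a one-sample z interval for a proportion."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

rng = random.Random(42)  # seeded only so the sketch is repeatable
true_p, n, trials = 0.5, 200, 2000

hits = 0
for _ in range(trials):
    # Draw one sample of size n and count the successes.
    successes = sum(rng.random() < true_p for _ in range(n))
    low, high = one_prop_z_interval(successes, n)
    hits += low <= true_p <= high

capture_rate = hits / trials
print(capture_rate)  # should land close to the 95% confidence level
```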
p-value:
Assuming H0 in context (H0), there is a p-value probability of getting the observed result
or less/greater/more extreme, purely by chance.
Conclusion for a Significance Test:
Because the p-value of p-value is < / > α, we reject / fail to reject H0. We
do / do not have convincing evidence for Ha in context.
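To connect the p-value and conclusion templates, here is a sketch of a one-sample z test for a proportion. It is only one of many significance tests; the function name `one_prop_z_test`, the data (60 successes out of 100), the null value p0 = 0.5, and α = 0.05 are hypothetical.

```python
import math
from statistics import NormalDist

def one_prop_z_test(successes, n, p0, alternative="greater"):
    """Return (z, p_value) for a one-sample z test of H0: p = p0."""
    p_hat = successes / n
    # Standardize using the standard deviation that holds if H0 is true.
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    cdf = NormalDist().cdf(z)
    if alternative == "greater":
        p_value = 1 - cdf
    elif alternative == "less":
        p_value = cdf
    else:  # two-sided
        p_value = 2 * min(cdf, 1 - cdf)
    return z, p_value

z, p_value = one_prop_z_test(60, 100, 0.5)
alpha = 0.05
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(round(z, 2), round(p_value, 4), decision)
```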
Type I Error:
The H0 context is true, but we find convincing evidence for Ha context.
Type II Error:
The Ha context is true, but we don’t find convincing evidence for Ha context.
Power:
If Ha context is true at a specific value, there is a power probability the significance test will
correctly reject H0.
Standard Error of the Slope (SEb):
The slope of the sample LSRL for x-context and y-context
typically varies from the slope of the population LSRL by about SEb.