Categorical Variables
Represent qualities or characteristics. Example: hair color, grade level. They do not have numerical meaning.
Nominal
No inherent order (e.g., color).
Ordinal
Can be ordered but differences are not meaningful (e.g., class rank).
Quantitative Variables
Numerical and measurable. Example: height, number of books.
Discrete
Countable (e.g., number of pets).
Continuous
Measurable on a continuum (e.g., weight).
Bar Chart
Categorical. Shows counts/frequencies using bars.
Pie Chart
Categorical. Shows proportion of categories as slices.
Dotplot
Quantitative. Dots represent individual data points.
Stemplot
Quantitative. Shows distribution while retaining original values.
Histogram
Quantitative. Bars represent intervals (bins) of values.
Boxplot
Displays the five-number summary (min, Q1, median, Q3, max); good for comparing groups.
Comparative Boxplots
Multiple boxplots on the same scale for comparison.
Shape
Symmetric (mirror image), Skewed Right (tail on right), Skewed Left (tail on left), Uniform (flat), Bimodal (two peaks).
Mean
Average value. Sensitive to outliers.
Median
Middle value. Resistant to outliers.
Range
Max - Min.
IQR (Interquartile Range)
Q3 - Q1. Resistant to outliers.
Standard Deviation
Typical (roughly average) distance of values from the mean. Sensitive to outliers.
Outliers
Use the 1.5 × IQR rule: any value < Q1 − 1.5·IQR or > Q3 + 1.5·IQR is an outlier.
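A minimal Python sketch (with made-up data and numpy assumed available) of the summary measures above and the 1.5 × IQR outlier fences:

```python
import numpy as np

data = np.array([2, 4, 4, 5, 6, 7, 8, 9, 10, 25])  # made-up sample

q1, q3 = np.percentile(data, [25, 75])
iqr = q3 - q1
low_fence = q1 - 1.5 * iqr    # values below this are outliers
high_fence = q3 + 1.5 * iqr   # values above this are outliers

print("mean:", data.mean(), "median:", np.median(data))
print("sd:", data.std(ddof=1))   # sample standard deviation
print("IQR:", iqr, "fences:", low_fence, high_fence)
print("outliers:", data[(data < low_fence) | (data > high_fence)])
```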
Percentiles
Percent of values below a given point.
Z-scores
How many standard deviations a value is from the mean. z = (x - μ)/σ.
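A quick sketch of the z-score formula, using assumed values for μ and σ:

```python
mu, sigma = 100, 15          # assumed population mean and SD
x = 130
z = (x - mu) / sigma         # how many SDs x lies from the mean
print(z)                     # 2.0
```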
Simple Random Sample (SRS)
Every set of n individuals has an equal chance of being selected. Choose using random numbers or a random number generator.
Stratified Random Sample
Divide population into strata, randomly sample from each.
Cluster Sample
Randomly select clusters (groups) and sample all members.
Systematic Sample
Select every nth individual after a random start.
Convenience Sample
Easy to reach; usually biased.
Voluntary Response
People choose to respond; biased toward strong opinions.
Control
Reduces lurking variables.
Randomization
Random assignment balances the effects of lurking variables across groups and reduces bias.
Replication
Use enough subjects to generalize results.
Placebo Effect
Subjects respond to belief in treatment.
Blinding
Subjects unaware of treatment.
Double-Blind
Subjects and experimenters unaware.
Completely Randomized Design
Random assignment to groups.
Block Design
Group by similar traits, randomize within.
Matched Pairs
Pair similar units or use the same unit twice.
Undercoverage
Some groups underrepresented.
Nonresponse
Chosen individual does not participate.
Response Bias
Wording, interviewer, or dishonesty skews answers.
Sampling Variability
Different samples give different estimates.
Total probability
The probabilities of all outcomes in the sample space sum to 1.
Addition Rule
P(A or B) = P(A) + P(B) - P(A and B).
Multiplication Rule
P(A and B) = P(A) * P(B|A). If independent, P(A) * P(B).
Mutually Exclusive
Can't happen together. P(A and B) = 0.
Independent
One event occurring does not change the probability of the other: P(B|A) = P(B).
Conditional Probability
P(B|A) = P(A and B) / P(A). Interpret as: given A happened, what is the probability of B?
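A small worked example tying together the addition, multiplication, and conditional probability rules above, using assumed probabilities:

```python
import math

p_a, p_b = 0.5, 0.4           # assumed probabilities of events A and B
p_a_and_b = 0.2               # assumed P(A and B)

p_a_or_b = p_a + p_b - p_a_and_b              # addition rule -> 0.7
p_b_given_a = p_a_and_b / p_a                 # conditional probability -> 0.4
independent = math.isclose(p_b_given_a, p_b)  # P(B|A) = P(B), so A and B are independent here
print(p_a_or_b, p_b_given_a, independent)
```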
Two-way tables
Help visualize relationships between two categorical variables.
Venn diagrams
Help visualize relationships between sets.
Tree diagrams
Help visualize sequences of events and their probabilities.
Simulation
Use random digits or technology to simulate repeated trials.
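A minimal simulation sketch (assumed setup): estimating the probability of at least 6 heads in 10 fair coin flips by repeating the trial many times.

```python
import random

random.seed(1)
trials = 10_000
hits = 0
for _ in range(trials):
    heads = sum(random.randint(0, 1) for _ in range(10))  # one trial: 10 fair coin flips
    if heads >= 6:
        hits += 1
print(hits / trials)   # estimated probability; the exact value is about 0.377
```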
Discrete
Countable outcomes (e.g., # of goals).
Continuous
Measurable (e.g., time).
Mean (Expected Value)
Mean = Σ[x * P(x)].
Standard Deviation (SD)
SD = √Variance, where Variance = Σ[(x − μ)² · P(x)].
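A sketch computing the mean (expected value) and SD of a discrete random variable from an assumed probability distribution:

```python
import math

values = [0, 1, 2, 3]            # assumed values of X
probs = [0.1, 0.3, 0.4, 0.2]     # assumed P(X = x); must sum to 1

mean = sum(x * p for x, p in zip(values, probs))                    # Σ x·P(x)
variance = sum((x - mean) ** 2 * p for x, p in zip(values, probs))  # Σ (x − μ)²·P(x)
sd = math.sqrt(variance)
print(mean, sd)   # 1.7 and 0.9
```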
Transformations
Multiplying by a constant changes both center and spread; adding a constant changes only the center, not the spread.
Combining Variables
If independent: add the variances (even for a difference), then take the square root; never add SDs directly.
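A sketch of combining two assumed independent random variables: variances add, so the SD of the sum (or difference) is the square root of the summed variances.

```python
import math

sd_x, sd_y = 3.0, 4.0                 # assumed SDs of independent X and Y
var_sum = sd_x ** 2 + sd_y ** 2       # variances add for X + Y (and for X − Y)
sd_sum = math.sqrt(var_sum)
print(sd_sum)                         # 5.0, not 3 + 4 = 7
```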
Binomial
Conditions (BINS): Binary outcomes, Independent trials, fixed Number of trials, Same probability of success on each trial.
Binomial PDF
Use binompdf (for exact) or binomcdf (for cumulative).
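A sketch of the same binomial calculations in Python with scipy (assumed values n = 10, p = 0.3):

```python
from scipy.stats import binom

n, p = 10, 0.3                     # assumed number of trials and success probability
print(binom.pmf(4, n, p))          # P(X = 4), analogous to binompdf
print(binom.cdf(4, n, p))          # P(X <= 4), analogous to binomcdf
print(binom.mean(n, p), binom.std(n, p))   # np and sqrt(np(1-p))
```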
Geometric
First success on the kth trial: P(X = k) = (1 − p)^(k−1) · p.
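A sketch of a geometric probability with scipy (assumed p = 0.2); scipy's geom counts the trial on which the first success occurs:

```python
from scipy.stats import geom

p = 0.2                             # assumed success probability per trial
print(geom.pmf(3, p))               # P(first success on trial 3) = (1-p)^2 * p = 0.128
print(geom.cdf(3, p))               # P(first success within 3 trials)
```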
Sampling Distribution of p̂
Mean = p, SD = √[p(1 − p)/n].
Normal Approximation for Proportions
np ≥ 10, n(1-p) ≥ 10.
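A sketch tying together the two cards above: the SD of the sampling distribution of p̂ and the large-counts check, for assumed values p = 0.3 and n = 100.

```python
import math

p, n = 0.3, 100                                 # assumed population proportion and sample size
sd_phat = math.sqrt(p * (1 - p) / n)            # SD of the sampling distribution of p-hat
normal_ok = n * p >= 10 and n * (1 - p) >= 10   # large-counts condition
print(sd_phat, normal_ok)                       # about 0.046, True
```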
Sampling Distribution of x̄
Mean = μ, SD = σ/√n.
Central Limit Theorem
The sampling distribution of x̄ is approximately normal when n ≥ 30, regardless of the population's shape.
Confidence Intervals
Form: statistic ± (critical value)(standard error).
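A sketch of the statistic ± (critical value)(standard error) form for a one-proportion z interval, using assumed data (240 successes out of 400, 95% confidence):

```python
import math
from scipy.stats import norm

successes, n = 240, 400                   # assumed sample data
p_hat = successes / n
z_star = norm.ppf(0.975)                  # critical value for 95% confidence (≈ 1.96)
se = math.sqrt(p_hat * (1 - p_hat) / n)   # standard error of p-hat
lower, upper = p_hat - z_star * se, p_hat + z_star * se
print(lower, upper)                       # roughly (0.552, 0.648)
```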
One-Proportion z Test
z = (p̂ - p₀)/√[p₀(1 - p₀)/n].
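A sketch of the one-proportion z statistic and a two-sided p-value, with assumed data (58 successes in 100 trials, H₀: p = 0.5):

```python
import math
from scipy.stats import norm

p0, successes, n = 0.5, 58, 100           # assumed null value and sample data
p_hat = successes / n
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
p_value = 2 * (1 - norm.cdf(abs(z)))      # two-sided p-value
print(z, p_value)                         # z = 1.6, p ≈ 0.11
```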
One-Sample t Test
Use when σ is unknown. df = n - 1.
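A sketch of a one-sample t test on made-up data using scipy (H₀: μ = 10):

```python
from scipy.stats import ttest_1samp

data = [9.8, 10.4, 11.2, 10.9, 9.5, 10.7, 11.0, 10.1]   # made-up sample
t_stat, p_value = ttest_1samp(data, popmean=10)          # df = n - 1 = 7
print(t_stat, p_value)
```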
Two-Proportion z Test
Compare two proportions: z = (p̂1 - p̂2)/SE, where SE uses the pooled proportion under H₀.
Two-Sample t Test
Compare means. Assume unequal variance unless told.
Paired t-Test
Analyze the differences: x̄_diff ± t*(s_diff/√n), with df = n − 1 (number of pairs minus 1).
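A sketch of a paired t test on made-up before/after measurements; analyzing the differences is equivalent to scipy's ttest_rel:

```python
from scipy.stats import ttest_rel

before = [12.1, 11.5, 13.0, 12.7, 11.9]     # made-up paired data
after = [11.4, 11.0, 12.2, 12.5, 11.3]
t_stat, p_value = ttest_rel(before, after)  # same as a one-sample t test on the differences
print(t_stat, p_value)
```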
Chi-Square
GOF: 1 variable vs distribution; Homogeneity: Compare 2 groups; Independence: 2 variables.
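A sketch of a chi-square test of independence on an assumed two-way table, using scipy:

```python
from scipy.stats import chi2_contingency

table = [[30, 20],      # assumed two-way table of counts
         [15, 35]]
chi2, p_value, df, expected = chi2_contingency(table)
print(chi2, p_value, df)
print(expected)          # expected counts under independence
```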
Regression Inference
t = b/SEb. Conditions: Linear, Independent, Normal residuals, Equal spread.
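A sketch of slope inference with scipy.stats.linregress, which returns the slope, its standard error, and the two-sided p-value for H₀: β = 0 (made-up data):

```python
from scipy.stats import linregress

x = [1, 2, 3, 4, 5, 6, 7, 8]                     # made-up explanatory values
y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.8, 8.2, 8.9]     # made-up responses
result = linregress(x, y)
t = result.slope / result.stderr                 # t = b / SE_b
print(result.slope, result.stderr, t, result.pvalue)
```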