Categorical Data
Grouped in categories
Quantitative Data
Grouped by numerical values
Frequency Table
The number (count) of individuals having each value
Relative Frequency Table
The proportion/percentage of individuals having each value
Marginal distributions
Shows the distribution of a single variable on its own (e.g., passes/total)
Conditional distributions
Shows the probability of one variable given a specific condition on another variable (pass/didn’t study)
Categorical Data Plots
Side by side graphs and segmented bar plots
Quantitative Data Plots
Dot plots, histogram, stem and leaf plot, box plots
Measures of center
Mean: Add and divide
Median: Middle
Modes: Unimodal or bimodal
Measures of spread
Range: Max-min
IQR: Q3-Q1
Standard Deviation
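A quick numeric illustration of the center and spread measures above, using a made-up data set (all values hypothetical):

```python
import statistics as st

# Hypothetical data set to illustrate the formulas above.
data = [3, 4, 5, 6, 7, 8, 9]

mean = st.mean(data)                 # add and divide: 42 / 7 = 6
median = st.median(data)             # middle value of the sorted data
data_range = max(data) - min(data)   # range = max - min
q1, _, q3 = st.quantiles(data, n=4)  # quartiles; IQR = Q3 - Q1
iqr = q3 - q1
sd = st.stdev(data)                  # sample standard deviation

print(mean, median, data_range, iqr)
```
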
Describe 1-variable data
CSOCS
Context
Shape
Outliers
Center
Spread
Percentiles
Percent of data less than or equal to a certain data value
Z-score
(data point - mean) / standard deviation, is # standard deviations away from the mean
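The z-score formula above as a one-liner, with hypothetical numbers (an exam score of 86, class mean 78, standard deviation 5):

```python
# Hypothetical values: exam score 86, class mean 78, standard deviation 5.
score, mean, sd = 86, 78, 5

z = (score - mean) / sd  # (data point - mean) / standard deviation
print(z)  # 1.6 -> the score sits 1.6 standard deviations above the mean
```
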
Graph 2-variable quantitative data
Scatterplots: show the association between two quantitative variables; an LSRL can be added to model a linear trend
Explain 2-variable quantitative data
CDOFS
Context
Direction (pos/neg)
Outliers
Form (linear or non)
Strength (correlation)
Least Square Regression Line
ŷ = a + bx
a = y-intercept
b = slope
The line that minimizes the sum of squared residuals between the data and the model; the “line of best fit” essentially
Residuals
Actual - predicted
LSRL outliers
High leverage
Influential outliers
Coefficient of Determination
The square of the correlation, r²; square rooting r² (and attaching the sign of the slope) recovers r
If R² = 0.75, it means that 75% of the variation in the dependent variable is explained by the independent variable(s) in the regression model
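A minimal least-squares sketch tying together the LSRL, residuals, and R² cards above; the (x, y) data are made up for illustration:

```python
# Hypothetical paired data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
xbar = sum(xs) / n
ybar = sum(ys) / n

# Slope b and intercept a of the least squares regression line.
b = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / \
    sum((x - xbar) ** 2 for x in xs)
a = ybar - b * xbar

# Residual = actual - predicted.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

# R^2 = 1 - SSE/SST: fraction of the variation in y explained by the line.
sse = sum(r ** 2 for r in residuals)
sst = sum((y - ybar) ** 2 for y in ys)
r_sq = 1 - sse / sst

print(a, b, r_sq)
```

Here r_sq comes out to 0.6, i.e., 60% of the variation in y is explained by the line, matching the interpretation on the R² = 0.75 card.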
Stratified Random Sample
Divides the population into homogeneous groups (e.g., grade levels) and selects a few individuals from each group.
Systematic Sample
Selects individuals at fixed intervals (e.g., every 3rd person).
Simple Random Sample (SRS)
Every individual has an equal chance of being selected
What does bias lead to
Systematic over- or underestimation of the true population value (e.g., from over- or undercoverage of a population)
Retrospective study
Examines existing data on individuals
Prospective study
Follows individuals to gather future data (over time)
Key Components of an Experiment
Experimental Units: The objects or subjects to which treatments are randomly assigned.
Explanatory Variable: The variable that is purposely manipulated in the experiment.
Treatments: The different levels or conditions of the explanatory variable that are applied to the experimental units.
Response Variable: The measured outcome of the experiment, used to compare treatment effects.
Confounding Variable: A factor that may influence the response variable but is not accounted for in the study, potentially skewing results.
Completely randomized design
An experimental design in which experimental units are assigned to treatments completely at random.
Completely randomized block design
Experimental units are first blocked (grouped) by a similar trait that may affect response. Then, units from each block are randomly assigned to treatment
Matched Pairs
Each subject (or pair of similar subjects) receives both treatments; the order is randomly assigned and results are compared within each pair
Probability formulas ∩ U
Intersection (A ∩ B): Outcomes common to both events (INTERSECTION/AND)
Union (A U B): Outcomes in A, B, or both (UNION/OR)
Mutually Inclusive Events Formula
Events that can occur at the same time and share at least one outcome
P(A U B)=P(A)+P(B)-P(A ∩ B).
Mutually Exclusive Events
Are events that cannot occur at the same time
P(A ∪ B) = P(A) + P(B)
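A quick numeric check of the two addition rules above, using hypothetical events on a single die roll (A = even, B = greater than 3):

```python
from fractions import Fraction  # exact arithmetic, no float rounding

# Sample space of one fair die roll.
outcomes = {1, 2, 3, 4, 5, 6}
A = {x for x in outcomes if x % 2 == 0}  # even rolls: {2, 4, 6}
B = {x for x in outcomes if x > 3}       # rolls above 3: {4, 5, 6}

def p(event):
    return Fraction(len(event), len(outcomes))

# General addition rule: P(A U B) = P(A) + P(B) - P(A n B)
assert p(A | B) == p(A) + p(B) - p(A & B)

# Mutually exclusive events: the intersection term vanishes.
C = {1}  # rolling a 1 shares no outcomes with A
assert p(A | C) == p(A) + p(C)
```
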
Test for Independence
P(A∩B) = P(A) P(B)
and P(A)=P(A|B)
Conditional Probability
P(A|B) = P(A∩B)/P(B) = P(both events occur)/P(given event occurs)
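The conditional probability formula with hypothetical counts (120 of 200 students studied; 90 both studied and passed):

```python
from fractions import Fraction  # exact fractions instead of floats

total = 200
p_b = Fraction(120, total)       # P(studied)
p_a_and_b = Fraction(90, total)  # P(passed and studied)

# P(A|B) = P(A n B) / P(B) = P(both occur) / P(given event occurs)
p_a_given_b = p_a_and_b / p_b
print(p_a_given_b)  # 3/4
```
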
Geometric Distribution
Number of trials needed to achieve the first success in a series of independent trials
Binomial Distribution
Number of successes in a fixed number of independent trials
CDF
Probability that a random variable is less than or equal to a specific value, P(X ≤ x)
PDF/PMF
Probability that a random variable, say X, will take a value exactly equal to x, P(X = x)
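The geometric and binomial PMFs, plus a CDF built by summing the PMF, as a plain-Python sketch (the free-throw numbers are hypothetical):

```python
from math import comb, isclose

def binom_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success prob p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def geom_pmf(k, p):
    """P(first success occurs on trial k): k-1 failures, then a success."""
    return (1 - p)**(k - 1) * p

def binom_cdf(x, n, p):
    """CDF: P(X <= x), the PMF summed from 0 through x."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

# Hypothetical: a 70% free-throw shooter takes 10 shots.
p_exactly_7 = binom_pmf(7, 10, 0.7)  # chance of exactly 7 makes
p_first_on_3 = geom_pmf(3, 0.7)      # 0.3 * 0.3 * 0.7 = 0.063
assert isclose(binom_cdf(10, 10, 0.7), 1.0)  # the PMF sums to 1
```
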
Proportions (𝑝̂) tests
1. Random sample (unbiased)
2. 10% Condition (independence)
3. Large counts condition: np ≥ 10 and n(1−p) ≥ 10 (approx. normality)
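The large counts condition above can be checked mechanically; the n and p values here are hypothetical:

```python
def large_counts_ok(n, p):
    """Large counts condition: need np >= 10 and n(1-p) >= 10."""
    return n * p >= 10 and n * (1 - p) >= 10

# Hypothetical survey: n = 50 respondents, claimed proportion p = 0.3.
print(large_counts_ok(50, 0.3))  # 15 successes, 35 failures -> True
print(large_counts_ok(20, 0.3))  # only 6 expected successes -> False
```
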
Means (𝑥̄) tests
1. Random sample (unbiased)
2. 10% Condition (independence)
3. Central Limit Theorem (n ≥ 30 ensures approximate normality)
Confidence Level
If we were to repeat this process many times, about __% of the confidence intervals we create would contain the true [parameter]
Confidence Interval
"We are __% confident that the true [parameter] is between [bound] and [bound]."
Type I Error
Rejecting Ho when it’s actually true.
Type II Error
Failing to reject Ho when it’s actually false.