Acronyms
SOCS - Describing univariate data
Shape, Outliers, Center/Context, Spread
“When looking at (context), we see that the distribution has (shape). The center is (type of mode) with a (mean/median) of ____ (unit), giving a spread of (SD or IQR (units)). Using the (SD/IQR Rule), there (are/aren’t) outliers.”
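A minimal sketch of the IQR rule from the sentence above (a value is an outlier if it is more than 1.5 × IQR below Q1 or above Q3). The function name, the made-up data, and the "inclusive" quartile method are assumptions; a calculator or textbook may compute quartiles slightly differently.

```python
import statistics

def iqr_outliers(data):
    """Return (q1, q3, iqr, outliers) using the 1.5 x IQR rule."""
    # Assumption: "inclusive" quartiles; other quartile conventions can
    # give slightly different fences on small data sets.
    q1, _median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    outliers = [x for x in data if x < low_fence or x > high_fence]
    return q1, q3, iqr, outliers

scores = [2, 4, 5, 5, 6, 7, 8, 9, 30]  # made-up data for illustration
q1, q3, iqr, outliers = iqr_outliers(scores)
print(f"Q1={q1}, Q3={q3}, IQR={iqr}, outliers={outliers}")
```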
SOFA - Describing bivariate data
Shape, Outliers, Form, Association
"When analyzing the relationship between (context), we observe that the data displays a (shape) distribution. The overall trend appears to be (form), suggesting an (association) between the variables. There (are/are not) influential or unusual points."
BINS - Binomial Distribution Conditions
Binary (success/failure), Independent (one trial doesn’t impact the next), Number of Trials (fixed), Success Probability (constant across trials).
SIN - Confidence Interval + Significance Test Conditions
Selection (random), Independent, Normally distributed
For Proportions:
First, do BINS.
Then, SIN - Independence (10% condition - sample is less than 10% of the population) and Normality (Large Counts - np ≥ 10 and nq ≥ 10)
For Means:
SIN - Independence (10% condition) and Normality (population stated to be normal, or Central Limit Theorem with n ≥ 30; some texts accept n ≥ 25)
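A rough sketch of the condition checks above, assuming the population size is known. The function names and example numbers are made up, and the Random condition still has to be read from the problem itself.

```python
def check_proportion_conditions(n, p, population_size):
    """Return (ten_percent_ok, large_counts_ok) for a proportion procedure."""
    q = 1 - p
    # 10% condition: the sample is at most 10% of the population.
    ten_percent_ok = n <= 0.10 * population_size
    # Large Counts: expected successes and failures are both at least 10.
    large_counts_ok = n * p >= 10 and n * q >= 10
    return ten_percent_ok, large_counts_ok

def check_mean_conditions(n, population_size, population_stated_normal=False):
    """Return (ten_percent_ok, normal_ok) for a mean procedure."""
    ten_percent_ok = n <= 0.10 * population_size
    # Normality: stated in the problem, or n large enough for the CLT.
    normal_ok = population_stated_normal or n >= 30
    return ten_percent_ok, normal_ok

print(check_proportion_conditions(n=100, p=0.4, population_size=5000))
print(check_mean_conditions(n=12, population_size=1000))
```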
PANIC - Confidence Interval Procedure - Parameter of interest, Assumptions (SIN), Name the interval, Interval, Conclude in context
P - “The parameter of interest is the true/population/global (proportion/mean) of (context).”
A - SIN (+BINS if prop)
N - “The procedure is a (one prop z interval, one sample t interval, 2-prop z interval, 2-sample t interval, etc.).”
I - calculator work - label the calculator work
C - “We are C% confident that the true (copy parameter of interest) is contained within (calculated interval).”
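A sketch of the "I" and "C" steps for a one prop z interval, p̂ ± z* × sqrt(p̂q̂/n). The function name and the counts (120 successes out of 200) are made up for illustration; on the exam the calculator does this step.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_interval(successes, n, confidence=0.95):
    """Return (low, high) for a one-proportion z interval."""
    p_hat = successes / n
    # z* is the critical value that leaves (1 - C)/2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    standard_error = sqrt(p_hat * (1 - p_hat) / n)
    margin = z_star * standard_error
    return p_hat - margin, p_hat + margin

low, high = one_prop_z_interval(successes=120, n=200, confidence=0.95)
print(f"We are 95% confident that the true proportion is contained within ({low:.3f}, {high:.3f}).")
```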
PHANTOMS - Significance/Hypothesis Test Procedure - Parameter of interest, Hypotheses, Assumptions (SIN), Name the test, Test statistic, Obtain p-value, Make a decision, State conclusion in context.
P - “The parameter of interest is the true/population/global (proportion/mean) of (context).”
H - H0: p/μ = x, Ha: p/μ (> < ≠) x
A - SIN (+BINS if prop)
N - “The test is a (one prop z test, one sample t test, 2 prop z test, 2 sample t test for difference of means, etc.).”
T + O - math + calculator work. Label all calculator work
M - reject or fail to reject the null hypothesis - if p > α, fail to reject; if p < α, reject the null
S - “Since p is less than α, there is convincing evidence that (Ha in context)” OR “Since p is greater than α, there is not convincing evidence that (Ha in context)”
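A sketch of the T, O, and M steps for a one prop z test. The function name, the counts, and the hypotheses (H0: p = 0.5, Ha: p > 0.5) are made up for illustration. Note that the standard error uses the hypothesized p0, not p̂.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_test(successes, n, p0, alternative="two-sided"):
    """Return (z, p_value) for a one-proportion z test."""
    p_hat = successes / n
    # Under H0 the standard error is computed from p0.
    standard_error = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    cdf = NormalDist().cdf
    if alternative == "greater":
        p_value = 1 - cdf(z)
    elif alternative == "less":
        p_value = cdf(z)
    else:  # two-sided: double the tail area beyond |z|
        p_value = 2 * (1 - cdf(abs(z)))
    return z, p_value

alpha = 0.05
z, p_value = one_prop_z_test(successes=120, n=200, p0=0.5, alternative="greater")
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"z = {z:.2f}, p-value = {p_value:.4f}, decision: {decision}")
```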
LINER - Slope Significance Test/Confidence Interval Conditions - Linear, Independent, Normal, Equal Variance, Random sampling
Sentences
Describing Univariate Data - “When looking at (context), we see that the distribution has (shape). The center is (type of mode) with a (mean/median) of ____ (unit), giving a spread of (SD or IQR (units)). Using the (SD/IQR Rule), there (are/aren’t) outliers.”
Describing Bivariate Data - "When analyzing the relationship between (context), we observe that the data displays a (shape) distribution. The overall trend appears to be (form), suggesting an (association) between the variables. There appear to (be/not be) any influential/unusual points."
Coefficient of Determination Interpretation - R² - “(R² as percentage)% of the variation in (y variable in context) is explained by the LSRL relating (x variable in context) to (y variable in context).”
Y-Intercept of Regression Line Interpretation - “When (x in context) is 0 units, our model predicts a (y in context) value of (y-int) units.”
Slope of Regression Line Interpretation - “For every 1 unit increase in (x in context), our model predicts an average (increase/decrease) of (slope) units in (y in context).”
Confidence Interval Interpretation - “We are C% confident that the (parameter of interest) is contained in (calculated interval).”
Confidence Level Interpretation - “If we took many random samples of the same size and built a C% confidence interval from each, about C% of those intervals would contain the (parameter of interest).”
P-Value Interpretation - “Assuming the null hypothesis is true, the p-value (p-value) is the probability of getting a result as extreme or more extreme than (observed statistic).”
Fail to Reject the Null Hypothesis - “Since p is greater than α, there is not convincing evidence that (Ha in context).”
Reject the Null Hypothesis - “Since p is less than α, there is convincing evidence that (Ha in context).”
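The confidence level sentence in this list can be checked with a small simulation: draw many samples from a population with a known proportion, build an interval from each, and count how many contain the true value. This is only a sketch; the true proportion, sample size, trial count, and seed are arbitrary choices, and the observed rate lands near (not exactly at) C%.

```python
import random
from math import sqrt
from statistics import NormalDist

def coverage_rate(true_p, n, confidence, trials, seed=0):
    """Fraction of simulated one-prop z intervals that contain true_p."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    hits = 0
    for _ in range(trials):
        # One simulated sample: count successes in n independent trials.
        successes = sum(rng.random() < true_p for _ in range(n))
        p_hat = successes / n
        margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= true_p <= p_hat + margin:
            hits += 1
    return hits / trials

rate = coverage_rate(true_p=0.4, n=200, confidence=0.95, trials=2000)
print(f"About {rate:.1%} of the simulated 95% intervals contained the true proportion.")
```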
Things to Remember
Type I Error - Reject H0 when it is true - α = probability of a Type I Error
Type II Error - Fail to reject H0 when it is false - β = probability of a Type II Error
Power = 1 - β = probability of correctly rejecting a false H0
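A sketch of how α, β, and power fit together for a one prop z test with Ha: p > p0, using the normal approximation. The function name and the numbers (p0 = 0.5, true p = 0.6) are made up for illustration; β depends on which alternative value of p is actually true.

```python
from math import sqrt
from statistics import NormalDist

def power_one_prop_upper(p0, p_true, n, alpha=0.05):
    """Return (power, beta) for H0: p = p0 vs Ha: p > p0 when p is really p_true."""
    z = NormalDist()
    z_star = z.inv_cdf(1 - alpha)
    # Reject H0 when p_hat is above this cutoff (computed under H0).
    critical_p_hat = p0 + z_star * sqrt(p0 * (1 - p0) / n)
    # Beta: chance p_hat stays below the cutoff when p is really p_true.
    se_true = sqrt(p_true * (1 - p_true) / n)
    beta = z.cdf((critical_p_hat - p_true) / se_true)
    return 1 - beta, beta

power, beta = power_one_prop_upper(p0=0.5, p_true=0.6, n=100, alpha=0.05)
print(f"beta = {beta:.3f}, power = {power:.3f}")

# A larger sample size increases power for the same alpha and true p.
bigger_power, _ = power_one_prop_upper(p0=0.5, p_true=0.6, n=400, alpha=0.05)
```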
Chi-Squared Tests
Mean of the χ² distribution is its DF
Assumptions
Random Sample, Fixed Size, 10% condition (Independence), All expected counts ≥ 5
Goodness of Fit (GOF)
DF = (number of categories) - 1
H0: the distribution of the categorical variable is as expected (matches the claimed proportions)
Ha: at least 1 category proportion is not as expected
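A sketch of the GOF calculation, χ² = Σ (observed - expected)² / expected. The function name, the counts, and the claimed proportions are made up for illustration. The p-value is left to the calculator (χ²cdf) because the Python standard library has no chi-squared distribution.

```python
def chi_square_gof(observed, expected_proportions):
    """Return (statistic, df, expected, counts_ok) for a goodness of fit test."""
    n = sum(observed)
    # Expected count for each category = sample size x claimed proportion.
    expected = [n * p for p in expected_proportions]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1  # number of categories minus 1
    counts_ok = all(e >= 5 for e in expected)  # expected counts condition
    return statistic, df, expected, counts_ok

observed = [18, 22, 20, 40]                    # made-up counts
expected_proportions = [0.25, 0.25, 0.25, 0.25]
statistic, df, expected, counts_ok = chi_square_gof(observed, expected_proportions)
print(f"chi-squared = {statistic:.2f}, DF = {df}, expected counts OK: {counts_ok}")
```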
2 Way Tests
DF = (# of rows - 1)(# of columns -1)
Test for Homogeneity
2 or more samples (populations), 1 categorical variable
H0: distributions for a variable are the same among populations
Ha: at least 1 distribution is different
Test for Independence/Association
1 sample, 2 categorical variables
H0: the two variables are independent (no association) in the population
Ha: the two variables have an association (are dependent) in the population
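A sketch of the 2 way calculation, which is the same for homogeneity and independence: expected count = (row total × column total) / grand total, and DF = (# of rows - 1)(# of columns - 1). The function name and the 2 × 2 table are made up for illustration; the p-value is left to the calculator.

```python
def chi_square_two_way(table):
    """Return (statistic, df, expected) for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # Expected count for each cell under H0.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for observed_row, expected_row in zip(table, expected)
        for o, e in zip(observed_row, expected_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected

table = [[30, 20],
         [20, 30]]  # made-up counts
statistic, df, expected = chi_square_two_way(table)
print(f"chi-squared = {statistic:.2f}, DF = {df}")
```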
Slopes
Conditions - LINER
t Interval - b1 ± t* × SEb1, DF = n - 2
t Test - t = (b1 - hypothesized slope) / SEb1, DF = n - 2
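A sketch of the slope formulas above, computed from raw data. The function names and the five data points are made up for illustration. SEb1 is computed here as s / sqrt(Σ(x - x̄)²), where s is the standard deviation of the residuals; on the exam SEb1 is usually read from computer output. t* is passed in from a t table because the Python standard library has no t distribution.

```python
from math import sqrt

def slope_inference(x, y, hypothesized_slope=0.0):
    """Return (b1, se_b1, t, df) for the LSRL slope."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                # slope
    b0 = y_bar - b1 * x_bar       # y-intercept
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    s = sqrt(sum(e ** 2 for e in residuals) / (n - 2))  # SD of residuals
    se_b1 = s / sqrt(sxx)
    t = (b1 - hypothesized_slope) / se_b1
    return b1, se_b1, t, n - 2

def slope_t_interval(b1, se_b1, t_star):
    """Return (low, high) for b1 ± t* × SEb1."""
    margin = t_star * se_b1
    return b1 - margin, b1 + margin

x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]  # made-up data
b1, se_b1, t, df = slope_inference(x, y)
low, high = slope_t_interval(b1, se_b1, t_star=3.182)  # t* for 95%, DF = 3 (from a t table)
print(f"b1 = {b1:.3f}, SEb1 = {se_b1:.4f}, t = {t:.2f}, DF = {df}")
print(f"95% interval for the slope: ({low:.3f}, {high:.3f})")
```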