Bivariate correlations
Associations or relationships between exactly two variables
Association claim - two variables
A study that reports a bivariate correlation may have measured more than two variables
Special case: categorical data
Scatterplots work well with quantitative data
Can have issues with ordinal data
But scatterplots are often less clear for categorical data → ex. if you only have two categories, the points stack into two columns, so the scatterplot is not very helpful
Bar graphs can be used instead
Allows visual comparison of group means
Can use r for categorical data, but it's more common to estimate the magnitude of the difference between group averages
T-test
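Statistic used to test the difference between two group averages
When the two groups' distributions are plotted, each peak sits at a group average; the farther apart the peaks are relative to the spread within each group, the larger the t value and the lower the p value (more significant difference), and vice versa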
Not exclusive to experiments or association claims
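To make the t-test idea concrete, here is a minimal sketch in Python. It computes Welch's version of the t statistic (one common variant, chosen here as an assumption; the notes do not say which version is meant). The two lists of scores are made-up numbers for illustration.

```python
import math
import statistics

def welch_t(group_a, group_b):
    """Welch's t statistic for the difference between two group averages."""
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    # Sample variances (n - 1 in the denominator)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    # Standard error of the difference between the two averages
    se = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean_a - mean_b) / se

# Hypothetical scores for two categories (not from any real study)
group_a = [5, 6, 7, 8, 9]   # average 7
group_b = [3, 4, 5, 6, 7]   # average 5
t_value = welch_t(group_a, group_b)
print(t_value)  # 2.0
```

A bigger gap between the averages, or less spread inside each group, gives a larger t.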
Statistical validity
How strong is the relationship?
How precise is the estimate?
Has it been replicated?
Are outliers affecting the association?
Is there a restriction of range?
Is the association curvilinear?
Effect size
measure of the strength of the relationship between two variables in a population
All else being equal, larger effect sizes are more meaningful (stronger relationship)
Small effect sizes are important too
Effect size magnitude
Useful for predicting where someone’s data point will land on the prediction line in a scatter plot
When everything is equal, larger effect sizes are usually considered more important
Some exceptions
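For a bivariate correlation the effect size is usually r. A minimal sketch of how r can be computed by hand in Python; the variable names and the data are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # How much x and y vary together
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable varies on its own
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)

# Made-up data: hours studied and quiz score for five people
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]
r = pearson_r(hours, score)
print(round(r, 2))  # 0.77
```

r runs from -1 to 1; values farther from 0 mean a stronger relationship.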
How precise is the estimate?
95% confidence interval (95% CI)
Range that, across repeated samples, will contain the true population value 95% of the time
Better that CI does not contain zero in the range
CI that does contain zero? → possibility that there is no association between the two variables
Smaller CI is better
Error bars often represent one standard deviation of uncertainty, one standard error, or a particular CI → narrower error bars = more precise estimate (better)
Estimates based on smaller samples are less stable → wider and less precise CIs
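A rough sketch of how sample size changes the 95% CI around r. It uses the Fisher z approximation, which is one standard way to get a CI for a correlation (an assumption here; the notes do not name a method). The r value and sample sizes are made up.

```python
import math

def r_confidence_interval(r, n, z_crit=1.96):
    """Approximate 95% CI for a correlation using the Fisher z transformation."""
    z = math.atanh(r)             # transform r to the z scale
    se = 1 / math.sqrt(n - 3)     # standard error shrinks as n grows
    lower, upper = z - z_crit * se, z + z_crit * se
    return math.tanh(lower), math.tanh(upper)  # transform back to the r scale

# Same hypothetical r = 0.30, two different sample sizes
small = r_confidence_interval(0.30, 20)
large = r_confidence_interval(0.30, 200)
print([round(v, 2) for v in small])  # [-0.16, 0.66] -> wide, contains zero
print([round(v, 2) for v in large])  # [0.17, 0.42]  -> narrow, excludes zero
```

The same r is an imprecise estimate with 20 people and a much more precise one with 200.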
Has it been replicated?
Replication shows whether a finding holds up when the study is repeated, so replicated findings can be trusted more
Ex. other researchers ran a similar study and found similar results
True variations and different variations
More generalization = better external validity
Are outliers affecting the association?
Outliers: extreme scores that lie far away from the rest of the scores
Outlier in larger sample → smaller effect on the overall sample
Outlier in smaller sample → larger effect on the overall sample
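A small sketch of why sample size matters for outliers. The base data are made up so that r is exactly 0; the same extreme point (10, 10) is then added to a small and to a large sample.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical base pattern with no association at all (r = 0)
base_x = [1, 1, 3, 3]
base_y = [1, 3, 1, 3]

# Small sample: 4 points + 1 outlier. Large sample: 100 points + the same outlier.
r_small = pearson_r(base_x + [10], base_y + [10])
r_large = pearson_r(base_x * 25 + [10], base_y * 25 + [10])
print(round(r_small, 2))  # 0.93
print(round(r_large, 2))  # 0.39
```

One extreme score turns "no association" into a very strong r in the small sample, but moves r much less in the large one.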
Is there a restriction of range?
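Restriction of range: the sample does not include the full range of scores on one of the variables
Can influence conclusions (the correlation can look weaker than it really is)
Ex. a college's SAT score cutoff is 1200 → admitted students do not cover the full range of scores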
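A minimal sketch of restriction of range, following the SAT cutoff example above. The SAT scores and the outcome scores are invented for illustration; only the 1200 cutoff comes from the notes.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical data: SAT scores from 840 to 1600 and a later outcome score
# that rises with SAT but also bounces up and down by a fixed amount.
sat = [800 + 40 * i for i in range(1, 21)]
outcome = [i + (3 if i % 2 == 0 else -3) for i in range(1, 21)]

r_full = pearson_r(sat, outcome)

# Keep only people at or above the 1200 cutoff
kept = [(s, o) for s, o in zip(sat, outcome) if s >= 1200]
r_restricted = pearson_r([s for s, _ in kept], [o for _, o in kept])

print(round(r_full, 2), round(r_restricted, 2))
```

With the lower half of the SAT range cut off, the same underlying relationship produces a smaller r.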
Is the association curvilinear?
Do the data points make a curve/u-shape on the scatter plot?
Nonlinear → r only describes straight-line relationships, so it can miss a curved association
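A quick sketch of why r misses a U-shape. The data are made up: y is exactly x squared, so y depends completely on x, yet r comes out as zero.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical U-shaped data: y falls, then rises
x = [-3, -2, -1, 0, 1, 2, 3]
y = [value ** 2 for value in x]   # 9, 4, 1, 0, 1, 4, 9

r_curve = pearson_r(x, y)
print(r_curve)  # 0.0
```

The downward half and the upward half cancel out, which is why it helps to look at the scatterplot and not only at r.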
Is it statistically significant?
Statistical significance: conclusions drawn about how probable it is that a correlation that size would come from a population with no correlation
TL;DR: could these results be due to chance?
Logic behind statistical inference
Researchers collect data from a sample to make conclusions about a larger population
If there is an association in the sample, we infer that an association probably exists in the population
If there is no association in the population, an association found in the sample may be due to chance
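One way to see the logic above is a permutation test: shuffle one variable many times (which breaks any real association) and count how often chance alone gives a correlation at least as large as the one observed. This is a sketch of that idea, not the method journal articles usually report; the data, the function name `permutation_p_value`, and the number of shuffles are all assumptions.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)

def permutation_p_value(xs, ys, n_permutations=2000, seed=0):
    """Share of shuffles whose |r| is at least as large as the observed |r|."""
    rng = random.Random(seed)          # fixed seed so the result is repeatable
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)          # pretend there is no association
        if abs(pearson_r(xs, shuffled)) >= observed:
            at_least_as_extreme += 1
    # Add 1 to both counts so the estimate is never exactly zero
    return (at_least_as_extreme + 1) / (n_permutations + 1)

# Hypothetical sample with a strong association
x = list(range(1, 11))
y = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9]
p_value = permutation_p_value(x, y)
print(p_value < 0.05)
```

A small p value means a correlation this large would rarely show up if the population had no association.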
Significance in journal articles
Articles usually tell us whether a result is statistically significant
p < 0.05 is the usual cutoff for statistical significance
Can be reported in different ways, which may make it difficult to read through the methods and results
Internal validity
Can we make a causal inference from this association?
3 causal criteria:
Covariance: do results show that the variables are correlated?
Temporal precedence: does method establish which variable came first in time?
Internal validity (third-variable problem): is there a third variable that is associated with both of the other two variables independently?
When a 3rd variable is a problem
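Ex. hair length and height → shorter people tend to have longer hair
third variable is gender of the individuals (it is associated with both height and hair length)
Ex. weight and height → taller people tend to be heavier
gender of the individuals is a factor but is not a third variable, because taller people still tend to be heavier within each gender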
Moderating variables
When the relationship between two variables changes depending on the level of another variable.
The other variable is called a moderator
Ex. association between team success and game attendance
Place A has higher association
Place B has lower association
But the moderator is residential mobility (transience: do people move in and out often, or stay for a long time?)
High transience/residential mobility = higher association
Low transience/residential mobility = lower association
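A minimal sketch of what a moderator looks like in numbers: compute r separately at each level of the moderator and compare. The team success and attendance figures below are invented; only the pattern (stronger association where residential mobility is high) follows the example above.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical seasons: team success (1 = worst, 5 = best) and attendance (thousands)
success = [1, 2, 3, 4, 5]
attendance_high_mobility = [12, 18, 33, 38, 52]  # attendance tracks success
attendance_low_mobility = [30, 31, 29, 31, 30]   # attendance stays about the same

r_high = pearson_r(success, attendance_high_mobility)
r_low = pearson_r(success, attendance_low_mobility)
print(round(r_high, 2), round(r_low, 2))
```

The relationship between success and attendance changes depending on the level of residential mobility, which is what makes mobility a moderator.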