psych 209: exam 2
measurement reliability = consistency
test-retest: if tested again, the results/patterns will be similar
internal: will generate similar responses across all of the items, even with different wording
average inter-item correlation (AIC): mean of all possible correlations
i.e., an AIC between 0.15 and 0.50 means the items are reasonably correlated
cronbach’s alpha: combines AIC and the # of items in the scale
i.e., the closer to 1.0, the more reliable the scale
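A minimal sketch of the two internal-reliability statistics above, using only the standard library. It assumes the standardized form of cronbach's alpha, alpha = k * AIC / (1 + (k - 1) * AIC), where k is the number of items; the function names and the toy responses are made up for illustration and are not from the course materials.

```python
from itertools import combinations
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def average_inter_item_correlation(items):
    """items: one list of scores per item (same respondents, same order)."""
    return mean([pearson_r(a, b) for a, b in combinations(items, 2)])


def standardized_alpha(aic, k):
    """Standardized cronbach's alpha from the AIC and the number of items k."""
    return k * aic / (1 + (k - 1) * aic)


# Hypothetical responses: 3 items answered by the same 5 people.
items = [
    [4, 5, 2, 3, 1],
    [5, 4, 2, 3, 2],
    [4, 4, 1, 3, 2],
]
aic = average_inter_item_correlation(items)
print(round(aic, 2), round(standardized_alpha(aic, len(items)), 2))

# Same AIC, more items -> higher alpha.
print(round(standardized_alpha(0.30, 5), 2))   # 0.68
print(round(standardized_alpha(0.30, 10), 2))  # 0.81
```

The last two lines show why alpha depends on scale length as well as AIC: holding AIC at 0.30, doubling the number of items raises alpha.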
interrater: whether different raters interpret and score the same sentence/prompt consistently
whether results are uniform when multiple administrators use the measure
r: 0.70 or greater for strong, positive correlation
kappa: used when rating categorical variables; closer to 1.0 means stronger agreement
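A small sketch of kappa for two raters, assuming the notes mean Cohen's kappa: (observed agreement - chance agreement) / (1 - chance agreement). The function name and the ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters coding the same cases into categories."""
    n = len(rater_a)
    # Proportion of cases where both raters chose the same category.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical codes from two observers for the same 4 behaviors.
rater_a = ["y", "y", "n", "n"]
rater_b = ["y", "n", "n", "n"]
print(cohens_kappa(rater_a, rater_b))  # 0.5
```

Here the raters agree on 3 of 4 cases (0.75), but 0.50 agreement is expected by chance, so kappa is 0.5 rather than 0.75.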
measurement validity = accuracy
construct: how well variables have been operationalized
face: subjective; does this seem plausible?
e.g., a layperson may be confused by the measure or question its construct validity where an expert would not
content: subjective; does it cover all the (relevant) content?
e.g., a quiz or survey covering only some of the relevant material vs. all or comprehensively
criterion: empirical; does it correlate with behavior (the way it should theoretically)?
e.g., measuring class with relevant behaviors (arriving on time, using a planner, etc.), vs. irrelevant behaviors
known-groups paradigm: testing two groups who are known to differ on the measured variable to ensure they score differently on that variable
convergent: empirical; does it correlate with similar metrics, i.e., do they converge?
e.g., measuring movie quality with imdb, rotten tomatoes, etc., scores (commonly used measurements)
discriminant/divergent: empirical; does it correlate less strongly (or not at all) with dissimilar metrics?
e.g., ensuring that measurement used for “itch” does not converge with measurement used for “pain”
prevents confusion and irrelevant overlap
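A toy check of convergent vs. discriminant validity, assuming Pearson's r is the correlation being used. All scores and variable names are invented for illustration; the point is only the pattern (high r with a similar measure, r near zero with a dissimilar one).

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


# Hypothetical scores for the same 5 participants.
new_itch_scale = [1, 2, 3, 4, 5]
established_itch_scale = [2, 1, 4, 3, 5]  # similar construct
pain_scale = [2, 4, 3, 4, 2]              # dissimilar construct

r_convergent = pearson_r(new_itch_scale, established_itch_scale)
r_discriminant = pearson_r(new_itch_scale, pain_scale)
print(round(r_convergent, 2), round(r_discriminant, 2))
```

With these made-up numbers the new scale tracks the established itch scale closely and is unrelated to the pain scale, which is the pattern that would support both kinds of validity.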
Study Validity (Concerns about the Research Design)
Internal Validity – The degree to which a study establishes a cause-and-effect relationship between variables, without confounding factors.
External Validity – The extent to which the study’s findings generalize to other settings, populations, or times.
Ecological Validity – How well the findings apply to real-world settings.
Statistical Conclusion Validity – The extent to which conclusions drawn from statistical analysis are accurate and reliable.
sampling validity
statistical validity: findings are precise, reasonable, and replicable
requires big enough sample
Doesn’t care which wolves are sampled, just how many are sampled
larger sample size → smaller CI
larger sample size → more precise estimate of variable of interest
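A quick sketch of why a larger sample gives a smaller CI, assuming a 95% confidence interval for a mean under the normal approximation: half-width = 1.96 * sd / sqrt(n). The standard deviation of 10 and the sample sizes are hypothetical.

```python
import math


def ci_half_width(sd, n, z=1.96):
    """Half-width of a normal-approximation CI for a mean (z=1.96 for 95%)."""
    return z * sd / math.sqrt(n)


for n in (25, 100, 400):
    print(n, round(ci_half_width(10, n), 2))
# 25 3.92
# 100 1.96
# 400 0.98
```

Because n sits under a square root, the sample has to be four times larger to cut the interval width in half.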
external validity: findings can be generalized to other contexts
requires right kind of sample
Cares which wolves are sampled and how (the number doesn’t matter)
generalizability: extent we can apply claims of sample to entire population of interest
response sets
non-differentiation: picking the same thing every time
solution: mix up the wording
acquiescence (yea-saying): agree with everything
solution: including attention check questions
fence-sitting: picking the neutral option every time
solution: eliminate neutral option on likert scales
observational validity solutions
observer bias → clear codebooks: precise operationalization of variables
observer bias/effects → masked (blind) design: observers are unaware of study’s purpose and conditions
reactivity → unobtrusive observations: making observers less noticeable
probability sampling: lets randomness determine who/what is sampled
simple random: list of every single member of population & choose sample randomly
systematic: choose sample based on a system (e.g., every 5th person)
stratified random: dividing population into meaningful subgroups (strata) and sample randomly from subgroups
oversampling: intentionally overrepresenting certain subgroups
cluster: randomly select naturally occurring clusters → include everyone within those clusters
multistage: includes 2 samples/stages
random sample of clusters
random sample of people within clusters
nonprobability sampling: researcher/participants choose who/what is sampled
convenience: sampling only easily accessible/available participants; most common sampling technique
purposive: only certain kinds of people included in sample
snowball: participants asked to recommend acquaintances for study
quota: identifies subsets of population and sets target number for each category in sample; samples nonrandomly until quotas are filled
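The probability sampling techniques above can be sketched with the standard library's random module. This is an illustration under assumptions, not a survey tool: the function names, the roster, and the cluster and stratum labels are all hypothetical, and the cluster version assumes everyone in a chosen cluster is included (the multistage version samples within clusters instead).

```python
import random


def simple_random_sample(population, n, rng):
    """Every member of the full list has an equal chance of selection."""
    return rng.sample(population, n)


def systematic_sample(population, step, rng):
    """Random starting point, then every step-th member (e.g., every 5th)."""
    start = rng.randrange(step)
    return population[start::step]


def stratified_random_sample(strata, n_per_stratum, rng):
    """strata: dict of subgroup name -> members; sample randomly in each."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}


def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters and keep everyone in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [person for name in chosen for person in clusters[name]]


def multistage_sample(clusters, n_clusters, n_per_cluster, rng):
    """Stage 1: random clusters. Stage 2: random people within each."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [person for name in chosen
            for person in rng.sample(clusters[name], n_per_cluster)]


rng = random.Random(209)  # fixed seed so the demo is repeatable
roster = [f"student_{i}" for i in range(1, 21)]
clusters = {f"class_{c}": roster[c * 5:(c + 1) * 5] for c in range(4)}
strata = {"first_year": roster[:10], "senior": roster[10:]}

print(simple_random_sample(roster, 4, rng))
print(systematic_sample(roster, 5, rng))
print(stratified_random_sample(strata, 2, rng))
print(cluster_sample(clusters, 2, rng))
print(multistage_sample(clusters, 2, 2, rng))
```

Oversampling would amount to passing a larger per-stratum count for the subgroup of interest, which this sketch does not implement; the nonprobability techniques have no random step to simulate.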