Exam 2 Study Guide
MEASUREMENT RELIABILITY AND VALIDITY
What are the broad types of measures, what are their advantages and disadvantages? Can you give an example of each kind?
Self-report measure - a method of measuring a variable in which people answer questions about themselves in a questionnaire or interview.
useful when the claim is about the nature of people’s beliefs and opinions
often used to make frequency claims
mix up the style of questions you present
Observational measure - a method of measuring a variable by recording observable behaviors or physical traces of behavior (also called behavioral measure)
example: a researcher could operationalize happiness by observing how many times a person smiles or stress could be measured by the number of tooth marks on someone's pencil
Physiological measure - a method of measuring a variable by recording biological data
example: moment-to-moment happiness has been measured using facial electromyography, fMRI, etc.
What are the basic Scales of measurement? be able to define and identify and provide examples
Categorical variables
Nominal scale - levels are categories, a variable is divided into two or more categories
agree or disagree
yes or no
Quantitative variables
Ordinal scale - when the numerals of a quantitative variable represent a ranked order
example: the top ten best books in a library
Interval scale - the numerals represent equal intervals (distances) between levels, but there is no “true zero”
a person can get a score of 0 but 0 does not really mean “nothing”
an IQ score of 0 does not mean a person has “no intelligence”
Ratio scale - when numerals of a quantitative variable have equal intervals and when the value of 0 truly means “none” or “nothing”
measuring a test score with a score of 0 means the person got “nothing correct”
Reliability (establishes construct validity): what are the three kinds of reliability? How are they used? Can you give an example of each?
Correlation coefficient (r) - a single number ranging from -1 to 1 that indicates the strength and direction of an association between two variables.
Slope direction (positive, negative, zero)
Strength of relationship
Test-retest reliability - the researcher gets consistent scores every time he or she uses the measure
people who take the IQ test the first time should have the same pattern of scores the second time they take the test
Inter-rater reliability - consistent scores are obtained no matter who measures the variable
you record 12 smiles from the child in one hour and another researcher also records 12 smiles in that given hour
Internal reliability - a study participant gives a consistent pattern of answers, no matter how the researcher has phrased the question
people who answer yes or agree on a certain section of a questionnaire should also answer agree in the next few items
What is Cronbach’s alpha (coefficient alpha)? a correlation-based statistic that measures a scale’s internal reliability (looking for 0.70 or higher)
if the alpha is high, there is good internal reliability and the researcher can sum all the items together
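As an illustration of how these reliability statistics are computed, here is a minimal Python sketch. All scores are made up for the example, and the helper names (pearson_r, cronbach_alpha) are hypothetical; in practice a statistics package would normally do this work.

```python
from math import sqrt
from statistics import variance


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cronbach_alpha(rows):
    """Cronbach's alpha; rows holds one list of item scores per respondent."""
    k = len(rows[0])                      # number of items on the scale
    items = list(zip(*rows))              # regroup the scores by item
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# Test-retest reliability: made-up IQ scores for the same five people,
# measured on two occasions.
time1 = [100, 110, 95, 120, 105]
time2 = [102, 108, 97, 118, 106]
retest_r = pearson_r(time1, time2)

# Internal reliability: made-up answers from five people to three items
# rated on a 1-5 scale.
answers = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 5, 5],
    [1, 2, 1],
    [3, 3, 3],
]
alpha = cronbach_alpha(answers)

print(f"test-retest r = {retest_r:.2f}")
print(f"Cronbach's alpha = {alpha:.2f}")
```

With these made-up numbers both values come out well above the 0.70 rule of thumb, which is the pattern a reliable measure should show.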
What are the components of measurement validity?
Measurement validity (establishes construct validity) What are the broad classes? Which ones are empirical?
Face validity - the extent to which a measure is a plausible operationalization of the conceptual variable in question
if it looks like a good measure, it has face validity
example: head circumference has high face validity for measurement of hat size
Content validity - a measurement must capture all parts of a defined construct
example: if intelligence is defined as having seven components, any operationalization of intelligence should include questions or items to assess each of those components.
Criterion validity - evaluates whether the measure under consideration is associated with a concrete behavioral outcome that it should be associated with, according to the conceptual definition
example: a company could collect data to tell them how well each of two aptitude tests is correlated with success in selling
Known-groups paradigm - in which researchers see whether scores on the measure can discriminate among two or more groups whose behavior is already confirmed
example: to validate the use of salivary cortisol as a measure of stress, a researcher could sample the salivary cortisol levels of two groups: people about to give a speech and people in the audience. Public speaking is recognized as being a stressful situation for most people; therefore, if salivary cortisol is a valid measure of stress, people in the speech group should have higher levels than the people in the audience
Correlation method - does your measure correlate with the behavior or outcome of interest
Convergent validity - to what extent is your measure associated with other measures of the same construct
Discriminant validity - to what extent is your measure not associated with measures of other constructs
How are reliability and measurement validity related?
A measure can be reliable but not valid. For example, a bathroom scale that always shows a person’s weight as 10 pounds lighter than it actually is gives consistent readings (reliable), but the readings are inaccurate (not valid)
A valid measure must be reliable. If a test is measuring the correct construct but does so inconsistently, it cannot be considered truly valid
SURVEYS AND OBSERVATIONS (CH. 6)
What are the strengths and weaknesses of various kinds of survey instruments, how can they go wrong?
Surveys as self-report measures
Survey question format
Open-ended questions - allow respondents to answer any way they would like
drawback is the responses must be coded and categorized which can take a long time
they can be poorly worded
Forced-choice questions - people give their opinion by picking the best of two or more options
used in political polls
Likert scale - people are presented with a statement and are asked to use a rating scale to indicate their degree of agreement
Semantic differential - asked to rate a target object using a numeric scale that is anchored with adjectives
Survey question wording
Leading questions - wording leads people to a particular response
Double-barreled questions - asks two questions in one
Negatively worded questions - a question contains negative phrasing, which can cause confusion, and reduce the construct validity of the poll
Order of survey questions
Affects response quality, engagement, and potential bias.
Poor ordering can cause fatigue, confusion, or influence answers.
Strategies
Funnel Approach: General → Specific for natural thought progression
Easy to follow, reduces confusion
Early vague questions may be misinterpreted.
Logical Flow: Keep related questions together for clarity
Sensitive Questions Later: Reduce drop-off by placing personal questions at the end
Avoid Early Fatigue: Start with easy, engaging questions.
Common Pitfalls
Priming Effect: Early questions shape later answers
Fatigue Effect: Long or complex early questions lead to dropout
Anchoring Bias: Previous answers influence later ones.
Response set
Acquiescence (“yea-saying”) - saying “yes” or “strongly agree” to every item without thinking carefully
including reverse-worded items is a solution
Fence-sitting - people often play it safe by choosing the midpoint of the scale
Socially desirable responding (faking good) - people sometimes give the socially desirable response even if this is not what they really think. giving answers on a survey that makes one look better than one really is.
Self-reporting more than you know or remember - occurs when respondents provide answers beyond their actual knowledge or memory, often unintentionally
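To show why reverse-worded items help with acquiescence, here is a minimal Python sketch. The 4-item scale, the 1-5 rating range, and the function names are assumptions made for illustration only.

```python
def reverse_score(score, scale_min=1, scale_max=5):
    """Flip a rating so that 5 becomes 1, 4 becomes 2, and so on."""
    return scale_max + scale_min - score


def scale_mean(responses, reverse_worded):
    """Average the ratings after flipping the reverse-worded items."""
    scored = [
        reverse_score(r) if is_reversed else r
        for r, is_reversed in zip(responses, reverse_worded)
    ]
    return sum(scored) / len(scored)


# Hypothetical 4-item scale in which items 2 and 4 are reverse-worded.
reverse_worded = [False, True, False, True]

yea_sayer = [5, 5, 5, 5]       # "strongly agree" to every item
careful_agreer = [5, 1, 4, 2]  # endorses the construct on every item

print(scale_mean(yea_sayer, reverse_worded))
print(scale_mean(careful_agreer, reverse_worded))
```

After reverse scoring, the yea-sayer's answers cancel out to the scale midpoint, while the respondent who read each item carefully keeps a high score. Without reverse-worded items the two would be indistinguishable.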
Observational data
Inter-rater reliability - consistent scores are obtained no matter who measures the variable
you record 12 smiles from the child in one hour and another researcher also records 12 smiles in that given hour
Observer bias - a bias that occurs when observer expectations influence the interpretation of participant behaviors or of the outcome of the study
Observer effects - a change in the behavior of study participants in the direction of observer expectations
Reactivity - a change in the behavior of study participants because they are aware they are being watched
Blind (or masked) design - a study design in which the observers are unaware of the experimental conditions to which participants have been assigned
Waiting it out - refers to the strategy used in observational research where the observer remains in a setting for an extended period, allowing subjects to behave naturally without the influence of the observer's presence
Unobtrusive observations and data - an observation in a study made indirectly, through physical traces of behavior, or made by someone who is hidden or is posing as a bystander
SAMPLING (CH. 7) Know the terms, be able to explain and provide examples of each....
Population - is the entire set of people or products in which you are interested
sample - is a smaller set, taken from that population
census - a set of observations that contains all members of the population of interest
Representative sample - a sample in which all members of the population of interest are equally likely to be included and therefore the results can generalize to the population
biased samples - a sample in which some members of the population of interest are systematically left out, and therefore the results cannot generalize to the population of interest
What makes a sample representative? It accurately reflects the characteristics of the larger population from which it is drawn: key demographic, behavioral, or other relevant traits are proportionally included, allowing findings to generalize to the broader group.
How can this be achieved? By using random (probability) sampling or another appropriate sampling technique.
When does it matter most or least? It matters most when generalizing to a population, in comparative research, and in policy or decision-making; it matters least in exploratory research, case studies, or studies of niche populations.
Random (or probability) sampling - a method of selecting a group of individuals from a larger population so that each individual has an equal chance of being chosen
Random assignment vs. random sampling - random sampling is the use of a random method to select the sample from the population (supports external validity); random assignment is the use of a random method to assign participants to different experimental groups (supports internal validity)
Non-random sampling
Convenience sampling - using a sample of people who are easy to contact and readily available to participate
Self-selection - a form of sampling bias that occurs when a sample contains only people who volunteer to participate
Prioritizing external validity for frequency claims - External validity: To whom can the association be generalized? Prioritize it by using a representative sample, consider sampling bias, ensure a large sample size, replicate across different contexts, and account for response biases
Sampling technique vs. sample size - the sampling technique is how the sample is selected: probability techniques such as simple random sampling, cluster sampling, multistage sampling, and stratified random sampling, or nonprobability techniques such as convenience sampling and snowball sampling. Sample size is the number of observations or individuals in a study or experiment.
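The difference between random sampling and random assignment can be sketched in a few lines of Python. The population of 1,000 ID numbers, the sample size of 20, and the seed are all arbitrary assumptions chosen for illustration.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 1,000 student ID numbers.
population = list(range(1, 1001))

# Random sampling: every member of the population has an equal chance
# of ending up in the sample (this is about who is studied).
sample = rng.sample(population, 20)

# Random assignment: shuffle the people already in the sample, then
# split them into two conditions (this is about who gets which treatment).
shuffled = sample[:]
rng.shuffle(shuffled)
treatment, control = shuffled[:10], shuffled[10:]

print("sample size:", len(sample))
print("treatment group:", sorted(treatment))
print("control group:", sorted(control))
```

A study can use either step without the other: a convenience sample can still be randomly assigned to conditions, and a random sample can be surveyed with no assignment at all.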
BIVARIATE CORRELATIONS (CH. 8)
Bivariate correlation
Associations with two continuous variables
Associations with one continuous and one categorical variable
Construct validity of association claims - how well was each of the two variables measured
Statistical validity of association claims
Effect size - not all associations are equal; some are stronger than others. Effect size describes the strength of a relationship between two or more variables
Statistical significance - refers to the conclusion a researcher reaches regarding the likelihood of getting a correlation of that size just by chance, assuming there's no correlation in the real world
Outliers - an extreme score– a single case that stands out from the pack
Restriction of range - when there is not a full range of scores on one of the variables in the association, the correlation can appear smaller than it really is
Curvilinear associations - when the relationship between two variables is not a straight line; it might be positive up to a point and then become negative
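Two of the statistical validity issues above, restriction of range and curvilinear associations, can be seen directly in a small Python sketch. The data are made up for illustration, and pearson_r is a hypothetical helper written for this example.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Restriction of range: made-up scores with a strong positive association.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9]
r_full = pearson_r(x, y)

# Keep only the cases with high x scores (as if only top scorers were studied).
high = [(a, b) for a, b in zip(x, y) if a >= 7]
r_restricted = pearson_r([a for a, _ in high], [b for _, b in high])

# Curvilinear association: y rises and then falls as x increases.
cx = [-3, -2, -1, 0, 1, 2, 3]
cy = [9 - a ** 2 for a in cx]
r_curve = pearson_r(cx, cy)

print(f"full range r = {r_full:.2f}")
print(f"restricted range r = {r_restricted:.2f}")
print(f"curvilinear r = {r_curve:.2f}")
```

In this example the correlation drops once the range is restricted, and r for the curved relationship is zero even though y is completely determined by x. A scatterplot is the usual way to catch both problems.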
Three criteria for a causal claim
Covariance - an association establishes A←→ B
Temporal precedence (directionality problem) - do we know which came first A→B or B→A
Internal validity (third-variable problem) - is there a C variable that is associated with both A and B, independently
Spurious association - a bivariate association that is attributable only to systematic mean differences on subgroups within the sample; the original association is not present within the subgroups
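A spurious association of the kind defined above can be reproduced with a tiny made-up data set. In this Python sketch the two subgroups, their scores, and the pearson_r helper are all assumptions made for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Made-up scores on variables A and B for two subgroups that differ on a
# third variable C (for example, younger vs. older participants).
group1_a, group1_b = [1, 2, 1, 2], [1, 1, 2, 2]
group2_a, group2_b = [5, 6, 5, 6], [5, 5, 6, 6]

r_group1 = pearson_r(group1_a, group1_b)
r_group2 = pearson_r(group2_a, group2_b)
r_overall = pearson_r(group1_a + group2_a, group1_b + group2_b)

print(f"r within group 1 = {r_group1:.2f}")
print(f"r within group 2 = {r_group2:.2f}")
print(f"r overall = {r_overall:.2f}")
```

Within each subgroup A and B are unrelated, yet the pooled data show a strong positive correlation. The overall association comes entirely from the mean difference between the subgroups, which is what makes it spurious.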
External validity of association claims
Moderator - a variable that, depending on its level, changes the relationship between two other variables
-----------------------------------------------------------------------------
MULTIVARIATE CORRELATIONS (CH. 9)
Multivariate design- a study designed to test an association involving more than two measured variables
Longitudinal design - a study in which the same variables are measured in the same people at different points in time
Longitudinal design:
What are they used for? used to track changes over time, help establish temporal precedence (which variable came first), study developmental trends, examine long-term effects, etc.
How are they set up? select a population, choose time points, measure variables, and control for attrition.
Can you interpret the data from one? yes!
Cross-sectional correlation - test to see whether two variables, measured at the same point in time, are correlated.
Autocorrelation - in a longitudinal design, the correlation of one variable with itself, measured at two different times
Cross-lag correlation - in a longitudinal design, a correlation between an earlier measure of one variable and a later measure of another variable.
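The three kinds of correlations in a longitudinal design can be laid out side by side in a short Python sketch. The two-wave data for variables A and B are made up, and pearson_r is a hypothetical helper written for this example.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Made-up scores for six people on variables A and B at Time 1 and Time 2.
a_t1 = [1, 2, 3, 4, 5, 6]
b_t1 = [2, 1, 4, 3, 6, 5]
a_t2 = [2, 1, 3, 5, 4, 6]
b_t2 = [1, 3, 2, 4, 6, 5]

cross_sectional = pearson_r(a_t1, b_t1)  # A and B at the same time point
autocorrelation = pearson_r(a_t1, a_t2)  # A with itself across time
cross_lag_a_to_b = pearson_r(a_t1, b_t2)  # earlier A with later B
cross_lag_b_to_a = pearson_r(b_t1, a_t2)  # earlier B with later A

print(f"cross-sectional r(A1, B1) = {cross_sectional:.2f}")
print(f"autocorrelation r(A1, A2) = {autocorrelation:.2f}")
print(f"cross-lag r(A1, B2) = {cross_lag_a_to_b:.2f}")
print(f"cross-lag r(B1, A2) = {cross_lag_b_to_a:.2f}")
```

Comparing the two cross-lag correlations is what speaks to temporal precedence. In these made-up data the earlier-A-to-later-B correlation is the larger of the two, a pattern that would be more consistent with A coming before B than the reverse.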
Multiple regression - a statistical technique that computes the relationship between a predictor variable and a criterion variable, controlling for other predictor variables
Multiple regression
Controlling for another variable - holding a potential third variable at a constant level while investigating the association between two other variables
Criterion variable - the variable in a multiple regression analysis that the researchers are most interested in understanding or predicting
Predictor variables - a variable in multiple regression analysis that is used to explain variance in the criterion (also called independent variable)
Beta - for each predictor variable, we will compute beta (β). Like “r” from bivariate correlations, but “controls for” the other predictor variables in the analysis
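To make "controls for" concrete, here is a minimal Python sketch for the special case of exactly two predictors, where each standardized beta can be computed from the three bivariate correlations. The correlation values and the function name are hypothetical, and real analyses with more predictors rely on regression software.

```python
def standardized_betas(r_y1, r_y2, r_12):
    """Standardized betas for two predictors, from bivariate correlations.

    r_y1: correlation between the criterion and predictor 1
    r_y2: correlation between the criterion and predictor 2
    r_12: correlation between the two predictors
    """
    denom = 1 - r_12 ** 2
    beta1 = (r_y1 - r_y2 * r_12) / denom
    beta2 = (r_y2 - r_y1 * r_12) / denom
    return beta1, beta2


# Hypothetical correlations: predictor 1 correlates 0.50 with the criterion,
# but it also overlaps heavily (0.70) with predictor 2.
beta1, beta2 = standardized_betas(r_y1=0.5, r_y2=0.6, r_12=0.7)

print(f"bivariate r for predictor 1 = 0.50, beta = {beta1:.2f}")
print(f"bivariate r for predictor 2 = 0.60, beta = {beta2:.2f}")
```

In this example the beta for predictor 1 is much smaller than its bivariate r, because most of its association with the criterion is shared with predictor 2. That shrinkage is what it means for the relationship to be reduced once the other predictor is controlled for.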
Pattern and parsimony - Pattern refers to identifying relationships between multiple independent variables (predictors) and a dependent variable. By analyzing which predictors are significant and how they interact, researchers can detect patterns in the data. Parsimony means creating the simplest model that still explains the data well. The goal is to avoid overfitting by including only the most relevant predictors.
Mediator - a variable that helps explain the relation between two other variables