
PSY301 Midterm 2

Exam 2 Study Guide



MEASUREMENT RELIABILITY AND VALIDITY 

What are the broad types of measures, and what are their advantages and disadvantages? Can you give an example of each kind?

  • Self-report measure - a method of measuring a variable in which people answer questions about themselves in a questionnaire or interview. 

    • useful when the claim is about the nature of people’s beliefs and opinions

    • often used to make frequency claims

    • a good practice is to mix up the style of questions presented, which keeps respondents engaged and guards against response sets

  • Observational measure - a method of measuring a variable by recording observable behaviors or physical traces of behavior (also called behavioral measure)

    • example: a researcher could operationalize happiness by observing how many times a person smiles, or operationalize stress by counting the tooth marks on someone's pencil

  • Physiological measure - a method of measuring a variable by recording biological data

    • example: moment-to-moment happiness has been measured using facial electromyography, fMRI, etc. 

  • What are the basic scales of measurement? Be able to define, identify, and provide examples of each.

    • Categorical variables

      • Nominal scale - levels are categories, a variable is divided into two or more categories

        • agree or disagree

        • yes or no

    • Quantitative variables

      • Ordinal scale - when the numerals of a quantitative variable represent a ranked order

        • example: the top ten best books in a library 

      • Interval scale - the numerals represent equal intervals (distances) between levels, but there is no “true zero”

        • a person can get a score of 0 but 0 does not really mean “nothing”

        • an IQ score of 0 does not mean a person has “no intelligence”

      • Ratio scale - when numerals of a quantitative variable have equal intervals and when the value of 0 truly means “none” or “nothing”

        • measuring a test score with a score of 0 means the person got “nothing correct”

  • Reliability (establishes construct validity): What are the three kinds of reliability? How are they used? Can you give an example of each?

    • Correlation coefficient (r) - a single statistic ranging from -1 to 1 that indicates the strength and direction of an association between two variables (a computational sketch appears at the end of this reliability list); it conveys:

      • Slope direction (positive, negative, zero)

      • Strength of relationship

  • Test-retest reliability - the researcher gets consistent scores every time he or she uses the measure

    • people who take the IQ test the first time should have the same pattern of scores the second time they take the test

  • Inter-rater reliability -  consistent scores are obtained no matter who measures the variable

    • you record 12 smiles from the child in one hour and another researcher also records 12 smiles in that given hour

  • Internal reliability - a study participant gives a consistent pattern of answers, no matter how the researcher has phrased the question

    • people who answer yes or agree on a certain section of a questionnaire should also answer agree in the next few items

    • What is Cronbach’s alpha (coefficient alpha)? a correlation-based statistic that measures a scale’s internal reliability (looking for 0.70 or higher)

      • if the alpha is high, there is good internal reliability and the researcher can sum all the items together 
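A minimal Python sketch of the two correlation-based statistics above: Pearson’s r for test-retest reliability and Cronbach’s alpha for internal reliability. All scores and variable names below are invented for illustration, not taken from the course.

```python
import numpy as np

# Test-retest reliability: the same 6 people take an IQ test twice.
time1 = np.array([100, 112, 95, 130, 104, 88], dtype=float)
time2 = np.array([102, 110, 97, 128, 101, 90], dtype=float)
r = np.corrcoef(time1, time2)[0, 1]   # Pearson r, ranges from -1 to 1
print(f"test-retest r = {r:.2f}")     # a high positive r = consistent scores over time

# Internal reliability: 5 people answer 4 Likert items meant to tap one construct.
items = np.array([
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 5, 4, 5],
    [3, 3, 3, 4],
    [1, 2, 1, 2],
], dtype=float)
k = items.shape[1]                                   # number of items
item_vars = items.var(axis=0, ddof=1).sum()          # sum of the item variances
total_var = items.sum(axis=1).var(ddof=1)            # variance of the summed scale
alpha = (k / (k - 1)) * (1 - item_vars / total_var)  # Cronbach's alpha
print(f"Cronbach's alpha = {alpha:.2f}")             # looking for about 0.70 or higher
```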

  • What are the components of measurement validity? 

  • Measurement validity (establishes construct validity) - What are the broad classes? Which ones are empirical?

    • Face validity - a plausible operationalization of the conceptual variable in question

      • if it looks like a good measure, it has face validity

        • example: head circumference has high face validity for measurement of hat size

    • Content validity - a measurement must capture all parts of a defined construct

      • if intelligence is defined as having seven components, any operationalization of intelligence should include questions or items to assess each of those seven components

    • Criterion validity - evaluates whether the measure under consideration is associated with a concrete behavioral outcome that it should be associated with, according to the conceptual definition

      • example: a company could collect data to tell them how well each of two aptitude tests is correlated with success in selling

      • Known-groups paradigm - researchers see whether scores on the measure can discriminate among two or more groups whose behavior is already well understood

        • example: to validate the use of salivary cortisol as a measure of stress, a researcher could sample the salivary cortisol levels in two groups. Public speaking is recognized as stressful for most people; therefore, if salivary cortisol is a valid measure of stress, people in the speech group would have higher levels than the people in the audience (a minimal computational sketch appears at the end of this measurement section)

      • Correlation method - does your measure correlate with the behavior or outcome of interest

    • Convergent validity - to what extent is your measure associated with other measures of the same construct 

    • Discriminant validity - to what extent is your measure not associated with measures of other constructs 

    • How are reliability and measurement validity related? 

      • A measure can be reliable but not valid. For example, a bathroom scale that consistently reads 10 pounds lighter than a person’s actual weight is reliable (consistent) but not valid (inaccurate)

      • A valid measure must be reliable. If a test is measuring the correct construct but does so inconsistently, it cannot be considered truly valid
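Returning to the known-groups example above, here is a minimal sketch of how a researcher might check that the speech group scores higher than the audience group. The cortisol values (in arbitrary units) and group sizes are invented for illustration.

```python
from scipy import stats

# Known-groups paradigm: if salivary cortisol is a valid measure of stress,
# people about to give a speech should have higher levels than the audience.
speech_group   = [14.2, 15.8, 13.9, 16.1, 15.0, 14.7]   # cortisol, arbitrary units
audience_group = [ 9.8, 10.5, 11.2,  9.1, 10.9, 10.0]

t, p = stats.ttest_ind(speech_group, audience_group)
print(f"t = {t:.2f}, p = {p:.4f}")  # a reliably higher speech-group mean supports criterion validity
```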

SURVEYS AND OBSERVATIONS (CH. 6)

What are the strengths and weaknesses of various kinds of survey instruments, and how can they go wrong?

  • Surveys as self-report measures 

    • Survey question format 

      • Open-ended questions - allow respondents to answer any way they would like

        • drawback: responses must be coded and categorized, which can take a long time

        • they can be poorly worded

      • Forced-choice questions - people give their opinion by picking the best of two or more options

        • used in political polls

      • Likert scale - people are presented with a statement and are asked to use a rating scale to indicate their degree of agreement

      • Semantic differential - respondents are asked to rate a target object using a numeric scale that is anchored with adjectives

    • Survey question wording

      • Leading questions - wording leads people to a particular response 

      • Double-barreled questions - a question that asks two questions in one

      • Negatively worded questions - a question that contains negative phrasing, which can cause confusion and reduce the construct validity of the poll

    • Order of survey questions 

      • Affects response quality, engagement, and potential bias.

      • Poor ordering can cause fatigue, confusion, or influence answers.

      • Strategies

        • Funnel Approach: General → Specific for natural thought progression

          • Easy to follow, reduces confusion

          • Early vague questions may be misinterpreted.

        •  Logical Flow: Keep related questions together for clarity

        • Sensitive Questions Later: Reduce drop-off by placing personal questions at the end

        • Avoid Early Fatigue: Start with easy, engaging questions.

      • Common Pitfalls

        • Priming Effect: Early questions shape later answers

        • Fatigue Effect: Long or complex early questions lead to dropout

        • Anchoring Bias: Previous answers influence later ones.

    • Response set

      • Acquiescence (“yea-saying”) - saying “yes” or “strongly agree” to every item without thinking carefully

        • a solution is to include reverse-worded items (see the reverse-scoring sketch after the self-report notes below)

      • Fence-sitting - people often play it safe by choosing the midpoint of the scale 

    • Socially desirable responding (faking good) - giving answers on a survey that make one look better than one really is; people sometimes give the socially desirable response even if it is not what they really think

    • Self-reporting more than you know or remember - occurs when respondents provide answers beyond their actual knowledge or memory, often unintentionally
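A minimal sketch of the reverse-wording idea mentioned under acquiescence: reverse-worded items are recoded so that every item points in the same direction before the scale is summed. The items, responses, and the 1-5 scale here are hypothetical.

```python
# Responses on a 1-5 Likert scale (5 = strongly agree); items are made up.
responses = {
    "I enjoy meeting new people.": 5,         # positively worded
    "I avoid social gatherings.": 2,          # reverse worded
    "Talking to strangers energizes me.": 4,  # positively worded
}
reverse_items = {"I avoid social gatherings."}

scale_max, scale_min = 5, 1
scored = {
    item: (scale_max + scale_min - score) if item in reverse_items else score
    for item, score in responses.items()
}
total = sum(scored.values())
print(scored, total)  # a yea-sayer who answers 5 to everything no longer gets the top total
```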

  • Observational data

    • Inter-rater reliability - consistent scores are obtained no matter who measures the variable

      • you record 12 smiles from the child in one hour and another researcher also records 12 smiles in that given hour

    • Observer bias - a bias that occurs when observer expectations influence the interpretation of participant behaviors or of the outcome of the study

    • Observer effects - a change in the behavior of study participants in the direction of observer expectations

    • Reactivity - a change in the behavior of study participants because they are aware they are being watched

    • Blind (or masked) design - a study design in which the observers are unaware of the experimental conditions to which participants have been assigned

    • Waiting it out -  refers to the strategy used in observational research where the observer remains in a setting for an extended period, allowing subjects to behave naturally without the influence of the observer's presence

    • Unobtrusive observations and data - an observation in a study made indirectly, through physical traces of behavior, or made by someone who is hidden or is posing as a bystander



SAMPLING (CH. 7) Know the terms; be able to explain and provide examples of each.

  • Population - the entire set of people or products in which you are interested

  • Sample - a smaller set, taken from that population

  • Census - a set of observations that contains all members of the population of interest

  • Representative sample - a sample in which all members of the population of interest are equally likely to be included, and therefore the results can generalize to the population

  • Biased sample - a sample in which some members of the population of interest are systematically left out, and therefore the results cannot generalize to the population of interest

  • What makes a sample representative? It accurately reflects the characteristics of the larger population from which it is drawn: key demographic, behavioral, or other relevant traits are proportionally included, allowing generalization of findings to the broader group.

    • How can this be achieved? By using random (probability) sampling or another technique that gives every member of the population a known chance of inclusion.

    • When does it matter most or least? It matters most for generalizing to a population, for comparative research, and for policy or decision-making; it matters least for exploratory research or for a case study of a niche population.

  • Random (or probability sampling)  - a method of selecting a group of individuals from a larger population so that each individual has an equal chance of being chosen

  • Random assignment vs. random sampling - random assignment is the use of a random method to assign participants to different experimental groups, whereas random sampling determines who ends up in the sample in the first place (see the sketch at the end of this sampling section)

  • Non-random sampling 

    • Convenience sampling - using a sample of people who are easy to contact and readily available to participate 

    • Self-selection - a form of sampling bias that occurs when a sample contains only people who volunteer to participate 

    • Prioritizing external validity for frequency claims - external validity asks: to whom can the results be generalized? Prioritize it by using a representative sample, considering sampling bias, ensuring a large enough sample, replicating across different contexts, and accounting for response biases

  • Sampling technique vs. sample size - sampling technique is how the sample is obtained (random/probability sampling, cluster sampling, multistage sampling, stratified random sampling, convenience sampling, snowball sampling, etc.); sample size is the number of observations or individuals in a study or experiment
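A minimal sketch of the random sampling vs. random assignment distinction above, using Python's standard library. The population size and group sizes are arbitrary choices for illustration.

```python
import random

population = [f"person_{i}" for i in range(1000)]  # the entire set we care about

# Random sampling: every member of the population has an equal chance of
# ending up in the sample (supports external validity / generalizability).
sample = random.sample(population, k=20)

# Random assignment: the people already in the study are randomly placed
# into experimental conditions (supports internal validity).
random.shuffle(sample)
treatment, control = sample[:10], sample[10:]
print(len(treatment), len(control))
```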



BIVARIATE CORRELATIONS (CH. 8)

  • Bivariate correlation

    • Associations with two continuous variables 

    • Associations with one continuous and one categorical variable 

  • Construct validity of association claims - how well was each of the two variables measured

  • Statistical validity of association claims

    • Effect size - describes the strength of a relationship between two or more variables; all associations are not equal, and some are stronger than others

    • Statistical significance - refers to the conclusion a researcher reaches regarding the likelihood of getting a correlation of that size just by chance, assuming there is no correlation in the real world

    • Outliers - an extreme score; a single case that stands out from the pack and can inflate or deflate a correlation (see the sketch at the end of this chapter’s notes)

    • Restriction of range - when there is not a full range of scores on one of the variables in the association, the correlation appears smaller than it really is

    • Curvilinear associations - when the relationship between two variables is not a straight line; it might be positive up to a point and then become negative

  • Three criteria for a causal claim

    • Covariance - the study establishes that A and B are associated (A ←→ B)

    • Temporal precedence (directionality problem) - do we know which came first A→B or B→A

    • Internal validity (third-variable problem) - is there a C variable that is associated with both A and B, independently

      • Spurious association - a bivariate association that is attributable only to systematic mean differences on subgroups within the sample; the original association is not present within the subgroups

  • External validity of association claims

    • Moderator - a variable that, depending on its level, changes the relationship between two other variables
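A minimal sketch (simulated data) of two statistical-validity issues from this chapter: the p-value used to judge statistical significance, and how a single outlier can inflate a correlation. The data are randomly generated, so the exact numbers are illustrative only.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(size=30)
y = rng.normal(size=30)  # x and y are unrelated by construction

r, p = stats.pearsonr(x, y)
print(f"without outlier: r = {r:.2f}, p = {p:.3f}")

# Add a single extreme case that stands out from the pack.
x_out = np.append(x, 8.0)
y_out = np.append(y, 8.0)
r_out, p_out = stats.pearsonr(x_out, y_out)
print(f"with outlier:    r = {r_out:.2f}, p = {p_out:.3f}")  # one point can make r look much stronger
```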

-----------------------------------------------------------------------------

MULTIVARIATE CORRELATIONS (CH. 9)

  • Multivariate design - a study designed to test an association involving more than two measured variables

  • Longitudinal design - a study in which the same variables are measured in the same people at different points in time

  • What are they used for? Used to track changes over time, help establish temporal precedence for causal claims, study developmental trends, examine long-term effects, etc.

  • How are they set up? select a population, choose time points, measure variables, and control for attrition. 

  • Can you interpret the data from one? Yes, by comparing the three kinds of correlations below (a simulated sketch follows them):

    • Cross-sectional correlation - test to see whether two variables, measured at the same point in time, are correlated. 

    • Autocorrelation - in a longitudinal design, the correlation of one variable with itself, measured at two different times

    • Cross-lag correlation - in a longitudinal design, a correlation between an earlier measure of one variable and a later measure of another variable.
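A minimal sketch of the three longitudinal correlations above, using simulated two-wave data. The variable names (media exposure and aggression at two time points) and the simulated relationships are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 150
# Hypothetical two-wave longitudinal data, simulated so that earlier media
# exposure feeds later aggression more strongly than the reverse.
media_t1 = rng.normal(size=n)
agg_t1 = 0.3 * media_t1 + rng.normal(size=n)
media_t2 = 0.6 * media_t1 + rng.normal(size=n)
agg_t2 = 0.4 * media_t1 + 0.5 * agg_t1 + rng.normal(size=n)

def r(a, b):
    return np.corrcoef(a, b)[0, 1]

print("cross-sectional r (media_t1, agg_t1):", round(r(media_t1, agg_t1), 2))
print("autocorrelation r (media_t1, media_t2):", round(r(media_t1, media_t2), 2))
print("cross-lag r (media_t1 -> agg_t2):", round(r(media_t1, agg_t2), 2))
print("cross-lag r (agg_t1 -> media_t2):", round(r(agg_t1, media_t2), 2))
# Comparing the two cross-lag correlations helps address temporal precedence:
# in this simulation, media_t1 -> agg_t2 should come out stronger than agg_t1 -> media_t2.
```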

  • Multiple regression - a statistical technique that estimates the relationship between each predictor variable and a criterion variable while controlling for the other predictor variables

    • Controlling for another variable - holding a potential third variable at a constant level while investigating the association between two other variables 

    • Criterion variable - the variable in a multiple regression analysis that the researchers are most interested in understanding or predicting 

    • Predictor variables - a variable in multiple regression analysis that is used to explain variance in the criterion (also called independent variable)

    • Beta - for each predictor variable, we compute beta (β); like r from a bivariate correlation, but it “controls for” the other predictor variables in the analysis (see the sketch at the end of this section)

  • Pattern and parsimony - Pattern refers to a diverse set of results that all point in the same direction; parsimony means the simplest theory that explains that pattern well. Together they support a causal claim when a single, parsimonious causal theory accounts for the whole pattern of results.

  • Mediator - a variable that helps explain the relation between two other variables
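A minimal sketch (simulated data, made-up variable names) of multiple regression with standardized betas: when all variables are standardized, each least-squares slope is a beta that relates one predictor to the criterion while controlling for the other predictor.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
screen_time = rng.normal(size=n)  # predictor 1
exercise    = rng.normal(size=n)  # predictor 2
sleep_quality = 0.5 * exercise - 0.3 * screen_time + rng.normal(size=n)  # criterion

def zscore(v):
    return (v - v.mean()) / v.std(ddof=1)

# Standardize everything so the least-squares slopes are betas (comparable to r).
X = np.column_stack([zscore(screen_time), zscore(exercise)])
y = zscore(sleep_quality)
betas, *_ = np.linalg.lstsq(X, y, rcond=None)
print(f"beta(screen_time) = {betas[0]:.2f}, beta(exercise) = {betas[1]:.2f}")
# Expect a negative beta for screen_time and a positive beta for exercise,
# each computed while holding the other predictor constant.
```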