Measurement, Psychometrics, and Construct Validity Lecture
Data sources and measurement in personality assessment
Data sources for personality assessment (types of data):
- Self-report data (S data): the most common source in personality testing; respondents answer directly about themselves.
- Informant report data (I data): ratings or information provided by others who know the person.
- Behavioral data / behavior observations (B data): performance on lab tasks or real-world behavior observed and recorded.
- Tests draw on data regardless of format (informant reports, self-reports, interviews, tasks, questionnaires).
- Transcript note (audio garbled as "Three k two, two times the percent test"): the passage refers to the various data sources and testing approaches; read it as a discussion of the data types that drive measurement.
Projective tests vs. objective (standardized) tests:
- Projective tests rely on a projection mechanism to infer internal states from ambiguous stimuli; open-ended responses.
- Objective tests use fixed items with standardized scoring.
- Projective tests emphasize an idiographic approach (focusing on the individual case) and symbolic expression; interpreting responses requires substantial training.
- Projective tests are often embedded in broader assessments that include life history information and possibly intelligence testing; results are integrated with other data sources.
Projective tests: core concepts and assumptions
- Projective hypothesis (as presented): people project their inner world onto stimuli; behavior reflects unconscious processes.
- Key assumptions of projective tests:
- Unconscious processes drive behavior that can be inferred from responses to stimuli.
- The test is designed to reveal a holistic view of one individual at a time (idiographic approach).
- Symbolic expression: psychological content emerges as symbolic responses; interpretation requires extensive training.
- Examples discussed:
- Rorschach inkblot test:
- Administration takes about 45 minutes in total.
- Tasks involve interpreting who is in the image, what they are doing, what happened before, what might happen next.
- Observers note details such as attention to people vs. background, and interpret potential personality implications.
- The example prompts reflection on whether differing attentional focus (people vs. setting) indicates personality differences; empirical relevance remains a topic of debate.
- Draw-A-Person test (DAP):
- Often used with children; interpretations attempt to infer personality traits.
- Guidelines for interpretation vary widely and can be contradictory.
- Factors considered include placement on the page (center vs. quadrants), line quality (how crisp the lines are), and pencil pressure.
- The interpretation is influenced by developmental stage and age; generalizability and accuracy are debated.
- Practical note: the interpretive process is trained and subject to variability; projective data are integrated with other information to form a comprehensive understanding of the person.
Objective tests: characteristics and examples
- Core features of objective tests:
- Standardized items: fixed questions or statements with predetermined response scales (e.g., true/false, multiple choice).
- Reliability: results should be consistent across time, settings, and persons (test–retest reliability, internal consistency, etc.).
- Construct validity: the test should measure the intended construct.
- Illustrative measures:
- Big Five Inventory (BFI): a widely used personality inventory assessing broad trait dimensions.
- Minnesota Multiphasic Personality Inventory (MMPI): a comprehensive instrument with a large item pool (e.g., >400 questions) and an extensive interpretive framework (scales/constellations) for various psychopathology and personality constructs.
- Description in transcript: a huge kit with a booklet of questions and a large reference book; items load onto larger constructs; interpretation involves identifying which construct is most strongly represented.
- A non-language, image-based response task (example discussed): presents stimuli as images; requires users to click one of two buttons; records response time across ~100 trials; interprets trait structure via reaction times rather than verbal responses.
- Response formats and response processes in objective tests:
- Likert-type scales: the most common response format (e.g., strongly disagree to strongly agree).
- Visual Analogue Scale (VAS): e.g., a bar from 0 to 10 (or 0–100) used to rate intensity (commonly used in clinical settings for pain, but can be applied to other attributes).
- Semantic differential scales: contrastive pairs (e.g., good–bad, active–passive) on a multi-point scale.
- Forced-choice / other option formats: respondents choose from a set of statements that are similarly valenced.
- Response biases and potential distortions:
- Social desirability and impression management: respondents may present themselves in a more favorable light.
- Extreme responding or response style biases: tendency to select extreme options or consistently agree/disagree.
- Malingering: intentional faking of symptoms, particularly relevant in forensic or clinical settings.
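The response-style biases above can be screened for with simple descriptive indices. The sketch below is a minimal illustration, assuming a 5-point Likert scale; the two index definitions, the function names, and the response data are all hypothetical and not taken from the lecture.

```python
# Hypothetical 5-point Likert responses (1 = strongly disagree, 5 = strongly agree).
# Both index definitions below are illustrative assumptions, not a validated scoring rule.

def extreme_response_rate(responses, low=1, high=5):
    """Proportion of answers that sit at either endpoint of the scale."""
    return sum(r in (low, high) for r in responses) / len(responses)

def agreement_rate(responses, midpoint=3):
    """Proportion of answers above the scale midpoint (a crude acquiescence index)."""
    return sum(r > midpoint for r in responses) / len(responses)

# Two made-up respondents answering the same ten items.
extreme_responder = [5, 5, 1, 5, 4, 5, 1, 5, 5, 4]
moderate_responder = [3, 2, 4, 3, 3, 2, 4, 3, 2, 3]

for label, answers in [("extreme", extreme_responder), ("moderate", moderate_responder)]:
    print(label, extreme_response_rate(answers), agreement_rate(answers))
```

A high agreement rate only hints at acquiescence when the items are balanced in keying direction; if every item is worded in the same direction, frequent agreement may simply reflect a high standing on the trait.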
- Psychometric properties and validity considerations:
- Construct validity: how well the measure reflects the intended construct; convergent vs. discriminant validity concepts.
- Convergent validity: measures of related constructs should correlate positively but not perfectly.
- Discriminant validity: measures of distinct constructs should not correlate too strongly.
- The challenge of correlating with similar but distinct constructs (e.g., neuroticism/negative affect vs. worry) where some overlap is expected, but they are not identical.
- Inter-source agreement: differences between self-ratings and informant ratings; how strongly one source’s rating aligns with another’s can vary by construct and visibility.
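Convergent and discriminant validity are usually examined by correlating scores across measures. The sketch below illustrates the idea with a hand-written Pearson correlation; the scores are made up, and the variable names (`worry`, `neuroticism`, `distinct_construct`) are hypothetical stand-ins rather than real instruments.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Made-up scores for eight respondents (assumption: one score per person per measure).
worry = [2, 4, 3, 5, 1, 4, 2, 5]               # the measure being validated
neuroticism = [3, 4, 3, 5, 2, 3, 2, 4]         # related construct: expect a sizeable positive r
distinct_construct = [3, 2, 4, 3, 3, 4, 2, 3]  # distinct construct: expect r near zero

r_convergent = pearson(worry, neuroticism)
r_discriminant = pearson(worry, distinct_construct)

print(f"convergent r = {r_convergent:.2f}")
print(f"discriminant r = {r_discriminant:.2f}")
```

The same function could be applied to self-ratings and informant ratings of one trait to gauge inter-source agreement. What counts as "too strong" or "strong enough" is a judgment call that depends on the constructs involved.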
- Reliability concerns:
- Test–retest reliability (stability over time): important for traits that are relatively stable.
- Split-half reliability: correlate scores on two halves of a test; caveat: the correlation depends on how the items are split, so a single split is not a dependable sole indicator.
- Internal consistency: degree to which items on a scale measure the same construct; commonly reported via Cronbach's alpha, denoted $\alpha$.
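The split-half caveat and Cronbach's alpha can both be illustrated on a small data set. Alpha is conventionally computed as $\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_i \sigma_i^2}{\sigma_X^2}\right)$, where $k$ is the number of items, $\sigma_i^2$ the variance of item $i$, and $\sigma_X^2$ the variance of total scores. The sketch below is a minimal illustration under stated assumptions: the item scores are made up, the function names are hypothetical, and the Spearman-Brown step-up is a standard companion to split-half correlations that the lecture notes do not mention.

```python
from statistics import mean, variance

def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def cronbach_alpha(scores):
    """scores: one row per respondent, one column per item."""
    k = len(scores[0])
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

def split_half(scores, half_a, half_b):
    """Correlate the two half-test totals; also return the Spearman-Brown estimate."""
    a = [sum(row[j] for j in half_a) for row in scores]
    b = [sum(row[j] for j in half_b) for row in scores]
    r = pearson(a, b)
    return r, 2 * r / (1 + r)

# Made-up responses: six respondents, four items on a 1-5 scale.
scores = [
    [4, 5, 4, 5],
    [2, 1, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
    [3, 4, 3, 3],
]

alpha = cronbach_alpha(scores)
r_odd_even, sb_odd_even = split_half(scores, [0, 2], [1, 3])
r_first_second, sb_first_second = split_half(scores, [0, 1], [2, 3])

print(f"alpha = {alpha:.3f}")
print(f"odd/even split: r = {r_odd_even:.3f}, Spearman-Brown = {sb_odd_even:.3f}")
print(f"first/second split: r = {r_first_second:.3f}, Spearman-Brown = {sb_first_second:.3f}")
```

The two splits give noticeably different correlations for the same data, which is the caveat noted for split-half reliability. Test-retest reliability could be estimated with the same `pearson` function applied to total scores from two occasions.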