Measurement, Psychometrics, and Construct Validity Lecture

Data sources and measurement in personality assessment

  • Data sources for personality assessment (types of data):

    • Self-report data (S data): the most common source in personality testing; respondents answer directly about themselves.
    • Informant report data (I data): ratings or information provided by others who know the person.
    • Behavioral data / behavior observations (B data): performance on lab tasks or real-world behavior observed and recorded.
    • Takeaway: every test draws on one of these data sources, regardless of format (informant reports, self-reports, interviews, tasks, questionnaires).
    • (Transcript unclear at this point; the passage refers to the various data sources and testing approaches, i.e., the types of data that drive measurement.)
  • Projective tests vs. objective (standardized) tests:

    • Projective tests rely on a projection mechanism to infer internal states from ambiguous stimuli; open-ended responses.
    • Objective tests use fixed items with standardized scoring.
    • Projective tests emphasize an idiographic approach (focusing on the individual case) and symbolic expression; they require substantial training to interpret responses.
    • Projective tests are often embedded in broader assessments that include life history information and possibly intelligence testing; results are integrated with other data sources.
  • Projective tests: core concepts and assumptions

    • Projective hypothesis (as presented): people project their inner world onto stimuli; behavior reflects unconscious processes.
    • Key assumptions of projective tests:
    • Unconscious processes drive behavior that can be inferred from responses to stimuli.
    • The test is designed to reveal a holistic view of one individual at a time (idiographic approach).
    • Symbolic expression: psychological content emerges as symbolic responses; interpretation requires extensive training.
    • Examples discussed:
    • Rorschach inkblot test:
      • Administration takes about 45 minutes for the full set of blots.
      • Tasks involve interpreting who is in the image, what they are doing, what happened before, what might happen next.
      • Observers note details such as attention to people vs. background, and interpret potential personality implications.
      • The example prompts reflection on whether differing attentional focus (people vs. setting) indicates personality differences; empirical relevance remains a topic of debate.
    • Draw-A-Person test (DAP):
      • Often used with children; interpretations attempt to infer personality traits.
      • Guidelines for interpretation vary widely and can be contradictory.
      • Factors considered include placement on the page (center vs. quadrants), line quality (how crisp the lines are), and pencil pressure.
      • The interpretation is influenced by developmental stage and age; generalizability and accuracy are debated.
    • Practical note: the interpretive process is trained and subject to variability; projective data are integrated with other information to form a comprehensive understanding of the person.
  • Objective tests: characteristics and examples

    • Core features of objective tests:
    • Standardized items: fixed questions or statements with predetermined response scales (e.g., true/false, multiple choice).
    • Reliability: results should be consistent across time, settings, and persons (test–retest reliability, internal consistency, etc.).
    • Construct validity: the test should measure the intended construct.
    • Illustrative measures:
    • Big Five Inventory (BFI): a widely used personality inventory assessing broad trait dimensions.
    • Minnesota Multiphasic Personality Inventory (MMPI): a comprehensive instrument with a large item pool (e.g., >400 questions) and an extensive interpretive framework (scales/constellations) for various psychopathology and personality constructs.
      • Description in transcript: a huge kit with a booklet of questions and a large reference book; items load onto larger constructs; interpretation involves identifying which construct is most strongly represented.
    • A non-language, image-based response task (example discussed): presents stimuli as images; requires users to click one of two buttons; records response time across ~100 trials; interprets trait structure via reaction times rather than verbal responses.
    • Response formats and response processes in objective tests:
    • Likert-type scales: the most common response format (e.g., strongly disagree to strongly agree).
    • Visual Analogue Scale (VAS): e.g., a bar from 0 to 10 (or 0–100) used to rate intensity (commonly used in clinical settings for pain, but can be applied to other attributes).
    • Semantic differential scales: contrastive pairs (e.g., good–bad, active–passive) on a multi-point scale.
    • Forced-choice / other option formats: respondents choose from a set of statements that are similarly valenced.
    • Response biases and potential distortions:
    • Social desirability and impression management: respondents may present themselves in a more favorable light.
    • Extreme responding or response style biases: tendency to select extreme options or consistently agree/disagree.
    • Malingering: intentional faking of symptoms, particularly relevant in forensic or clinical settings.
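The response-style biases listed above can be made concrete with two crude descriptive indices. The sketch below is only an illustration: the function names, the 1 to 5 Likert coding, and the response data are all assumptions made up for this example, not an established scoring procedure from the lecture.

```python
def extreme_response_rate(responses, low=1, high=5):
    """Share of answers that sit on either endpoint of the scale."""
    return sum(r in (low, high) for r in responses) / len(responses)


def agreement_rate(responses, midpoint=3):
    """Share of answers above the scale midpoint (a rough acquiescence index)."""
    return sum(r > midpoint for r in responses) / len(responses)


# Hypothetical answers from two respondents to the same eight items
# (1 = strongly disagree, 5 = strongly agree). Values are invented.
respondent_a = [1, 5, 5, 1, 5, 5, 1, 5]   # uses only the endpoints
respondent_b = [3, 4, 2, 3, 4, 3, 2, 4]   # stays near the middle

extreme_a = extreme_response_rate(respondent_a)
extreme_b = extreme_response_rate(respondent_b)
agree_a = agreement_rate(respondent_a)
agree_b = agreement_rate(respondent_b)
```

A high endpoint share does not by itself show bias, since the respondent may genuinely hold strong views; the indices only flag patterns worth a closer look.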
    • Psychometric properties and validity considerations:
    • Construct validity: how well the measure reflects the intended construct; convergent vs. discriminant validity concepts.
      • Convergent validity: measures of related constructs should correlate positively, but not perfectly.
      • Discriminant validity: measures of distinct constructs should not correlate too strongly.
    • The challenge of correlating with similar but distinct constructs (e.g., neuroticism/negative affect vs. worry) where some overlap is expected, but they are not identical.
    • Inter-source agreement: differences between self-ratings and informant ratings; how strongly one source’s rating aligns with another’s can vary by construct and visibility.
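As a minimal sketch of the convergent and discriminant validity checks described above, the correlations can be computed directly. The scores are invented for illustration, the variable names are hypothetical, and a real validation study would use far larger samples and established instruments.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical scale totals for six respondents (made-up numbers).
neuroticism = [12, 18, 25, 30, 22, 15]
worry = [10, 20, 22, 28, 25, 12]          # related construct
extraversion = [20, 25, 18, 22, 26, 19]   # distinct construct

# Convergent check: related constructs should correlate clearly, but below 1.
convergent_r = pearson_r(neuroticism, worry)

# Discriminant check: distinct constructs should correlate only weakly.
discriminant_r = pearson_r(neuroticism, extraversion)
```

The same function could be applied to inter-source agreement by correlating self-ratings with informant ratings of the same trait.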
    • Reliability concerns:
      • Test–retest reliability (stability over time): important for traits that are relatively stable.
      • Split-half reliability: correlating two halves of a test; caveat: how items are split can affect the correlation; not always reliable as a sole indicator.
      • Internal consistency: degree to which items on a scale measure the same construct; commonly reported via Cronbach's alpha, denoted $\alpha$.
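The split-half caveat and Cronbach's alpha can be illustrated with a short sketch. It uses the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores), with sample variances. The response matrix is invented, and the function names are assumptions for this example rather than anything named in the lecture.

```python
from math import sqrt
from statistics import variance


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def cronbach_alpha(rows):
    """rows: one list of item scores per respondent, all of equal length."""
    k = len(rows[0])
    item_columns = list(zip(*rows))
    item_var_sum = sum(variance(col) for col in item_columns)
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_var_sum / total_var)


def split_half_r(rows, half_a, half_b):
    """Correlate the summed scores of two sets of item indices."""
    scores_a = [sum(row[i] for i in half_a) for row in rows]
    scores_b = [sum(row[i] for i in half_b) for row in rows]
    return pearson_r(scores_a, scores_b)


# Hypothetical data: five respondents, four items on a 1-5 scale.
rows = [
    [4, 5, 4, 5],
    [2, 2, 3, 2],
    [3, 4, 3, 3],
    [5, 4, 5, 5],
    [1, 2, 2, 1],
]

alpha = cronbach_alpha(rows)

# Two different ways of splitting the same four items.
odd_even_r = split_half_r(rows, [0, 2], [1, 3])
first_second_r = split_half_r(rows, [0, 1], [2, 3])
```

With these made-up data the two splits give different correlations, which is the caveat noted for split-half reliability. Test–retest reliability could be sketched the same way by passing time-1 and time-2 totals to `pearson_r`.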