Personality & Individual Differences Lec2
Psychological Measurement
How to measure personality?
How to ensure:
Meaningful characteristics are measured?
The measured characteristic is the one intended to be measured?
Aim: To make meaningful comparisons among people and calculate statistics (e.g., investigate relationships between variables).
No meaningful "zero" level.
No absolute amounts of a variable.
No use of ratios.
Well-designed personality measurements: Equal differences between scores ≈ equal differences in trait levels (with "≈" meaning "approximately equal").
1.1 Some Simple Statistical Ideas
1.1.1 Levels of Measurement (aka Scales of Measurement)
Nominal: Data can only be categorized.
Ordinal: Data can be ranked.
Interval: Data can be ranked and evenly spaced.
Ratio: Data can be ranked, evenly spaced, and have a natural zero.
Note: Data from personality scales (e.g., someone's neuroticism score) are somewhere between ordinal and interval level but are usually treated as if they were interval level in statistical analyses.
1.1.2 Standard Scores
To make meaningful comparisons between scores, "raw scores" are converted to standard scores (e.g., by subtracting the mean [M] from the score and then dividing the result by the standard deviation [SD]).
Example: Standardized IQ test scores with a M of 100 and SD of 15.
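The conversion described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the raw scores are made up, the function names `z_scores` and `to_iq_metric` are hypothetical, and the population SD is used for simplicity.

```python
from statistics import mean, pstdev

def z_scores(raw_scores):
    """Convert raw scores to z-scores: (score - M) / SD."""
    m = mean(raw_scores)
    sd = pstdev(raw_scores)  # population SD (assumption); statistics.stdev gives the sample SD
    return [(x - m) / sd for x in raw_scores]

def to_iq_metric(z, m=100, sd=15):
    """Rescale a z-score to a metric with M = 100 and SD = 15."""
    return m + sd * z

raw = [12, 15, 18, 21, 24]  # hypothetical raw test scores
z = z_scores(raw)
iq = [round(to_iq_metric(v), 1) for v in z]
print(iq)
```

A person whose raw score equals the sample mean gets z = 0 and therefore a standardized score of 100.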
1.1.3 Correlation Coefficients, r
Tell us how strongly two variables are related, and in which direction (positive or negative)!
Benchmarks for interpreting correlations in personality & individual differences research:
Large (or high or strong): .40 or larger (in absolute terms).
Moderate: Between .20 and .40 (positive or negative).
Small (or weak): Between -.20 and .20.
No correlation: r ≈ .00.
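As a rough sketch, a correlation can be computed by hand and labeled with the benchmarks listed above. The data and the function names `pearson_r` and `benchmark_label` are hypothetical. Because the notes give no numeric cutoff for "r ≈ .00", the sketch does not label that case separately.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

def benchmark_label(r):
    """Label the size of r using the benchmarks from these notes (sign is ignored)."""
    size = abs(r)
    if size >= 0.40:
        return "large"
    if size >= 0.20:
        return "moderate"
    return "small"

x = [1, 2, 3, 4, 5]  # hypothetical scores on variable 1
y = [2, 1, 4, 3, 5]  # hypothetical scores on variable 2
r = pearson_r(x, y)
print(round(r, 2), benchmark_label(r))
```

The label depends only on the absolute value, so r = -.45 counts as large just as r = .45 does.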
1.1.4 Sample Representativeness & Sample Size
Sample Representativeness
Samples should be reasonably representative of the population that the researcher wants to learn about.
Potential problems if samples:
Consist only of psychology undergraduate students.
Are "WEIRD" (= from Western, Educated, Industrialized, Rich, Democratic societies).
Show restricted variance (correlations are standardized covariances, so restricted variance → restricted correlations!).
Sample Size
Correlations from samples of ≥ 250 people are usually close to the population correlation (Schönbrodt & Perugini, 2013).
The larger the sample, the greater the statistical power to obtain statistically significant results.
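The point under Sample Representativeness that restricted variance leads to restricted correlations can be illustrated with a small, fully made-up data set. In this sketch the "noise" pattern is fixed rather than random so that the result is reproducible, and all names and values are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

# Hypothetical full sample: trait scores 1..20, outcome = trait plus a fixed +/-3 deviation
trait = list(range(1, 21))
outcome = [t + (3 if t % 2 == 0 else -3) for t in trait]
r_full = pearson_r(trait, outcome)

# Restricted sample: only people scoring 16 or higher on the trait
pairs = [(t, o) for t, o in zip(trait, outcome) if t >= 16]
r_restricted = pearson_r([t for t, _ in pairs], [o for _, o in pairs])

print(round(r_full, 2), round(r_restricted, 2))
```

With these numbers the correlation drops from about .90 in the full range to about .43 in the restricted range, even though the relation between trait and outcome is the same in both.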
1.2 Assessing the Quality of Measurement: Reliability and Validity
1.2.1 Reliability
The extent to which a measure produces consistent results.
Does the obtained score represent the "true level" of the construct being measured?
1.2.1.1 Internal-Consistency Reliability
The extent to which the items of a measure are correlated with one another.
Cronbach's alpha (α): values ≥ .70 are usually considered acceptable.
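The notes do not give the formula for Cronbach's alpha. The sketch below uses the standard one, α = k / (k − 1) × (1 − sum of item variances / variance of total scores), where k is the number of items. The item responses and the function name `cronbach_alpha` are hypothetical.

```python
from statistics import pvariance

def cronbach_alpha(items):
    """Cronbach's alpha for a list of items, each a list of responses
    from the same respondents in the same order."""
    k = len(items)
    totals = [sum(person) for person in zip(*items)]  # total score per respondent
    item_variance_sum = sum(pvariance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / pvariance(totals))

# Hypothetical responses of 5 people to 3 items on a 1-5 response scale
item1 = [1, 2, 3, 4, 5]
item2 = [2, 2, 3, 5, 5]
item3 = [1, 3, 3, 4, 4]

alpha = cronbach_alpha([item1, item2, item3])
print(round(alpha, 2))
```

The three made-up items rise and fall together across respondents, so alpha comes out well above the .70 benchmark.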
1.2.1.2 Interrater (Interobserver) Reliability
The extent of consistency between the scores of different raters/observers.
1.2.1.3 Test-Retest Reliability
The extent of consistency between scores across different measurement occasions (e.g., now and 1 year later).
1.2.2 Validity
The extent to which a test measures what it claims to measure.
1.2.2.1 Content Validity
The extent to which a measure assesses all relevant features of the construct and does not assess irrelevant features.
1.2.2.2 Construct Validity: Convergent & Discriminant
The measure assesses the same construct that it is intended to assess.
Convergent validity: Correspondence with measures assessing similar (positive relations) or opposite (negative relations) characteristics.
Discriminant validity: Weak or no correspondence with measures assessing characteristics unrelated to the one the measure is intended to assess.
1.2.2.3 Criterion Validity
Relations with relevant outcome variables; also called predictive validity.
1.3 Methods of Measurement
1.3.1 Self-Reports
Structured questionnaires.
Every person/participant is asked the same set of questions or items.
There is a fixed set of response alternatives for every item (note the difference between scale and response scale).
Most widely used method of measuring personality.
Most personality inventories (or questionnaires or scales) assess several personality traits.
Each trait is assessed with several items, allowing for good reliability and content validity.
Many researchers recommend including items that suggest the opposite of the trait (so-called reverse-scored items [R]) → balancing out the tendency to agree or disagree with statements (acquiescence).
Pros:
Efficient, low cost.
Mostly accurate if people know their behaviors, thoughts, and feelings.
Extremely valuable as people usually know themselves very well, and sometimes are the only ones who know (Baldwin, 2000).
Cons:
Can be easily "faked" or distorted (e.g., when applying for a job) → socially desirable responding (very difficult to control!).
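Scoring a scale that contains reverse-scored items [R] can be sketched as follows. This assumes a 1-5 response scale and scores the scale as the mean of its items. The item responses, the set of reverse-keyed positions, and the function names are all hypothetical.

```python
def reverse_score(response, scale_min=1, scale_max=5):
    """Flip a response on the response scale (e.g., 1 <-> 5, 2 <-> 4 on a 1-5 scale)."""
    return scale_min + scale_max - response

def scale_score(responses, reverse_keyed, scale_min=1, scale_max=5):
    """Mean item score after flipping the reverse-keyed items.
    reverse_keyed holds the 0-based positions of the [R] items."""
    scored = [
        reverse_score(r, scale_min, scale_max) if i in reverse_keyed else r
        for i, r in enumerate(responses)
    ]
    return sum(scored) / len(scored)

reverse_keyed = {1, 3}            # hypothetical: the 2nd and 4th items are [R]
high_on_trait = [5, 1, 4, 2]      # agrees with regular items, disagrees with [R] items
agrees_with_all = [5, 5, 5, 5]    # pure acquiescence

print(scale_score(high_on_trait, reverse_keyed))
print(scale_score(agrees_with_all, reverse_keyed))
```

A respondent who simply agrees with every statement ends up at the midpoint of the scale rather than at the top, which is how reverse-scored items balance out acquiescence.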
1.3.2 Observer Reports
Analogous to self-reports, but someone else provides the information about the "target" person.
The observer can be a spouse, a parent, a friend, a colleague, a classmate, etc., but should know the "target" fairly well.
Pros: Might be more objective (i.e., less biased) → "Others (sometimes) know us better than we know ourselves" (Vazire & Carlson, 2011).
Cons: Some aspects of personality might never really be observed; observations are done in a limited range of contexts.
1.3.3 Direct Observations
Directly observing a person's behavior.
Frequency and intensity of behaviors that indicate a certain trait.
In the person's natural habitat or in an artificial setting (e.g., lab).
Can be (very) informative.
Cons: Time-consuming, expensive, and effortful; observations also need to be aggregated (over multiple indicators, times, situations) if they are meant to capture personality traits!
1.3.4 Biodata (Life Outcome Data)
Life outcome data: records of a person's life relevant to an individual's personality.
e.g., phone bills, speeding tickets, grade point average, sales records, diplomas, income, and death.
Objective behavioral indicators.
Cons: Not clear what information is relevant or accurate as an indicator for the personality trait of interest.
2.1 The Idea of a Personality Trait
Conceptual Definition (Ashton, 2018, p. 29): "A personality trait refers to differences among individuals in a typical tendency to behave, think, or feel in some conceptually related ways, across a variety of relevant situations, and across some fairly long period of time."
Differences Among Individuals: A personality description is a comparison with other people.
Typical Tendency to Behave, Think, or Feel: Likelihood of showing some behaviors or having some thoughts or feelings.
In Some Conceptually Related Ways: Traits are expressed by various behaviors, thoughts, and feelings that appear to have some common psychological element.
Across a Variety of Relevant Situations: Not in just one specific situation, but consistency across a variety of situations and settings that are relevant.
Over Some Fairly Long Period of Time: Relatively stable pattern that can be observed over the long run.
2.3 Do Personality Traits Exist?
Hartshorne & May (1928):
Investigated 11,000 children for the consistency in their "moral character" (altruism, self-control, honesty).
Observed their behavior in a variety of situations, e.g., donation to charity, cheating on a test.
Result: Children displayed little consistency between any two behaviors (rs ≈ .20).
Mischel (1968):
Individual differences in behavior depend on the specific situation.
Also Mischel and Peake (1982): Conscientiousness depends very strongly on the situation.
Claim: "Personality traits are of limited value for predicting behavior."
Counterargument: This claim fails to notice the cross-situational consistency that emerges when observations are aggregated across many situations.
Correlations between two sets of several behaviors are much higher (rs > .50) (Jackson & Paunonen, 1985).
Personality is reflected in overall, typical behavior as observed across many different situations
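Why aggregation raises correlations can be sketched with the Spearman-Brown formula. This is an idealized illustration, not the analysis reported by Jackson and Paunonen (1985): it assumes that all single behaviors have equal variances and that every pair of behaviors correlates at the same value. The function name `aggregate_correlation` is hypothetical.

```python
def aggregate_correlation(r_single, k):
    """Expected correlation between two aggregates of k behaviors each,
    assuming every pair of single behaviors correlates r_single
    (Spearman-Brown formula)."""
    return k * r_single / (1 + (k - 1) * r_single)

# Single behaviors correlating about .20, as in Hartshorne & May (1928)
for k in (1, 2, 4, 5, 10):
    print(k, round(aggregate_correlation(0.20, k), 2))
```

Under these assumptions, two sets of only four or five behaviors already correlate at about .50 or higher, although any two single behaviors correlate at only .20.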
2.4 Structured Personality Inventories
Some Widely Used Personality Inventories:
The California Psychological Inventory (CPI):
Over 400 items; various psychological characteristics, "everyday variables."
Based on the Minnesota Multiphasic Personality Inventory (MMPI), which was intended to measure mental illnesses.
The Eysenck Personality Questionnaire (EPQ):
Three basic dimensions of personality.
Biological basis of personality.
The Temperament and Character Inventory (TCI):
Developed by Cloninger and colleagues.
Basic biological dimensions of temperament and additional character dimensions.
The Myers-Briggs Type Indicator:
Very popular in business and assessment center settings.
Cons:
Very crude measure: assigns people to 1 of 16 personality types instead of providing personality scores.
Not a scientifically sound instrument in theory and methods.
Very limited reliability and validity, if any.
Big Five Framework: 5 major dimensions:
Neuroticism
Extraversion
Openness to Experience
Agreeableness
Conscientiousness
The Big Five Inventory (BFI): 44 items
The NEO Five-Factor Inventory (NEO-FFI) and the NEO Personality Inventory Revised (NEO-PI-R): 60 and 240 items!
The HEXACO Personality Inventory Revised (HEXACO-PI-R):
Three versions: 200, 100, or 60 items
6 dimensions:
Honesty-Humility
Emotionality
eXtraversion
Agreeableness (vs Anger)
Conscientiousness
Openness to Experience
2.5 Strategies of Personality Inventory Construction
2.5.1 The Empirical Strategy
Collect a large pool of items that show empirical relationships with the trait the researcher is interested in (e.g., femininity-masculinity, "I like to eat red meat" [R]).
2.5.2 The Factor Analytic Strategy
Collect a large pool of items, subject them to factor analyses, and find "groups" of items that measure different traits (cf. the lexical approach that gave us the "Big 5").
2.5.3 The Rational Strategy
Write items specifically for the purpose of assessing each trait, based on how the researcher, theory, and research conceptualize the trait (e.g., the Multidimensional Perfectionism Scale: self-oriented, socially prescribed & other-oriented perfectionism; Hewitt & Flett, 1991).
2.6 Self- & Observer Reports on Personality Inventory Scales
Combined Use of Self- & Observer Reports
Obtain self-reports from a sample of "target" persons as well as observer reports about the same "target" persons from others
High agreement between self- & observer reports provides support for the construct validity of the scale
NEO-PI-R: correlations of about .60 (with spouses as observers) and .40 (with friends or neighbors as observers)
HEXACO-PI-R: correlations from .40 to .60 in a sample of over 600 college students (Lee & Ashton, 2013)
Convergent validity of the scales
TEST: compare observer reports from multiple, unacquainted observers from different contexts (Funder et al., 1995)
Kolar et al. (1996)
"People know themselves better than anyone else knows them" versus "Others know us better than we know ourselves"
Both self- and observer reports showed validity for predicting behavior
Single observer reports were slightly better
Accuracy increased when averaging across observers
Vazire (2010); Vazire and Carlson (2011)
Gaps in our self-knowledge
Blind spots due to lack or overload of information
Biases in self-perception
"Others sometimes know us better than we know ourselves"
Accuracy depends on which types of traits are considered
Self- and observer-reports capture different aspects of personality.
Self-Other Knowledge Asymmetry model (SOKA model)
SOKA Model (Vazire, 2010)
Observability
"Internal" traits: low observability; primarily thoughts and feelings (e.g., anxious, self-esteem)
"External" traits: high observability; primarily overt behavior (e.g., charming, talkative)
Evaluativeness
Highly evaluative traits: more biases in self-reports (e.g., intelligent, rude)
Self- & observer reports show fairly high levels of agreement
People provide fairly accurate descriptions of their own and others' personalities
Self- & observer reports can predict behavior with moderate levels of validity
LIMITATION of self- & observer reports: BIASES
Socially desirable responses and socially undesirable responses in both self- & observer reports
BUT: the more sources of information, the less bias
Psychological Measurement: Aims to make meaningful comparisons among people and calculate statistics.
Measures personality by ensuring meaningful characteristics are measured and that the measured characteristic is the one intended to be measured.
Levels of Measurement:
Nominal: Data categorized.
Ordinal: Data ranked.
Interval: Data ranked and evenly spaced.
Ratio: Data ranked, evenly spaced, with a natural zero.
Personality scales are usually treated as interval level in statistical analyses.
Standard Scores: Convert raw scores to standard scores for meaningful comparisons.
Correlation Coefficients (r): Measure the strength and direction of the relationship between two variables.
Large: .40 or larger.
Moderate: Between .20 and .40.
Small: Between -.20 and .20.
No correlation: r ≈ .00.
Sample Representativeness & Size:
Samples should represent the population.
Larger samples provide greater statistical power.
Reliability: Consistency of a measure.
Internal-Consistency: Items of a measure correlate with each other (Cronbach's alpha ≥ .70).
Interrater: Consistency between different raters/observers.
Test-Retest: Consistency between scores across different measurement occasions.
Validity: The extent a test measures what it claims to measure.
Content: Measures all relevant features of a construct.
Construct: Measures the intended construct.
Convergent: Corresponds with similar measures.
Discriminant: Doesn't correspond with unrelated measures.
Criterion: Relations with relevant outcome variables.
Methods of Measurement:
Self-Reports: Questionnaires where individuals answer questions about themselves.
Pros: Efficient, low cost.
Cons: Can be faked or distorted.
Observer Reports: Others provide information about the target person.
Pros: More objective.
Cons: Limited observation range.
Direct Observations: Observing a person's behavior directly.
Cons: Time-consuming, expensive.
Biodata: Using life outcome data as indicators (e.g., records, tickets).
Cons: Relevance may not be clear.
Personality Trait: Differences among individuals in typical behavior, thoughts, or feelings across situations and time.
Do Personality Traits Exist?:
Hartshorne & May (1928): Investigated consistency in children's moral character. Found little consistency between behaviors (rs ≈ .20).
Mischel (1968): Behavior depends on the specific situation.
Jackson & Paunonen (1985): Correlations between sets of behaviors are higher when aggregating observations (rs > .50).
Structured Personality Inventories:
CPI: Measures various psychological characteristics.
EPQ: Three basic dimensions of personality.
TCI: Biological and character dimensions.
Myers-Briggs: Assigns people to 1 of 16 types (limited reliability and validity).
Big Five Framework: Neuroticism, Extraversion, Openness, Agreeableness, Conscientiousness.
HEXACO: Honesty-Humility, Emotionality, Extraversion, Agreeableness, Conscientiousness, Openness.
Strategies of Personality Inventory Construction:
Empirical: Collect items with empirical relationships to the trait.
Factor Analytic: Use factor analyses to find groups of items measuring different traits.
Rational: Write items based on theory and research.
Self- & Observer Reports:
High agreement supports construct validity.
Kolar et al. (1996): Both self- and observer reports predict behavior.
Vazire (2010): Self-knowledge gaps; accuracy depends on traits.
SOKA Model (Vazire, 2010):
Observability: Internal vs. external traits.
Evaluativeness: Biases in self-reports.
Limitations: Biases in self- and observer reports, but more sources of information reduce bias.
Exam questions: These may include scenarios that test understanding of internal versus external traits, ask how bias in self-reports affects the accuracy of personality assessments, and ask about the importance of using multiple sources of information to mitigate these biases.
What makes for a good scale?
A good scale is both reliable and valid. It consistently produces similar results (reliability) and accurately measures what it's intended to measure (validity).
Reliability
-Internal-Consistency Reliability: The extent to which the items of a measure are correlated with one another.
-Interrater (Interobserver) Reliability: The extent of consistency between the scores of different raters/observers.
-Test-Retest Reliability: The extent of consistency between scores across different measurement occasions.
Validity
-Content Validity: The extent to which a measure assesses all relevant features of the construct and does not assess irrelevant features.
-Construct Validity: The measure assesses the same construct that it is intended to assess.
-Criterion Validity: Relations with relevant outcome variables; also called predictive validity.
Reverse-Coding
-Reverse-coding involves including items that are worded in the opposite direction of the construct being measured. This is done to balance out the tendency to agree or disagree with statements (acquiescence).
What is a self-report scale?
A self-report scale is a method of measurement where individuals answer questions about themselves to assess their personality, behaviors, thoughts, or feelings.
What is meant by "observer report"?
An observer report is a method of measurement where someone else provides information about the target person. This observer should know the target person fairly well.
Under what conditions would each measurement approach be more valid?
-Self-Reports: More valid when assessing internal states, feelings, and attitudes that are not easily observable by others.
-Observer Reports: More valid when assessing external behaviors and traits that are easily observable. They can also provide a more objective perspective by reducing biases in self-perception.
-Direct Observations: Most valid when measuring specific behaviors in natural or controlled settings, especially when the behavior can be quantified.
-**Biod