VALIDITY

VALIDITY DEFINITIONS AND TYPES

  • Validity: Defined as the degree to which a procedure or assessment measures what it is intended to measure.

    • Types of validity include:

    • Internal Validity: Concerns the extent to which a study establishes a trustworthy cause-and-effect relationship.

    • External Validity: Refers to how applicable the findings of a study are in the real world, also known as generalizability.

    • Construct Validity: Refers to the degree to which a test measures what it claims to measure.

    • Statistical Validity: Pertains to the appropriateness of the statistical analyses used and the conclusions drawn.

  • Two Subjective Ways to Assess Validity:

    • Face Validity: The degree to which a procedure or assessment appears effective in terms of its stated aims.

    • Content Validity: The degree to which a procedure or assessment represents all facets of a given construct.

  • Three Empirical Ways to Assess Validity:

    • Reliability: Refers to the consistency of scores or responses across time, items, or raters.

      • Types of Reliability:

        • Test-Retest Reliability: Achieved when people receive consistent scores upon retaking a test.

        • Internal Consistency Reliability: Indicates consistent responses on every item of a questionnaire.

        • Interrater Reliability: Consistency among raters when evaluating the same set of subjects.
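
All three reliability types above are quantified with correlation coefficients. As a minimal, hypothetical sketch in Python (all scores are invented), test-retest reliability is the Pearson correlation between two administrations of the same test:

```python
# Hypothetical sketch: test-retest reliability as the Pearson correlation
# between scores from two administrations of the same test.
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

# Invented scores for five people who took the same test twice,
# one month apart.
time1 = [82, 75, 90, 68, 77]
time2 = [80, 78, 92, 65, 75]

r = pearson_r(time1, time2)
print(f"test-retest reliability r = {r:.2f}")
```

The same function applies to interrater reliability (two raters scoring the same subjects); internal consistency is likewise built from correlations among questionnaire items.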

    • Criterion Validity: The measure is correlated with a relevant behavioral outcome.

      • Predictive Validity: The degree to which a procedure or assessment predicts a future outcome.

      • Concurrent Validity: The degree to which a procedure or assessment aligns with another measure given at the same time.

    • Convergent Validity: The self-report measure is more strongly associated with self-report measures of similar constructs.

    • Discriminant Validity: The self-report measure is less strongly associated with self-report measures of dissimilar constructs.

FACE VALIDITY

  • Definition: Face validity refers to the degree to which a procedure or assessment appears to be effective in terms of its stated aims. It is a subjective and informal measure, serving as a good starting point but not comprehensive on its own.

  • Criticism: Face validity is subjective; high face validity does not guarantee that a measure is actually valid, and a measure with low face validity can still capture its construct well.

  • Example: Using a Body Mass Index (BMI) assessment to measure risk for heart disease may appear reasonable on its face, but that appearance alone does not establish how well BMI actually captures heart-disease risk.

CONTENT VALIDITY (LOGICAL VALIDITY)

  • Definition: The degree to which a measure covers all facets expected to be included based on the theoretical construct.

  • Essential Question: Does your measure logically cover everything it should?

  • Examples for Workplace Satisfaction Scale:

    1. On a scale of 1-10, how much do you like your job?

    2. On a scale of 1-10, how likely are you to leave your job in the next 6 months?

    3. Relative to other jobs you have held, is the management here better, worse, or about the same?

    • Possible facets to measure work satisfaction can include quality training, appropriate pay, collegiality, workplace safety, management style, commute, and benefits.

CRITERION VALIDITY

  • Definition: The degree to which a procedure or assessment performs against a separate set of criteria.

  • Types:

    • Predictive Criterion Validity: The degree to which an assessment predicts a future outcome.

    • Concurrent Criterion Validity: The degree to which an assessment aligns with another measure at the same time.

VALIDITY EXAMPLES

  • Predictive Validity Examples:

    • Whether SAT scores are indicative of future college success.

    • Study by Teramoto et al. (2018) examining the correlation between physical attributes and defensive performance in the NBA, which found significant positive correlations (r = 0.313–0.545).

    • The NFL Combine shows low predictive validity for game-day performance, with the exception of sprint tests for running backs.

  • Concurrent Validity Examples:

    • Leadership Aptitude Test (LAT), where results correlate with supervisors' assessments of leadership abilities.

    • Self-esteem tests comparing student self-ratings with teacher judgments; a new test is considered valid if the two measures correlate.

    • Nursing Competence Assessment (NCA), where the new tool’s results correlate with supervisor evaluations but possess low predictive validity.

CONVERGENT AND DISCRIMINANT VALIDITY

  • Convergent Validity: The extent to which different measures of the same construct correlate with each other, affirming their relationship.

  • Discriminant Validity: The extent to which measures of different constructs do not correlate, affirming their distinction.

  • Analogy: In a military context, members of the same unit cooperate with one another but not with the enemy; likewise, measures of the same construct should correlate with each other but not with measures of different constructs.

  • Importance: To establish good construct validity, both convergent and discriminant validity are required.

CONSTRUCT VALIDITY MEASUREMENT

  • Measurement Method: As with reliability, validity is assessed using correlations.

  • Subjective vs. Empirical Assessment: Face and content validity are evaluated informally, while predictive, concurrent, convergent, and discriminant validity require statistical (correlational) evaluation.

RELIABILITY COEFFICIENTS

  • Interpretation of Correlation Coefficients:

    • A value of 0.00 indicates no relationship.

    • Values of 0.01 to 0.24 suggest a weak relationship.

    • Values of 0.25 to 0.49 suggest a moderate relationship.

    • Values of 0.50 to 0.74 indicate a strong relationship.

    • Values of 0.75 to 0.99 indicate a very strong relationship, while 1.00 is perfect reliability.

  • Common Standards:

    • Interrater reliability: 0.85 or higher

    • Discriminant validity: correlations should stay low (below about 0.30)

    • Strong correlations: above 0.70

    • Weak correlations: below 0.50
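
The interpretation bands above can be captured in a small helper function; this Python sketch simply encodes the cutoffs listed in these notes:

```python
# Encode the correlation-strength bands from the notes above.

def interpret_r(r):
    """Label the strength of a correlation coefficient using the bands
    listed in these notes (applied to the absolute value of r)."""
    strength = abs(r)
    if strength == 0.0:
        return "no relationship"
    if strength < 0.25:
        return "weak"
    if strength < 0.50:
        return "moderate"
    if strength < 0.75:
        return "strong"
    if strength < 1.0:
        return "very strong"
    return "perfect"

print(interpret_r(0.40))   # moderate
print(interpret_r(-0.85))  # very strong (sign ignored)
```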

SCALES OF MEASUREMENT

  • Qualitative Variables:

    • Nominal: Categorical, e.g., color, gender.

  • Quantitative Variables:

    • Ordinal: Rank order matters; example: placing in a competition.

    • Interval: Both rank order and intervals matter; examples include temperature in Celsius/Fahrenheit.

    • Ratio: Includes all the features of an interval scale plus a true zero; examples include the number of vehicles owned.
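
A quick numeric illustration of why the interval/ratio distinction matters: ratio statements ("twice as much") are only meaningful on scales with a true zero. This Python sketch shows the same pair of temperatures giving different ratios on two interval scales:

```python
# Why ratio statements need a true zero: 20 °C is not "twice as hot"
# as 10 °C, because converting the same two temperatures to another
# interval scale (Fahrenheit) changes their ratio.
c1, c2 = 10.0, 20.0
f1, f2 = c1 * 9 / 5 + 32, c2 * 9 / 5 + 32  # 50 °F and 68 °F

print(c2 / c1)  # 2.0 on the Celsius scale
print(f2 / f1)  # 1.36 on the Fahrenheit scale -> ratio is scale-dependent

# On a ratio scale such as Kelvin (true zero), ratios are meaningful:
k1, k2 = c1 + 273.15, c2 + 273.15
print(k2 / k1)  # ~1.035: only ~3.5% more thermal "amount"
```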

SUMMARY OF VALIDITY TYPES AND IMPLICATIONS

  • Internal Validity: Ensures strong cause-and-effect relationship without confounding.

  • External Validity: Ensures results are generalizable to real-world settings.

  • Assessing Validity: Combining reliability measures with correlational assessments of validity supports the integrity of research measurements and strengthens the conclusions drawn from them.