Measurement in Scientific Research

Exam Performance and Its Relation to Measurement

  • Individuals often experience discrepancies between their preparation for exams and their actual performance.

    • It's common to perform well on an exam without complete preparation if the questions align with known material.

    • Conversely, one may struggle even with diligent preparation if the exam covers unfamiliar topics.

  • These experiences highlight complexities of measurement that are not limited to education but are equally relevant to scientific research.

    • Ensuring accurate measurements is critical in research to avoid misleading results.

Measurement Concepts: Reliability and Validity

  • Reliability: Refers to consistency in measurements, ensuring repeatability of scores.

  • Validity: Refers to accuracy; whether the measure is assessing what it intends to measure.

Analogy of Reliability and Validity Using Archery

  • Shooting at a bull's eye analogy helps to differentiate the concepts.

    • The green center of the bull's eye symbolizes the construct being measured.

    • Closeness to the bull's eye reflects validity; the tightness of the grouping reflects reliability (a wider spread means less consistency).

Target 1: High Reliability, Low Validity
  • Example: Shooting five arrows that land in the upper left but far from the bull's eye.

    • Conclusion: Reliable (consistency in scores) but not valid (not measuring the intended construct).

Target 2: Low Reliability, Improved Validity
  • Shooting five arrows that are more spread out but closer to the bull's eye.

    • Result: Validity improves due to proximity to the target, but reliability decreases due to a broader spread.

Target 3: Poor Reliability and Validity
  • Example: Shooting five arrows where only one hits the bull's eye and the others miss completely.

    • Conclusion: Neither valid nor reliable measurement.

Target 4: High Reliability and Validity
  • Shooting five arrows, all hitting the bull's eye consistently.

    • Conclusion: Both valid and reliable results achieved.

Relationship Between Reliability and Validity

  • If a measure is valid, it is also reliable by definition.

  • Conversely, a measure can be reliable without being valid (e.g., Target 1).

  • Importance of ensuring both validity and reliability in research for credible conclusions.

Types of Reliability

  • Three types of reliability are examined, each essential for sound measurement.

1. Inter-rater Reliability

  • Defined as the degree of agreement among judges or raters.

    • Multiple judges rating performances need to agree for high reliability.

    • Example: Four judges score a singer as 8, 8.5, 8.5, and 8, leading to high inter-rater reliability.

    • Lack of training for judges can lead to variability in scoring.

Factors Affecting Inter-rater Reliability
  • Training judges, including clarifying which aspects of performance to evaluate and discussing practice scores together, promotes consistency and reduces bias. (A small computational sketch follows.)
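
To make the agreement idea concrete, here is a minimal sketch in Python with hypothetical judge scores (not from the lecture); it correlates two judges' ratings across a set of performances, one common way to quantify inter-rater reliability:

```python
import numpy as np

# Hypothetical ratings: two judges scoring the same ten performances.
judge_a = np.array([8.0, 6.5, 9.0, 7.0, 5.5, 8.5, 7.5, 6.0, 9.5, 7.0])
judge_b = np.array([8.5, 6.0, 9.0, 7.5, 5.0, 8.0, 8.0, 6.5, 9.0, 7.5])

# Pearson correlation between the two sets of ratings:
# values near 1.0 indicate the judges rank the performances similarly.
r = np.corrcoef(judge_a, judge_b)[0, 1]
print(f"Inter-rater correlation: {r:.2f}")
```

With more than two raters, an intraclass correlation is often used instead, but the pairwise correlation above captures the core idea.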

2. Test-Retest Reliability

  • Refers to the measure's stability over time, ensuring consistent results when tested repeatedly.

    • Example: Contestants singing multiple times to obtain accurate scoring of their talent.

  • Good stability is indicated when scores are consistent across performances (e.g., scores of 7, 7.5, and 6.5); a computational sketch follows below.

Implications on Constructs
  • Constructs like mood may yield poorer test-retest reliability due to natural fluctuations, while stable constructs like intelligence should correlate well over time.
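
As a rough illustration with hypothetical data (again not from the lecture), test-retest reliability is typically estimated by correlating scores from two administrations of the same measure:

```python
import numpy as np

# Hypothetical scores for eight contestants measured on two occasions.
# A stable construct (e.g., vocal skill) should correlate highly across time;
# a fluctuating construct (e.g., mood) typically will not.
first_occasion  = np.array([7.0, 7.5, 6.5, 8.0, 5.5, 9.0, 6.0, 7.5])
second_occasion = np.array([7.5, 7.0, 6.5, 8.5, 6.0, 8.5, 6.5, 7.0])

# A correlation near 1.0 indicates good stability over time.
r = np.corrcoef(first_occasion, second_occasion)[0, 1]
print(f"Test-retest correlation: {r:.2f}")
```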

3. Internal Consistency

  • Describes consistency across items within a measure that are meant to assess the same construct.

    • Internal consistency can be shown with multiple items that examine social anxiety or aggression in various ways.

  • Strong internal consistency is indicated when items yield similar scores, and is assessed with statistics such as Cronbach's alpha.

Assessing Internal Consistency
  • Split-half technique involves correlating the first half of the items with the second half.

  • Cronbach's alpha is effectively the average across all possible split-half correlations; a value of 0.80 or higher is generally taken to indicate good reliability. (A sketch of the computation follows.)
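
A minimal Python sketch of the standard Cronbach's alpha formula, using made-up item responses (the data and the 1-5 scale are assumptions for illustration only):

```python
import numpy as np

def cronbach_alpha(item_scores):
    """Cronbach's alpha for a respondents-by-items matrix of scores."""
    items = np.asarray(item_scores, dtype=float)
    k = items.shape[1]                               # number of items
    item_variances = items.var(axis=0, ddof=1)       # variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)   # variance of total scores
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Hypothetical responses: six respondents answering four anxiety items (1-5 scale).
responses = [[4, 5, 4, 4],
             [2, 2, 3, 2],
             [5, 5, 4, 5],
             [3, 3, 3, 2],
             [1, 2, 1, 2],
             [4, 4, 5, 4]]

# Values of roughly 0.80 or higher are usually taken as good internal consistency.
print(f"Cronbach's alpha: {cronbach_alpha(responses):.2f}")
```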

Improving Measurement Reliability

  • Strategies for enhancing reliability in research tools:

Inter-rater Reliability

  • Ensuring clear and comprehensive evaluation protocols for raters.

  • Conducting mock sessions for training helps refine consistency among judges.

Test Stability and Internal Consistency

  • Increasing the number of items improves reliability; more questions lessen the impact of any single response (see the sketch after this list).

  • Clarity in questions aids understanding, ultimately supporting reliability across measures.
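
The benefit of adding items is often quantified with the Spearman-Brown prediction formula, which is not named in these notes but formalizes the point above; a minimal sketch:

```python
def spearman_brown(reliability, lengthening_factor):
    """Predicted reliability after lengthening a test by the given factor."""
    n, r = lengthening_factor, reliability
    return (n * r) / (1 + (n - 1) * r)

# Example: doubling the number of items on a scale with reliability 0.70
# is predicted to raise reliability to about 0.82.
print(f"{spearman_brown(0.70, 2):.2f}")
```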

Sample Size Impact

  • Increasing the sample size can make reliability estimates more stable, though recruitment must be handled thoughtfully to avoid introducing bias.

  • Ideally, the sample remains consistent with the target demographic the measure is intended for.

Validity of Measures

  • Validity stands as a critical aspect of measurement accuracy, focusing on how well a measure reflects the intended variable.

Types of Validity

  • Five types generally examined:

1. Face Validity
  • The degree to which a measure appears to assess its intended construct.

    • Example: A self-esteem questionnaire whose items clearly relate to self-esteem has face validity.

2. Content Validity
  • The completeness of coverage across the construct of interest.

    • Example: A measure of athletic ability should sample a range of athletic behaviors rather than a single skill.

3. Predictive Validity
  • Evaluates how well scores on the measure correlate with a relevant future outcome.

  • SAT scores forecasting college GPAs illustrate predictive validity.

4. Convergent Validity
  • Requires a reasonably high correlation with other measures of the same or similar constructs, such as different methods of measuring anxiety.

5. Discriminant Validity
  • Requires a low correlation with measures of constructs it is not supposed to relate to, demonstrating the measure's specificity.

Validity Assessment Techniques

  • As with reliability, validity is usually assessed with correlations, most commonly Pearson's product-moment correlation, for predictive, convergent, and discriminant validity; face and content validity are typically judged qualitatively. (A brief sketch follows.)
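
A minimal Python sketch of this correlational approach, using hypothetical scores (the measures and values below are invented for illustration): a new anxiety measure is compared with an established anxiety measure (convergent validity) and with an unrelated extraversion measure (discriminant validity).

```python
from scipy.stats import pearsonr

# Hypothetical scores for ten participants.
new_anxiety  = [12, 18, 9, 22, 15, 7, 20, 14, 11, 17]   # new measure
old_anxiety  = [14, 20, 10, 21, 16, 8, 19, 15, 10, 18]  # established anxiety measure
extraversion = [30, 29, 25, 28, 21, 26, 24, 32, 31, 24] # unrelated construct

r_convergent, _   = pearsonr(new_anxiety, old_anxiety)   # should be high
r_discriminant, _ = pearsonr(new_anxiety, extraversion)  # should be near zero
print(f"Convergent validity:   r = {r_convergent:.2f}")
print(f"Discriminant validity: r = {r_discriminant:.2f}")
```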