Internal/External Validity Notes
Research Design and Statistics
Internal Validity
Definition: Internal validity is the degree to which the data in a study reflect a true cause-effect relationship.
A study exhibits strong internal validity when the dependent variable is influenced solely by the manipulation of the independent variable, with no confounding variables present.
Example: In a treatment study, researchers can claim that the observed positive outcomes in clients are due to the treatment alone, without interference from other variables.
Threats to Internal Validity:
Various factors can undermine the internal validity of a study. Common threats include:
Instrumentation
History
Statistical regression
Maturation
Attrition
Testing
Subject selection biases
Interaction of factors (Hegde, 2003; Schiavetti et al., 2011)
Instrumentation
Definition: Instrumentation refers to issues related to measurement tools, including mechanical and electrical instruments, pencil-and-paper tools (like questionnaires and tests), and human observers.
Mechanical Instruments: For example, an improperly calibrated audiometer can negatively impact validity in studies measuring hearing loss.
Standardized Tests: Tests that are standardized on one demographic may not be valid for another; for instance, using a test normed on monolingual English-speaking white children to evaluate Vietnamese refugee children can seriously compromise internal validity.
Human Observers: Studies that rely on observers require caution. Biased or inexperienced observers may yield invalid scores, particularly if they become familiar with the behavior only through the course of the study (e.g., observing glottal fry).
Control Measures: Frequent calibration of tools and adequate training for human observers are essential to maintain internal validity.
History
Definition: History encompasses external events that may affect the dependent variable after the independent variable has been introduced, potentially confounding results.
Example: A child undergoing speech therapy might also receive simultaneous treatment through the insertion of pressure-equalizing tubes, making it unclear whether improvements in speech were due to therapy or the tubes.
Long-term studies are particularly susceptible to this threat; employing control groups or treatment reversal can mitigate its effects.
Statistical Regression
Definition: Also called regression to the mean, statistical regression is the phenomenon where extreme measurements tend to return to average levels.
Example: Clients with voice concerns often seek treatment at their worst (e.g., significant hoarseness). Improvements measured post-treatment may be attributed to this regression, rather than the efficacy of the treatment itself.
Control groups and staggered treatments can help mitigate the effects of this regression, as can single-subject designs (SSDs).
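A quick simulation makes regression to the mean concrete (all numbers are hypothetical, not drawn from any study): if people seek treatment when a noisy measure is at its worst, their second measurement drifts back toward the average even with no treatment at all.

```python
import random

random.seed(42)

# Each person has a stable "true" severity plus day-to-day noise.
N = 10_000
true_severity = [random.gauss(50, 10) for _ in range(N)]
day1 = [t + random.gauss(0, 10) for t in true_severity]
day2 = [t + random.gauss(0, 10) for t in true_severity]

# Select the people who scored worst (highest severity) on day 1,
# mimicking clients who seek voice treatment at their worst.
worst = [i for i, s in enumerate(day1) if s > 70]

mean_day1 = sum(day1[i] for i in worst) / len(worst)
mean_day2 = sum(day2[i] for i in worst) / len(worst)

# With no treatment applied, the selected group's second measurement
# still moves back toward the population mean of 50.
print(f"Day 1 mean (selected group): {mean_day1:.1f}")
print(f"Day 2 mean (same group):     {mean_day2:.1f}")
```

The apparent "improvement" between day 1 and day 2 is produced entirely by selecting on extreme scores, which is exactly what a control group is there to expose.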
Maturation
Definition: Maturation refers to natural biological and psychological changes in participants over time that can influence the dependent variable.
Example: In a study observing language stimulation in kindergarteners over a year, observed improvements might partly stem from natural maturation rather than treatment effects.
As with other threats, using control groups and treatment reversals helps control maturation effects.
Attrition
Definition: Also known as mortality, attrition refers to losing participants during the course of a study, which can bias results.
Group designs that rely on averages are particularly vulnerable to attrition, especially if dropout rates differ significantly between groups.
Example: If a study on treatment effectiveness sees more severe cases drop from the experimental group than the control group, it may falsely suggest treatment efficacy.
Attrition poses less risk in SSDs as participants can typically be replaced without statistical complications.
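The attrition example above can be simulated (hypothetical numbers, no real treatment modeled): even when a treatment does nothing, losing the most severe cases from only the experimental group makes that group's posttest average look better than the control group's.

```python
import random

random.seed(0)

# Simulate a no-effect treatment: both groups are drawn from the same
# severity distribution (higher score = more severe).
experimental = [random.gauss(50, 10) for _ in range(500)]
control = [random.gauss(50, 10) for _ in range(500)]

# Differential attrition: the most severe cases drop out of the
# experimental group only; the control group stays intact.
experimental_completers = [s for s in experimental if s < 60]

mean_exp = sum(experimental_completers) / len(experimental_completers)
mean_ctl = sum(control) / len(control)

# The experimental completers now look less severe than the controls,
# falsely suggesting the (nonexistent) treatment worked.
print(f"Experimental completers mean: {mean_exp:.1f}")
print(f"Control mean:                 {mean_ctl:.1f}")
```

Comparing completers only is what creates the artifact; intention-to-treat analyses and tracking dropout rates per group are standard checks against it.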
Testing
Definition: Testing relates to changes in the dependent variable resulting from repeated measurement.
Example: Answers to attitude questionnaires may change simply due to the nature of being measured before and after treatment, leading to misleading conclusions about the treatment's effectiveness.
To minimize reactive testing effects, direct measurements of behavior rather than self-reports or questionnaires are recommended.
Subject Selection Biases
Definition: Subject selection biases occur when subjective factors influence participant selection in a study.
It is crucial that the differences noted between experimental and control groups on posttests are solely attributable to the treatment.
Randomly selected and assigned groups can address potential biases, while SSDs do not suffer from this concern due to the absence of group comparisons.
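Random assignment is the control measure named above; a minimal sketch of the idea (participant IDs are placeholders) is simply shuffling the pool and splitting it, so subjective factors cannot influence which group a participant lands in.

```python
import random

random.seed(7)

# Hypothetical participant pool; IDs are placeholders.
participants = [f"P{i:02d}" for i in range(1, 21)]

# Shuffle, then split in half: every participant has an equal chance
# of landing in either group, so group composition cannot be steered
# by the researcher's expectations.
random.shuffle(participants)
half = len(participants) // 2
experimental_group = participants[:half]
control_group = participants[half:]

print("Experimental:", experimental_group)
print("Control:     ", control_group)
```

Real trials typically add stratification or blocking on top of this, but the core protection against selection bias is the chance assignment itself.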
Interaction of Factors
Definition: Results can be confounded by interacting threats described earlier.
Example: A study assessing attitudes in spouses of patients with aphasia may be influenced by both testing effects and attrition.
Interpretation can be further complicated by the interplay between participant selection biases and other variables such as maturation.
External Validity
Definition: External validity refers to the generalizability of a study's results to other populations or contexts (Hegde, 2003).
These threats hinder generalizability even when internal validity is robust, so findings from one study cannot be assumed to apply universally.
Example: Results from a study on Spanish-speaking Mexican children cannot be generalized to Puerto Rican or Cuban children.
Threats to External Validity: Common threats include:
Hawthorne effect
Multiple-treatment interference
Reactive or interactive effects of pretesting
Hawthorne Effect
Definition: The Hawthorne effect refers to results being influenced by participants' awareness of being studied.
Individuals may try to respond in ways they believe the researchers expect, potentially skewing results.
Example: Stutterers who are involved in a research project may rate their treatment satisfaction higher out of a desire to please the experimenters, which calls the validity of the reported outcomes into question.
Multiple-Treatment Interference
Definition: This phenomenon describes how one treatment can alter the effect of another when multiple treatments are administered to the same participants.
Example: When clients who stutter engage in both counseling and a time-out technique, the resultant effect might be influenced by the sequence of therapies—a concern known as the order effect.
Generalizability may then be limited to those experiencing both treatments in that specific order, rather than either treatment alone.
Pretest and Posttest Sensitization to Treatment
Definition: Pre- and posttests can sensitize participants to the treatment, enhancing their receptiveness to it without overtly altering behavior.
Example: Participants undergoing voice therapy might complete a pretest questionnaire about their vocal abuse, heightening their receptiveness to change during the treatment phase. At posttest, their responses might reflect this heightened awareness rather than actual behavioral change.
As a result, findings may be applicable only to individuals exposed to similar pre- and post-measurements.
Levels of Evidence for Evidence-Based Practice
Concept: Evidence-based practice involves critically evaluating research evidence and integrating it with clinical expertise and client preferences for informed treatment decisions.
Evidence is often variable and spans a continuum.
Evidence categories in medical practice include:
Class I Evidence: Based on randomized group experimental designs (e.g., randomized clinical trials). This type of evidence is the strongest and relies on at least one large clinical trial featuring experimental and control groups.
Class II Evidence: Derived from well-designed comparative studies that do not utilize random selection or assignment. This lack of randomization means the groups may not be equivalent, so results could reflect pre-existing differences rather than treatment effectiveness.