book 3
Reliability and Statistical Power
Definition of Statistical Power: The probability of detecting a true difference of a given size between experimental groups (that is, of correctly rejecting a false null hypothesis) at a given sample size.
Impact of Reliability on Statistical Power:
More reliable scales increase statistical power for a given sample size.
Improved reliability allows a smaller sample size to yield equivalent power compared to less reliable measures.
Sample Size Considerations:
Detecting a difference of a given size between experimental groups with a specified degree of confidence requires a minimum sample size.
Increased sample size generally increases power.
More reliable measurement and larger samples both reduce error, which increases power.
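The link between reliability, sample size, and power can be sketched numerically. The example below is a minimal illustration rather than a method taken from the source. It assumes a two-group comparison with equal group sizes, a normal approximation to the test, and the classical test theory result that unreliability shrinks a standardized effect by the square root of the reliability. The function name, the true effect of 0.5, and the reliabilities of .60 and .90 are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(n_per_group, true_effect, reliability, alpha=0.05):
    """Approximate two-sided power for comparing two group means.

    Assumptions (illustrative only): equal group sizes, a normal
    approximation, and an observed standardized effect equal to
    true_effect * sqrt(reliability).
    """
    z = NormalDist()
    observed_effect = true_effect * sqrt(reliability)
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = observed_effect * sqrt(n_per_group / 2)
    # Probability of landing in either rejection region.
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)

# Hypothetical comparison: same sample size, different reliabilities.
for rel in (0.60, 0.90):
    print(rel, round(approx_power(64, 0.5, rel), 3))
```

Under these assumptions, raising reliability at a fixed sample size and raising sample size at a fixed reliability both move power in the same direction.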
Factors Affecting Power Gains from Reliability:
Initial sample size.
Probability level set for a Type I error (alpha).
Size of the effect to be detected (e.g., the difference between group means).
Proportion of error variance attributed to unreliability versus other sources.
Example:
With the Type I error probability set at .01, a 10-point difference between two means as the effect of interest, and an error variance of 100:
Sample size must increase from 128 to 172 (about 34%) to raise power from .80 to .90.
Reducing error variance from 100 to 75 (a 25% reduction) yields essentially the same power increase without changing the sample size.
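The near equivalence of the two routes in this example can be checked with simple arithmetic. The sketch below assumes that, for a fixed mean difference, the test depends on sample size and error variance only through their ratio; the function name is hypothetical, and the sketch does not attempt to reproduce the power values of .80 and .90, which depend on the specific test used in the source.

```python
def equivalent_sample_size(n, var_before, var_after):
    """Sample size at the old error variance that matches n subjects at the new one.

    Assumes sample size and error variance matter only through the
    ratio n / variance (an assumption of this sketch).
    """
    return n * var_before / var_after

print(round(equivalent_sample_size(128, 100, 75), 1))  # 170.7
```

Under this assumption, 128 subjects measured with an error variance of 75 carry about as much information as 171 subjects measured with an error variance of 100, which is close to the 172 in the example.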
Correlation and Scale Reliability:
For a sample size of 50, two scales with reliabilities of .38 whose observed correlation is r = .24 barely reach significance at p < .10. Raising both reliabilities to .90 would make the correlation significant at p < .01. With the reliabilities left at .38, reaching p < .01 would require a sample more than twice as large.
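The reasoning behind this example can be sketched with the classical correction for attenuation, in which the observed correlation equals the true-score correlation multiplied by the square root of the product of the two reliabilities. The code below is an illustration under that assumption and uses the Fisher z normal approximation for p-values, which may differ slightly from the test used in the source. All function names are hypothetical.

```python
from math import atanh, sqrt
from statistics import NormalDist

def disattenuated(r_observed, rel_x, rel_y):
    """Estimated correlation between true scores (correction for attenuation)."""
    return r_observed / sqrt(rel_x * rel_y)

def attenuated(r_true, rel_x, rel_y):
    """Expected observed correlation given the two scale reliabilities."""
    return r_true * sqrt(rel_x * rel_y)

def approx_p_value(r, n):
    """Two-tailed p-value for H0: rho = 0, using the Fisher z approximation."""
    z_stat = atanh(r) * sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z_stat)))

r_true = disattenuated(0.24, 0.38, 0.38)      # about .63
r_improved = attenuated(r_true, 0.90, 0.90)   # about .57
print(round(r_true, 2), round(r_improved, 2))
print(approx_p_value(0.24, 50) < 0.10)        # True
print(approx_p_value(r_improved, 50) < 0.01)  # True

# Smallest n at which r = .24 reaches p < .01 under this approximation.
n_needed = 50
while approx_p_value(0.24, n_needed) >= 0.01:
    n_needed += 1
print(n_needed)
```

With the reliabilities unchanged, the loop stops at a sample size more than twice the original 50, consistent with the example.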
Enhancing Reliability:
Reliability can be increased by adding items or by raising the average correlation among the items.
More items or better item quality can enhance power similarly to larger sample sizes.
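The effect of item count and inter-item correlation on reliability can be illustrated with the standardized form of coefficient alpha, alpha = k * r / (1 + (k - 1) * r), where k is the number of items and r is the average inter-item correlation. The sketch below is illustrative only; the function name and the item counts and correlations are hypothetical values.

```python
def standardized_alpha(n_items, mean_r):
    """Standardized coefficient alpha from item count and mean inter-item correlation."""
    return n_items * mean_r / (1 + (n_items - 1) * mean_r)

# Hypothetical scales: a baseline, one with more items, one with better items.
for k, r in [(5, 0.30), (10, 0.30), (5, 0.50)]:
    print(k, r, round(standardized_alpha(k, r), 2))
# 5 0.3 0.68
# 10 0.3 0.81
# 5 0.5 0.83
```

Doubling the number of items and raising the average correlation from .30 to .50 produce similar gains in this illustration.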
Administering Items to a Development Sample
Sample Size Guidance:
There is no firm consensus on what counts as a "large" sample, but Nunnally suggests 300 subjects as adequate.
Reliable scales have been developed with fewer subjects; the number needed depends on how many items are administered and how many scales are to be extracted.
Pitfalls of Small Sample Size:
Patterns of covariation among items may be unstable, so estimates of internal consistency (alpha) can be misleading and may not hold up in a new sample.
In small samples, chance has more influence on item correlations, so good items may be dropped, or poor items retained, purely because of noise.
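The instability of alpha in small samples can be illustrated with a small simulation. The sketch below is not from the source: it generates hypothetical item responses from a single latent trait plus random noise, computes Cronbach's alpha in repeated samples, and compares how much alpha varies at two sample sizes. The loading of 0.6, the 10 items, the sample sizes of 20 and 300, and all function names are assumptions chosen for illustration.

```python
import random
from statistics import stdev, variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def simulate_respondents(n, k, rng, loading=0.6):
    """Hypothetical data: k items that each reflect one latent trait plus noise."""
    noise_sd = (1 - loading ** 2) ** 0.5
    rows = []
    for _ in range(n):
        trait = rng.gauss(0, 1)
        rows.append([loading * trait + rng.gauss(0, noise_sd) for _ in range(k)])
    return rows

def alpha_spread(n, k=10, reps=200, seed=1):
    """Standard deviation of alpha across repeated samples of size n."""
    rng = random.Random(seed)
    alphas = [cronbach_alpha(simulate_respondents(n, k, rng)) for _ in range(reps)]
    return stdev(alphas)

# Spread of alpha for a small and a large development sample.
print(round(alpha_spread(20), 3), round(alpha_spread(300), 3))
```

The first value printed (n = 20) should be noticeably larger than the second (n = 300), which is the sense in which small-sample estimates of alpha are less stable.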
Risks of Nonrepresentativeness:
The development sample could fail to represent the target population, potentially skewing results.
Nonrepresentativeness can occur through:
Quantitative Nonrepresentativeness: The level of the attribute in the sample differs from that in the population, or the sample covers a narrower range of the attribute.
Qualitative Nonrepresentativeness: Items mean something different to the sample than to the intended population (e.g., cultural differences that affect interpretation).
Consequences of Nonrepresentativeness:
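A narrow range does not necessarily disqualify a sample: it can give inaccurate expectations for scale means while still giving a usable picture of how the items relate to one another.
Qualitative differences are more serious: if the sample interprets items differently, the relationships among items may not reflect the structure in the intended population, undermining the scale's reliability there.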
Language and terminology must match the population's understanding for valid responses.
Focus Groups and Cognitive Interviews: Tools to discern how concepts are understood by participants, ensuring items are comprehensible and valid.
Conducting the Survey
Survey Method Options:
In-person interviews, telephone surveys, mail surveys, and Internet surveys.
In-person interviews yield the highest response rates and the closest personal contact with respondents.
Telephone surveys cost less than in-person interviews but are still fairly expensive, and landline-based samples are shrinking as fewer households keep landlines.
Mail surveys are low-cost but prone to nonresponse bias. Internet surveys are growing in popularity because of their low cost.
Internet Survey Methods:
Respondents can be reached through links sent by e-mail, links posted on websites known to the target population, or an initial contact by mail that points to the online survey.
Representativeness remains a challenge because Internet use varies across demographic groups.
Myths about Internet Findings:
Internet samples can be diverse, and findings often align with traditional methods, dispelling myths about lower quality data.
Key Takeaways regarding Survey Research
Probability Sampling: Each member of the population has a known probability of being selected; methods include simple random, stratified random, and cluster sampling.
Sampling Bias: Occurs when the sample is selected in a way that makes it unrepresentative of the population. Nonresponse bias, in which respondents differ systematically from nonrespondents, is the most notable form.
Minimizing Nonresponse Bias: Maximize response rates through pre-notification, reminders, clear questionnaires, and incentives.
Survey Modes: Surveys can be conducted in person, by telephone, by mail, or over the Internet, each with its own advantages and limitations.
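The three probability sampling methods named above can be sketched in a few lines. This is a minimal illustration with a made-up population. The field names "region" and "school", the function names, and the sample sizes are all hypothetical. The cluster example is one-stage (it keeps every member of each selected cluster), whereas a fuller design might also sample individuals within clusters.

```python
import random
from collections import defaultdict

def simple_random_sample(population, n, rng):
    """Every member has the same chance of selection."""
    return rng.sample(population, n)

def stratified_random_sample(population, n, stratum_of, rng):
    """Sample each stratum in proportion to its share of the population."""
    strata = defaultdict(list)
    for member in population:
        strata[stratum_of(member)].append(member)
    sample = []
    for members in strata.values():
        # Rounding can make the total differ slightly from n for uneven strata.
        share = round(n * len(members) / len(population))
        sample.extend(rng.sample(members, share))
    return sample

def cluster_sample(population, n_clusters, cluster_of, rng):
    """Randomly select whole clusters and keep all of their members."""
    clusters = defaultdict(list)
    for member in population:
        clusters[cluster_of(member)].append(member)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [m for c in chosen for m in clusters[c]]

rng = random.Random(0)
# Hypothetical population: 200 people in 4 regions and 10 schools.
population = [{"id": i, "region": "NSEW"[i % 4], "school": i % 10} for i in range(200)]
srs = simple_random_sample(population, 20, rng)
strat = stratified_random_sample(population, 20, lambda m: m["region"], rng)
clus = cluster_sample(population, 2, lambda m: m["school"], rng)
print(len(srs), len(strat), len(clus))  # 20 20 40
```

In each method the selection probability of every member is known in advance, which is what distinguishes probability sampling from convenience sampling.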
Measurement in the Broader Research Context
Scale Development Consideration:
Check for existing measurement tools before developing a new scale.
Use resources such as Mental Measurements Yearbook and Tests in Print for locating instruments.
Benefits of the Internet: Expands access to existing measurement instruments and information.
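Example initiative: PROMIS (Patient-Reported Outcomes Measurement Information System), which provides measures of patient-reported health outcomes.
Evaluating Measurement Sources: Instruments and findings located on the Internet should be scrutinized for evidence of reliability and validity.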
Pre-Scale Development Steps
Focus Groups: Gather insights on how constructs are perceived; understand everyday vernacular for better tool construction.
Cognitive Interviewing: Evaluate how items are comprehended, clearing up confusion surrounding terms or item structure.
Mode of Administration: The mode chosen can affect the data collected; ideally a scale is administered in the same mode in which it was developed.
Response Styles and Other Construct-Irrelevant Variance: Systematic response tendencies unrelated to the construct must be considered so that they do not skew findings or lead to misinterpretation.
Correlational Research
Definition: Nonexperimental research in which two variables are measured and the statistical relationship between them is assessed, but neither variable is manipulated.
When to Choose Correlational Research:
When the relationship is not believed to be causal, or when manipulating the independent variable is not feasible or ethical (e.g., measuring daily hassles rather than manipulating them).
Characteristics of Correlational Research:
Can involve quantitative or categorical variables; what makes a study correlational is that neither variable is manipulated, not the type of variable measured.
Examples of correlational research abound in examining relationships across various contexts.
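For two quantitative variables, the relationship is commonly summarized with Pearson's r. The sketch below computes it from scratch. The data are made up for illustration (loosely echoing the daily hassles example above) and do not come from any study; the variable and function names are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical scores: both variables are measured, neither is manipulated.
hassles = [12, 25, 31, 8, 19, 40, 22]
symptoms = [3, 6, 7, 2, 4, 9, 6]
print(round(pearson_r(hassles, symptoms), 2))
```

A value near +1 or -1 indicates a strong linear relationship, but because nothing was manipulated, the coefficient alone does not establish which variable, if either, influences the other.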