Statistics as Social Constructions
Subjective choices in defining variables, measurement, and sampling shape statistical findings.
Critical Evaluation of Statistics
Involves identifying bad statistics, mutant statistics, and soft statistics, and recognizing the dark figure (cases that go uncounted or unreported).
Conceptual Definitions
Abstract meanings defining variables in research.
Operational Definitions
Measurable specifications for defining variables in research.
Sampling Bias
Systematic differences between a sample and the population.
Selection Bias
Occurs when certain individuals are more likely to be included in a sample.
Volunteer Bias
Participants who volunteer differ from those who do not volunteer for a study.
Convenience Sampling Bias
Bias introduced when participants are selected because they are easy to reach, which limits the diversity of the sample.
Undercoverage Bias
When certain groups are underrepresented in the sample.
Nonresponse Bias
Occurs when participants who decline to respond differ systematically from those who participate.
Survivorship Bias
Only the outcomes of 'survivors' are analyzed, overlooking non-survivors.
Healthy User Bias
Participants in research are generally healthier than the general population.
Recall Bias
Inaccuracies in data stemming from participants' memories.
Sampling Error
Random differences between a sample statistic and the true population parameter that reduce with larger sample sizes.
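As a quick sketch of this idea (using NumPy and a simulated population with a mean near 50, both assumptions for illustration), the average distance between sample means and the population mean shrinks as n grows:

```python
import numpy as np

rng = np.random.default_rng(seed=1)
# Hypothetical population with a known mean, used only for illustration.
population = rng.normal(loc=50, scale=10, size=100_000)
true_mean = population.mean()

for n in (10, 100, 1000):
    # Draw 2000 samples of size n; record how far each sample mean lands from the truth.
    errors = [abs(rng.choice(population, size=n).mean() - true_mean)
              for _ in range(2000)]
    print(f"n={n:5d}  mean absolute sampling error ≈ {np.mean(errors):.3f}")
```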
Measurement Scales
Types include nominal, ordinal, interval, and ratio scales.
Categorical vs. Continuous Data
Categorical data (nominal, sometimes ordinal) and continuous data (interval, ratio, sometimes ordinal) call for different summary statistics and visualizations.
Describing Categorical Data
Use frequency distributions and bar charts; the mode is used as a measure of central tendency.
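A minimal sketch, assuming some made-up nominal responses, of a frequency distribution and the mode in Python:

```python
from collections import Counter

# Hypothetical nominal data: favorite transport mode of survey respondents.
responses = ["bus", "car", "bike", "car", "walk", "car", "bike", "bus"]

freq = Counter(responses)             # frequency distribution: counts per category
mode, count = freq.most_common(1)[0]  # the mode: the most frequent category
print(freq)                           # counts per category, e.g. car: 3, bus: 2, ...
print(f"Mode: {mode} (n={count})")
```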
Describing Continuous Data
Use histograms or density plots to assess the shape of the distribution.
Symmetric Distributions
Distributions in which mean = median = mode; the normal and uniform distributions are common examples.
Asymmetric Distributions
Distributions characterized by skewness, with the mean pulled away from the median toward the longer tail.
z Scores
Standardized scores that allow for meaningful comparisons between different distributions.
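As a toy illustration (all numbers invented), z scores let us compare an 82 on one exam with a 90 on another:

```python
def z_score(x, mean, sd):
    # z = (raw score - mean) / standard deviation
    return (x - mean) / sd

math_z    = z_score(82, mean=70, sd=8)   # 82 on a test with mean 70, SD 8
history_z = z_score(90, mean=85, sd=10)  # 90 on a test with mean 85, SD 10

print(f"math z = {math_z:.2f}, history z = {history_z:.2f}")
# math z = 1.50, history z = 0.50 -> the 82 is farther above its class mean.
```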
Normal Distribution
A symmetric, bell-shaped distribution where most scores cluster around the mean.
p-value
The probability of obtaining a result at least as extreme as the observed data if the null hypothesis is true.
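A minimal sketch of this definition, assuming a hypothetical two-tailed z test with SciPy:

```python
from scipy.stats import norm

z = 2.10  # hypothetical observed test statistic
# Two-tailed p: probability of a result at least this extreme in either direction under H0.
p = 2 * norm.sf(abs(z))   # sf = 1 - cdf (the upper-tail survival function)
print(f"p = {p:.4f}")     # ~0.0357
```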
Type I Error
Rejecting the null hypothesis when it is actually true.
Type II Error
Retaining the null hypothesis when it is actually false.
Confidence Intervals
Ranges of plausible values for a population parameter; they quantify the precision of sample estimates.
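A minimal sketch of a 95% interval for a mean, using made-up measurements and the normal approximation:

```python
import numpy as np
from scipy.stats import norm

data = np.array([4.8, 5.1, 5.3, 4.9, 5.6, 5.0, 5.2, 4.7])  # hypothetical measurements
mean = data.mean()
se = data.std(ddof=1) / np.sqrt(len(data))  # standard error of the mean

z_crit = norm.ppf(0.975)  # ≈ 1.96 for 95% confidence (normal approximation;
                          # a t critical value would be more accurate for small n)
lo, hi = mean - z_crit * se, mean + z_crit * se
print(f"95% CI for the mean: [{lo:.2f}, {hi:.2f}]")
```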
Meta-Analysis
A systematic review method that quantitatively synthesizes research findings to provide a more objective and reproducible summary of evidence.
Symmetry, the Mean, and the Median
If the mean and median are equal, the distribution is symmetric; if they differ, the distribution is asymmetric, with the mean pulled toward the tail.
Standard Scores
Transform raw scores into a common scale; z scores are the most widely used standard scores.
Estimating vs. Calculating Percentiles
Visual inspection of graphs helps estimate percentiles before using the unit normal table.
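In code, SciPy's standard normal CDF replaces the unit normal table; the z value below is a made-up example:

```python
from scipy.stats import norm

z = 1.28  # hypothetical standardized score
percentile = norm.cdf(z) * 100  # area below z in the standard normal distribution
print(f"z = {z} falls at roughly the {percentile:.0f}th percentile")  # ~90th
```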
Statistical Testing
Moves beyond describing data to evaluating whether observed patterns reflect true effects or random variation.
Modeling Chance
Establishes expectations for data under the assumption that no real effect exists; comparisons to these expectations determine significance.
Null and Alternative Hypotheses
The null hypothesis (H0) assumes no effect, while the alternative hypothesis (H1) suggests a true effect or difference.
Sampling Error
Random variability causes sample statistics to differ from population parameters; statistical tests account for this variability when assessing significance.
Sampling Distributions
Describe the expected variation in sample statistics under the null hypothesis, forming the basis for statistical decision-making.
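A simulation sketch (population mean, SD, and sample size all assumed for illustration) of the sampling distribution of the mean under H0:

```python
import numpy as np

rng = np.random.default_rng(seed=7)
# Under a hypothetical H0, scores come from a population with mean 100 and SD 15.
# Simulate 10,000 samples of n = 25 and keep each sample's mean.
null_means = rng.normal(loc=100, scale=15, size=(10_000, 25)).mean(axis=1)

# The spread of these simulated sample means is the standard error (~15/sqrt(25) = 3).
print(f"mean of sample means ≈ {null_means.mean():.1f}")
print(f"standard error ≈ {null_means.std(ddof=1):.2f}")
```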
Alpha Level (α)
The pre-set probability threshold (typically 0.05) for defining statistical significance.
One-Tailed vs. Two-Tailed Tests
One-tailed tests allocate the entire alpha level to one extreme of the sampling distribution, while two-tailed tests split it across both extremes; two-tailed tests are standard to avoid bias.
Critical Value and Critical Region
The critical value marks the boundary for significance; the critical region consists of sample results so extreme that they would occur with probability less than α under H0, leading to its rejection.
Interpreting Results
If the test statistic falls inside the critical region, reject H0; the result is statistically significant. If outside, retain H0; the result is not statistically significant.
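The decision rule as a sketch, assuming a two-tailed z test with α = 0.05 and a made-up observed statistic:

```python
from scipy.stats import norm

alpha = 0.05
z_obs = 2.31  # hypothetical observed test statistic

# Two-tailed critical value: the boundary of the critical region under H0.
z_crit = norm.ppf(1 - alpha / 2)  # ≈ 1.96

if abs(z_obs) > z_crit:
    print(f"|z| = {abs(z_obs):.2f} > {z_crit:.2f}: reject H0 (statistically significant)")
else:
    print(f"|z| = {abs(z_obs):.2f} <= {z_crit:.2f}: retain H0 (not significant)")
```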
Statistical vs. Practical Significance
Statistical significance indicates whether results are unlikely due to chance, but does not address whether they are meaningful in practical terms.
Practical Significance
A statistically significant result may lack practical importance due to flawed study design or small effect size.
Effect Size
Measures the magnitude of an observed difference or relationship, independent of sample size.
Why Effect Size Matters
Helps compare findings across studies, interpret unfamiliar metrics, and assess the impact of research results.
Standardized Effect Sizes for Mean Comparisons
Cohen’s d expresses mean differences in standard deviation units, and η² indicates how much variability in the dependent variable is accounted for by group differences.
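A minimal sketch of Cohen's d with a pooled standard deviation, using invented group scores:

```python
import numpy as np

def cohens_d(a, b):
    # Pooled standard deviation, then the mean difference in SD units.
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

treatment = np.array([5.2, 6.1, 5.8, 6.4, 5.9])  # hypothetical scores
control   = np.array([4.9, 5.0, 5.4, 4.7, 5.1])
print(f"Cohen's d = {cohens_d(treatment, control):.2f}")
```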
Effect Sizes for Associations
Measures the strength of relationships rather than group differences; includes Pearson’s correlation coefficient and coefficient of determination.
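A short sketch computing Pearson's r and the coefficient of determination (r²) from made-up paired data:

```python
import numpy as np

hours  = np.array([1, 2, 3, 4, 5, 6])        # hypothetical study hours
scores = np.array([52, 58, 61, 65, 70, 74])  # hypothetical exam scores

r = np.corrcoef(hours, scores)[0, 1]     # Pearson correlation coefficient
print(f"r = {r:.3f}, r^2 = {r**2:.3f}")  # r^2: share of variance explained
```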
Rules of Thumb for Effect Sizes
Guidelines for interpreting effect sizes vary by context; common benchmarks treat d values of 0.2, 0.5, and 0.8 (and r values of 0.1, 0.3, and 0.5) as small, medium, and large, respectively.
Contextual Interpretation of Effect Sizes
Even small effects can be meaningful; costs and benefits, accumulation over time, and generality of effects must be considered.
Statistical Power
The probability of correctly rejecting a false null hypothesis; higher power allows for greater detection of true effects.
Typical Power Levels in Research
Published studies are often underpowered: estimated average power is about 0.23 for small effects, 0.62 for medium effects, and 0.84 for large effects.
Factors That Influence Statistical Power
Sample size, effect size, and decision threshold (α level) determine power.
Power Analysis
Conducted during study planning to ensure adequate power and determine sample size needed.
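A sketch of an a priori power analysis with statsmodels (the medium effect size d = 0.5 is an assumption for illustration):

```python
from statsmodels.stats.power import TTestIndPower

# Sample size per group needed to detect a hypothetical medium effect (d = 0.5)
# with alpha = .05 (two-tailed) and the conventional target power of .80.
n_per_group = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05, power=0.80)
print(f"n per group ≈ {n_per_group:.0f}")  # roughly 64
```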
How to Maximize Power
Increase sample size, collect more data, use within-subjects designs, measure variables precisely, and avoid dichotomizing continuous variables.
Challenges to Reproducibility
Many published research findings fail to replicate, undermining trust in scientific conclusions.
False Findings in Research
John Ioannidis argued that most published findings are false due to high false-positive rates, small sample sizes, flexibility in study designs and analyses, conflicts of interest, and competitive research environments.
Replication Crisis
Replication is a cornerstone of science, but replications are rare due to a focus on novelty in academic publishing. Large-scale replication efforts in psychology found that fewer than half of published findings were successfully replicated.
Questionable Research Practices (QRPs)
Researchers engage in flexible analysis and reporting strategies that can increase false positives.
Researcher Degrees of Freedom
The flexibility researchers have in designing studies, analyzing data, and reporting results can lead to inflated false-positive rates.
p Hacking
Conducting multiple analyses and only reporting those that produce significant results.
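A simulation sketch of why this inflates false positives: with ten noise-only outcome measures, reporting "whichever test came out significant" rejects a true H0 far more often than 5% of the time:

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(seed=42)
hits = 0
n_sims, n_outcomes = 1000, 10  # 10 outcome measures, none with a real effect

for _ in range(n_sims):
    # Two groups drawn from the SAME population: H0 is true for every outcome.
    a = rng.normal(size=(n_outcomes, 30))
    b = rng.normal(size=(n_outcomes, 30))
    pvals = ttest_ind(a, b, axis=1).pvalue
    if pvals.min() < 0.05:  # "p hack": report only the outcome that reached significance
        hits += 1

print(f"False-positive rate: {hits / n_sims:.2f}")  # ~0.40, far above the nominal .05
```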
HARKing (Hypothesizing After the Results are Known)
Presenting post hoc explanations as if they were planned in advance.
Selective Reporting
Failing to report all experimental conditions, variables, or analyses, leading to biased literature.
Bias in Peer Review
Peer review has systemic flaws that can undermine its role as a quality control mechanism.
Volunteer Nature of Reviewing
Reviewers are unpaid, leading to variable effort and care in evaluations.
Anonymity and Accountability in Peer Review
Anonymous reviews encourage honesty but reduce accountability and recognition, leading to inconsistent diligence.
Reviewer Errors
Careful reviewers may fail to detect undisclosed multiple statistical tests or subtle questionable research practices.
Shared and Idiosyncratic Biases
Reviewers may favor research aligned with their own views or overlook methodological flaws in studies that support a preferred narrative.
Resistance to Criticizing Common Practices
Reviewers may avoid critiquing methods they use, such as convenience sampling or reliance on online participant pools.
Strategies to Improve Reproducibility
Encouraging replication studies, pre-registration, registered reports, open science practices, and comprehensive reporting to verify research results.
Replication Studies
Encouraging direct and conceptual replications to verify results.
Pre-Registration
Publicly posting hypotheses, methods, and analysis plans before data collection to prevent p hacking and HARKing.
Registered Reports
Journals accept studies for publication based on methodological quality before results are known, reducing publication bias.
Open Science Practices
Making data, analysis code, and materials publicly available for verification and reanalysis.
Comprehensive Reporting
Requiring full disclosure of all analyses, experimental conditions, and results to ensure transparency.
Limitations of Narrative Reviews
Traditional literature reviews rely on subjective impressions, which can lead to bias and inconsistency.
Qualitative Impressions vs. Statistical Aggregation
Reviewers form qualitative impressions rather than aggregating data statistically.
Variability in Primary Studies
Small primary studies have high variability, making it hard to detect true effects.
Statistical Tools for Interpretation
Differences in study findings are difficult to interpret without statistical tools.
Moderator Analysis
Investigates factors that influence effect sizes across studies, such as differences in study design, participant characteristics, or measurement techniques.
Publication Bias
Studies with significant results are more likely to be published, skewing the literature.
Funnel Plots
A visual diagnostic tool for detecting asymmetry in study distribution, which may indicate missing studies.
Trim-and-Fill Method
A statistical adjustment to estimate the true effect size in the presence of publication bias.
The Eight Core Elements of a Meta-Analysis
1) A clearly defined research question; 2) a systematic literature search; 3) effect size extraction; 4) weighting of studies; 5) computation of a summary effect size; 6) assessment of heterogeneity; 7) moderator analysis; 8) evaluation of publication bias.
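As a sketch of steps 4 and 5 (weighting and the summary effect), a fixed-effect, inverse-variance meta-analysis with invented study data:

```python
import numpy as np

# Hypothetical per-study effect sizes (Cohen's d) and their sampling variances.
effects   = np.array([0.30, 0.45, 0.12, 0.60])
variances = np.array([0.02, 0.05, 0.01, 0.08])

weights = 1.0 / variances                              # more precise studies get more weight
summary = np.sum(weights * effects) / np.sum(weights)  # fixed-effect summary effect size
se = np.sqrt(1.0 / np.sum(weights))                    # standard error of the summary effect
print(f"summary d = {summary:.3f} (SE = {se:.3f})")
```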