Flashcards for reviewing key concepts related to statistical significance, hypothesis testing, and ANOVA.
What is the purpose of calculating a p-value in hypothesis testing?
To determine the probability of obtaining an effect at least as large as the observed effect if the null hypothesis (H0) is true.
What does the American Statistical Association (2016) state about p-values?
P-values do not measure the probability that the studied hypothesis is true, nor the probability that the data were produced by random chance alone.
What are Type I and Type II errors?
Type I error occurs when a true null hypothesis is rejected, while Type II error occurs when a false null hypothesis is not rejected.
What does statistical power represent?
The probability of finding an effect assuming that it genuinely exists in the population.
How is power calculated?
Power is calculated as 1-β, where β is the probability of a Type II error (failing to detect an effect that genuinely exists).
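As a rough illustration of power as 1-β, the sketch below computes power analytically for a two-sided one-sample z-test. The choice of a z-test with a known standard deviation, the function name `z_test_power`, and the example values are assumptions made for simplicity; the cards do not specify a particular test.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(effect_size, n, alpha=0.05):
    """Power of a two-sided one-sample z-test (known SD) for a standardized effect size."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)  # critical value for the chosen alpha
    shift = effect_size * sqrt(n)               # how far the true effect moves the test statistic
    # Probability that the statistic lands in either rejection region when the effect is real.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)

# Hypothetical example: a medium effect (d = 0.5) with 32 participants.
power = z_test_power(effect_size=0.5, n=32)
beta = 1 - power  # probability of a Type II error
```

Raising `n` while holding the effect size fixed increases the returned power, which matches the card on sample size and power.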
What is Cohen's d?
An effect size measure used for t-tests to quantify the magnitude of an effect.
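A minimal sketch of Cohen's d for two independent groups, assuming the common pooled-standard-deviation form. The function name and the example scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)

# Hypothetical scores for illustration only.
treatment = [6, 7, 8, 9, 10]
control = [4, 5, 6, 7, 8]
d = cohens_d(treatment, control)
```

Here the group means differ by 2 and the pooled standard deviation is about 1.58, so d comes out at roughly 1.26.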
What is the rule of thumb regarding sample size and effect size?
More participants generally lead to more 'signal' and less 'noise'; larger effect sizes require fewer participants to detect a 'real' effect.
What is an alpha level?
The probability of making a Type I error that the researcher is willing to accept, typically set at 0.05 or 0.01.
What does a Bonferroni correction do?
It adjusts the alpha level when conducting multiple tests to reduce the likelihood of a Type I error.
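The adjustment can be sketched as dividing alpha by the number of tests and comparing each p-value to the stricter threshold. The p-values below are made up for illustration.

```python
def bonferroni_alpha(alpha, n_tests):
    """Per-test alpha after a Bonferroni correction."""
    return alpha / n_tests

# Hypothetical p-values from four tests run on the same dataset.
p_values = [0.004, 0.020, 0.030, 0.300]
adjusted_alpha = bonferroni_alpha(0.05, len(p_values))  # 0.05 / 4 = 0.0125
significant = [p < adjusted_alpha for p in p_values]
```

Three of these p-values fall below the uncorrected 0.05, but only the first survives the corrected threshold of 0.0125.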
What is the difference between one-tailed and two-tailed tests?
One-tailed tests hypothesize a specific direction of difference, while two-tailed tests assess differences in both directions.
What is ANOVA?
Analysis of Variance, a statistical method used to compare means across three or more groups.
What are the assumptions of ANOVA?
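Random sampling with independent observations, normally distributed scores within each condition, and equal variance across conditions (homogeneity of variance). Equal numbers of participants per condition are not strictly required, but they make the test more robust to violations.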
What does the F-ratio in ANOVA represent?
The ratio of variance explained by the experiment to the variance that is unexplained.
What are degrees of freedom in the context of ANOVA?
The number of independent values that can vary in the analysis, specifically for between-group and residual variances.
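The F-ratio and its two degrees of freedom can be worked through by hand for a one-way between-group design. This is a sketch under that assumption; the function name and the three small groups are hypothetical.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a one-way between-group ANOVA."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    # Explained variation: how far each group mean sits from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Unexplained (residual) variation: spread of scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1                # number of groups minus 1
    df_within = len(all_scores) - len(groups)   # total scores minus number of groups
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Hypothetical scores for three conditions.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_ratio, df_between, df_within = one_way_anova(groups)
```

For these scores the between-group mean square is 13 and the residual mean square is 1, giving F(2, 6) = 13.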
What is a post-hoc test in ANOVA?
A follow-up test conducted after ANOVA if a significant difference is found, used to determine which specific groups differ.
What are the differences between between-group and repeated-measures ANOVA?
Between-group ANOVA examines variance between different groups, while repeated-measures ANOVA looks at variance within the same subjects over different conditions.
What is the purpose of a chi-square test?
To assess how often an observation falls into a category compared to what would be expected by chance.
What assumptions must be met for a chi-square test?
Independence of observations and expected frequencies greater than 1, with no more than 20% of expected counts less than 5.
What is the null hypothesis for a chi-square test?
That there is no association between the categorical variables being assessed.
What are non-parametric tests?
Statistical tests that do not assume normal distribution of data and are used for data that do not meet parametric test assumptions.
What is the Mann-Whitney U test used for?
To compare two independent groups by ranking all scores together.
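The ranking step can be sketched as follows: pool the scores, assign ranks (tied scores share the mean of their ranks), sum the ranks for one group, and convert that sum to U. The function name and data are hypothetical, and the sketch returns only the statistic, not a p-value.

```python
def mann_whitney_u(group_a, group_b):
    """Mann-Whitney U statistic (the smaller of the two U values)."""
    pooled = sorted(list(group_a) + list(group_b))
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Tied scores occupy ranks i+1..j and all receive the mean of those ranks.
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(rank_of[x] for x in group_a)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return min(u_a, u_b)

# Hypothetical scores for two independent groups.
u_separated = mann_whitney_u([1, 2, 3], [4, 5, 6])
u_mixed = mann_whitney_u([1, 4, 5], [2, 3, 6])
```

Completely separated groups give U = 0, while interleaved groups give a U closer to half of n_a × n_b.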
What does the Wilcoxon signed-rank test compare?
It compares two paired samples by assessing the differences between them and ranking these differences.
What is resampling in statistical analysis?
A method that involves repeatedly drawing samples from a dataset to perform statistical inference without relying on traditional parametric assumptions.
What is bootstrap resampling used for?
To generate confidence intervals around data estimates, such as means, by sampling with replacement from the original dataset.
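A minimal sketch of a percentile bootstrap confidence interval for a mean. The percentile method, the number of resamples, the fixed seed, and the example data are all assumptions for illustration; other bootstrap interval methods exist.

```python
import random
from statistics import mean

def bootstrap_ci(data, n_resamples=2000, confidence=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    n = len(data)
    # Each resample draws n scores WITH replacement from the original data.
    means = sorted(mean(rng.choices(data, k=n)) for _ in range(n_resamples))
    tail = (1 - confidence) / 2
    lower = means[round(tail * n_resamples)]
    upper = means[round((1 - tail) * n_resamples) - 1]
    return lower, upper

# Hypothetical sample of ten scores.
data = [12, 15, 9, 20, 14, 11, 17, 13, 16, 10]
lower, upper = bootstrap_ci(data)
```

The interval is read directly off the sorted resampled means, so no normality assumption is needed for the data themselves.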
What are permutation tests?
Resampling methods used to assess the significance of observed differences by randomly permuting the data.
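One way to sketch a permutation test for a difference in means: shuffle the pooled scores many times, re-split them into two groups, and count how often the shuffled difference is at least as large as the observed one. The function name, the number of permutations, the seed, and the data are assumptions for illustration.

```python
import random
from statistics import mean

def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign scores to the two groups
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add 1 to numerator and denominator so the estimate is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)

# Hypothetical, clearly separated groups.
p_value = permutation_test([1, 2, 3, 4, 5], [11, 12, 13, 14, 15])
```

Under the null hypothesis the group labels are interchangeable, which is what justifies the shuffling.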
What is the purpose of conducting multiverse analysis?
To explore numerous analyses on a dataset to determine how many produce significant results.
What is the reproducibility crisis in research?
The challenge of replicating previously published studies, which raises questions about the reliability of research findings.
What is the purpose of visualizing data?
To facilitate understanding of complex datasets, check assumptions, and effectively communicate findings to an audience.
What is Anscombe's Quartet?
A set of four datasets that have nearly identical summary statistics (means, variances, correlation, and regression line) but show vastly different relationships when visualized.
What is 'chartjunk'?
Excessive or misleading graphical elements that do not convey meaningful data and can confuse viewers.
When should tables be used over figures in research?
Tables are suitable for summarizing extensive information, while figures are better for identifying trends or relationships.
What is the primary consideration for good graphs in data visualization?
Graphs should be clear, well-labeled, and avoid misleading elements.
What is p-hacking?
The practice of manipulating data analysis to produce statistically significant results, often by selectively reporting only positive outcomes.
What is HARKing?
Hypothesizing after results are known, often leading to biased interpretations of data.
What is the primary goal of open science practices?
To improve transparency and replicability in research by sharing materials, data, and pre-registrations publicly.
What does the term 'publication bias' refer to?
The tendency for journals to favor the publication of significant results over non-significant findings.
What is one main limitation of conducting multiple statistical tests?
An increased likelihood of committing at least one Type I error across the set of tests; this probability is known as the familywise error rate.
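The growth of this error rate can be sketched with the standard formula 1 - (1 - α)^m, which assumes the m tests are independent. The function name is hypothetical.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one Type I error across independent tests."""
    return 1 - (1 - alpha) ** n_tests

fwer_one = familywise_error_rate(0.05, 1)    # a single test: just alpha
fwer_ten = familywise_error_rate(0.05, 10)   # ten tests at alpha = 0.05
```

With ten independent tests at α = 0.05, the chance of at least one false positive rises to about 0.40.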
What is the relationship between alpha level and Type I error?
The alpha level is the threshold for determining statistical significance; a lower alpha level reduces Type I error risk.
What are the two main types of errors in hypothesis testing?
Type I error and Type II error.
Why are non-parametric tests preferred in certain situations?
They are better suited for data that violate the normality assumption and are effective for analyzing ordinal data.
What does the Kruskal-Wallis test assess?
Differences among three or more independent groups when the data are not normally distributed.
What is the purpose of hypothesis testing?
To ascertain whether observed differences in data reflect true effects or are due to sampling error.
What do researchers aim to minimize when choosing an alpha level?
The risk of making a Type I error.
What is the role of effective data visualization in research communication?
To clarify complex findings, enhance understanding, and facilitate interpretation for audiences.
When are post-hoc tests necessary in an ANOVA?
Only when the main ANOVA test yields a significant result.
What is the difference between main effects and interactions in multi-factorial ANOVA?
Main effects refer to the direct impact of one independent variable, while interactions assess how the effect of one variable depends on the level of another.
What does SPSS stand for?
Statistical Package for the Social Sciences.
Why is it important to calculate expected frequencies in chi-square tests?
Expected frequencies are compared to observed counts to determine if there is a significant difference between categories.
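For a contingency table, each expected frequency is (row total × column total) / grand total, and the chi-square statistic sums (observed - expected)² / expected over all cells. A minimal sketch with a hypothetical 2×2 table:

```python
def expected_frequencies(observed):
    """Expected counts for a contingency table under no association."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def chi_square_statistic(observed):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_frequencies(observed)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(observed, expected)
               for o, e in zip(obs_row, exp_row))

# Hypothetical observed counts (rows: groups, columns: categories).
observed = [[10, 20], [20, 10]]
expected = expected_frequencies(observed)
chi2 = chi_square_statistic(observed)
# Check the expected-count assumption before trusting the statistic.
all_expected_at_least_5 = all(e >= 5 for row in expected for e in row)
```

Every expected count here is 15, so the assumption about expected frequencies holds and the statistic is about 6.67.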
What does it mean when the p-value is reported as less than 0.05?
It suggests that there is sufficient evidence to reject the null hypothesis at the 5% significance level.
What does it indicate if an effect size is larger?
A larger effect size indicates a more substantial effect observed in the data.
What is the significance of sampling with replacement in bootstrap methods?
It allows for the estimation of a sampling distribution by creating multiple resamples from the original dataset.
Why is statistical significance not always indicative of practical importance?
A statistically significant result may not have meaningful implications or relevance in real-world contexts.
What does the term 'residual variance' refer to in ANOVA?
The unexplained variance that remains after accounting for the variance explained by the model.
In hypothesis testing, what does failing to reject the null hypothesis imply?
It suggests that there is insufficient evidence to support the alternative hypothesis.
What is a mixed design in multifactorial ANOVA?
A study design that includes both within-subjects and between-subjects factors.
How do researchers typically ensure statistical results are generalizable?
By ensuring their study sample accurately represents the larger population they are studying.
What is a key challenge in interpreting findings from multi-factorial ANOVA?
The complexity of interactions can make results difficult to understand and communicate.
Why are methods like bootstrapping popular among researchers?
They provide flexible approaches for statistical inference without the stringent assumptions of traditional methods.
What does computing the F-ratio allow researchers to evaluate?
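It allows comparison of the variance between groups with the variance within groups; the larger the F-ratio, the more variance the group differences explain relative to unexplained variance.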
How can resampling techniques facilitate more robust hypothesis testing?
By simulating sampling distributions to assess the probability of observing the data under the null hypothesis.
What should be considered when interpreting the results of non-parametric tests?
Translating the rank-based analysis into meaningful context for the research question.
How does one determine which statistical test to use?
By considering the characteristics of the data, including the level of measurement and distribution.
What impact does a larger sample size generally have on statistical power?
It increases the likelihood of correctly detecting a true effect if it exists.
What is the primary goal of conducting significance tests in research?
To assess whether the observed data provides enough evidence to reject the null hypothesis.
What does variability in data affect when conducting statistical tests?
It influences the strength of evidence available to support or reject a hypothesis.
What can result from running too many statistical tests?
An increased chance of Type I errors due to multiple comparisons.
In statistics, what does 'signal' refer to?
The systematic variation or true effect present in the data.
What implication does a finding of p < .05 have on research findings?
It generally indicates that the result is statistically significant and warrants further investigation.
Why is documenting analysis decisions important in research?
To enhance transparency and allow for replication of studies in the future.
What is an appropriate action if the assumptions of a statistical test are violated?
Consider using a non-parametric test or transforming the data to meet the assumptions.
What is the significance of the chi-square test in predicting categories?
It helps evaluate whether the distribution of data across categories differs from what would be expected by chance.