What are the 5 components of an experiment?
Question
Design
Execute
Statistical Analysis
Interpret Results
Can you fix a bad design with good data analysis?
No, poor experimental design limits your ability to draw valid conclusions.
What is a mensurative experiment?
An observational study with no treatment or manipulation; only measurements are taken.
Example of a simple mensurative experiment?
Measuring leaf decomposition by weighing leaf bags before and after submersion.
What is a manipulative experiment?
An experiment that includes a treatment and a control to assess effect.
What is pseudoreplication?
Treating non-independent samples as if they were true replicates, which inflates the apparent sample size (and degrees of freedom) and raises the risk of a false positive.
Example of spatial pseudoreplication?
Measuring multiple spots in the same pool and treating them as independent samples.
Example of temporal pseudoreplication?
Measuring the same pool every week and treating each measurement as independent.
What is sacrificial pseudoreplication?
When true replicates exist, but internal measurements are treated as replicates instead of averaging.
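A minimal sketch of the averaging step, assuming hypothetical pool names and readings (none of these numbers come from a real study). Each pool is the true replicate, so its internal measurements are collapsed to one mean before any test is run.

```python
from statistics import mean

# Hypothetical data: three replicate pools, three readings taken inside each pool.
control_pools = {
    "pool_1": [4.1, 4.3, 3.9],
    "pool_2": [5.0, 5.2, 4.8],
    "pool_3": [4.6, 4.4, 4.5],
}

def replicate_means(pools):
    """Collapse the readings within each pool to a single mean per pool."""
    return [mean(values) for values in pools.values()]

control_means = replicate_means(control_pools)

# Sample size that should enter the analysis: one value per pool.
n_true = len(control_means)
# Sample size wrongly implied by treating every reading as a replicate.
n_pseudo = sum(len(values) for values in control_pools.values())

print(n_true, n_pseudo)
```

The analysis would then use `control_means` with n = 3, not the nine raw readings.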
Key principles of manipulative experimental design?
Randomization
Replication
Interspersion of treatments
Independence of treatments
What is interspersion?
Spreading out treatments to avoid confounding spatial patterns.
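Randomization and replication can be sketched as a random assignment of treatments to plots. The function name `assign_treatments`, the plot numbering, and the seed are assumptions made for illustration only.

```python
import random

def assign_treatments(plot_ids, treatments, seed=None):
    """Randomly assign treatments to plots with equal replication per treatment."""
    if len(plot_ids) % len(treatments) != 0:
        raise ValueError("number of plots must be a multiple of the number of treatments")
    replicates = len(plot_ids) // len(treatments)
    # One label per plot, with each treatment repeated `replicates` times.
    labels = [t for t in treatments for _ in range(replicates)]
    random.Random(seed).shuffle(labels)
    return dict(zip(plot_ids, labels))

# Hypothetical layout: eight plots, two treatments, four replicates each.
layout = assign_treatments(list(range(1, 9)), ["control", "treatment"], seed=1)
for plot, label in layout.items():
    print(plot, label)
```

A random draw can still cluster one treatment in one corner by chance, so it is worth checking the resulting layout against the interspersion principle before using it.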
What does Hurlbert (1984) say about pseudoreplication?
Half of ecological papers had some form of pseudoreplication.
What was wrong with the skink/gecko predator study?
Only one replicate per treatment and confounded with agricultural land use.
What was the main issue in the stream invertebrate restoration study?
Lack of replication – only one restored and one natural reach.
Why was the mayfly drift experiment limited?
Unrealistic conditions (no algae) and not enough mayflies used.
What is a repeated-measures design?
Same subjects measured multiple times (e.g. before and after treatment).
What is a nested design?
Hierarchical sampling (e.g., sites → riffles → stone groups → stones).
What characterizes a split-plot design?
Whole plot and subplot factors with randomization at different levels; complex ANOVA required.
What is a BACIP design?
Before-After-Control-Impact Paired design: control and impact sites are sampled at the same times (paired), on multiple dates before and after the impact.
What is MBACI(P)?
Multiple BACI sites, analyzed using repeated and nested ANOVA.
When is a gradient design used?
When treatments are continuous (e.g., farming intensity from 0–100%).
How are gradient designs analyzed?
Using regression or generalized linear models with model selection.
How do you avoid a poor experimental design?
Diagram the layout
Use true replication
Randomize treatments
Describe stats analysis
Discuss limitations if replication is impossible
What is a sample in statistics?
A collection of randomly selected, independent observations from a defined statistical population.
What is a statistical population?
The full set of possible observations of interest.
Why must samples be randomly selected?
To avoid sampling bias and ensure validity.
Why must samples be independent?
So that each observation gives unique, unbiased information about the population.
Name 6 common sample summary statistics.
Mean
Variance
Standard Deviation
Standard Error
Confidence Limits (Intervals)
Median (plus quartiles)
What is the formula for variance?
$s^2 = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n - 1}$
What is standard deviation?
The square root of variance; shows spread of values around the mean.
What is standard error?
SE = SD / √n
It measures how well the sample mean estimates the population mean.
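The variance, SD, and SE formulas above can be checked by hand with a short sketch. The helper name `summarize` and the example values are assumptions for illustration.

```python
import math
import statistics

def summarize(values):
    """Return n, mean, sample variance, SD, and SE for a list of numbers."""
    n = len(values)
    mean = statistics.fmean(values)
    # Sample variance: sum of squared deviations divided by n - 1.
    variance = sum((y - mean) ** 2 for y in values) / (n - 1)
    sd = math.sqrt(variance)   # SD is the square root of the variance
    se = sd / math.sqrt(n)     # SE = SD / sqrt(n)
    return {"n": n, "mean": mean, "variance": variance, "sd": sd, "se": se}

# Hypothetical measurements.
summary = summarize([2, 4, 4, 4, 5, 5, 7, 9])
print(summary)
```

The SE shrinks as n grows even when the SD stays the same, which is why it describes the precision of the mean and not the spread of the data.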
What do 95% confidence intervals tell you?
The range that would contain the true population mean in 95% of repeated samples (often loosely read as a 95% probability that the mean lies within it).
What assumptions do mean, SD, SE, and CI rely on?
That data are normally distributed.
What does the Central Limit Theorem state?
That with large samples, sample means are normally distributed even if the population is not.
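A small simulation can illustrate the theorem. This sketch assumes a strongly skewed population (exponential with mean 1 and SD 1), and the sample size, number of samples, and seed are arbitrary choices.

```python
import random
import statistics

def sample_means(n_per_sample, n_samples, seed=0):
    """Draw repeated samples from an exponential population and return their means."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_samples):
        sample = [rng.expovariate(1.0) for _ in range(n_per_sample)]
        means.append(statistics.fmean(sample))
    return means

means = sample_means(n_per_sample=50, n_samples=2000)

# The means should centre on the population mean (1.0) with a spread
# close to sigma / sqrt(n) = 1 / sqrt(50).
print(statistics.fmean(means), statistics.stdev(means), 1 / 50 ** 0.5)
```

A histogram of `means` would look roughly bell-shaped even though the individual exponential values are heavily right-skewed.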
What if ecological data aren’t normally distributed?
Use median and quartiles instead of mean and SD.
Are non-overlapping SDs equivalent to statistical tests?
No; even 95% CIs are only equivalent in very simple designs (like 1-factor ANOVA).
What is statistical robustness?
The ability of a test to yield valid results even when assumptions are slightly violated.
Is robustness a good thing for ecologists?
Yes, but assumptions still need to be checked.
What is exploratory data analysis?
Preliminary checks done before formal statistical analysis.
Why do exploratory data analysis?
Check data quality
Detect errors
Ensure assumptions are met
Identify outliers
What tools are used in exploratory data analysis?
Histograms (for normality)
Boxplots (for variance, outliers, normality)
Name 4 key parametric assumptions.
Normality
Equal variances
No outliers
Linearity (in regression/ANOVA)
What transformations help normalize data?
Log, arcsin(√x), 4th root.
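The three transformations can be written as small helper functions. This is a sketch: the function names are made up here, base-10 logs are an arbitrary choice, and the optional `offset` (often 1, to cope with zeros) is an assumption not stated in the cards.

```python
import math

def log_transform(x, offset=0.0):
    """Base-10 log; a positive offset is sometimes added when data contain zeros."""
    return math.log10(x + offset)

def arcsine_sqrt(p):
    """Arcsine square-root transform for a proportion between 0 and 1."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    return math.asin(math.sqrt(p))

def fourth_root(x):
    """Fourth-root transform for non-negative values such as counts."""
    return x ** 0.25

# Hypothetical values.
print(log_transform(100), arcsine_sqrt(0.25), fourth_root(16))
```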
When should you decide to transform data?
After EDA but before running formal tests.
Why can transformation be controversial?
It changes the scale and the null hypothesis.
How do you test for equal variances?
With Levene’s test (but be cautious – it also reacts to non-normality).
Why is equal variance more important than normality?
Tests are more robust to non-normality than unequal variances, especially when sample sizes differ.
What to do if variances increase with the mean?
Try log or 4th root transformations.
How to deal with outliers ethically?
Double-check for errors
Use a priori criteria
Compare analysis with and without them
Avoid removing them just to get significance
Parametric tests are based on...?
Measured values and strong assumptions (normality, equal variance, etc.)
Non-parametric tests are based on...?
Ranks of values; fewer assumptions.
Pros and cons of parametric tests?
Pros: More powerful, more test options
Cons: More assumptions
Pros and cons of non-parametric tests?
Pros: Fewer assumptions, more robust
Cons: Less powerful, fewer tests, lose detail
What are GLMs good for?
Handling non-normal data by specifying an error distribution (e.g., Poisson, binomial) and a matching link function (e.g., log, logit).
Are GLMs a cure-all?
No. Ecological data can be too irregular, and transformations may still be needed.
Are zeros missing values?
No! Zeros are valid observations.
Why are missing values problematic?
They cause unequal sample sizes, reducing test robustness.
What are (bad) ways people deal with missing data?
Deleting other observations to balance
Substituting values (not recommended)
What is the null hypothesis (H₀)?
Usually that there is no difference between population variables (e.g., means).
What is the alternative hypothesis (Hₐ)?
The hypothesis accepted when H₀ is rejected; it is not tested directly but is supported by default.
What are the 6 steps of hypothesis testing?
Specify H₀, Hₐ, and test statistic
Choose significance level (usually α = 0.05)
Collect data and calculate test statistic
Compare test stat to its null distribution
If p < α → reject H₀ (significant)
If p ≥ α → fail to reject H₀ (non-significant)
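The six steps can be traced with an exact permutation test on a difference in means. This is a swapped-in technique chosen because it needs no distribution tables, not the test the cards have in mind, and the data and function name are hypothetical.

```python
from itertools import combinations

def exact_permutation_test(group_a, group_b, alpha=0.05):
    """Two-sided exact permutation test for a difference in group means."""
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)

    def mean_difference(sum_a):
        return sum_a / n_a - (total - sum_a) / n_b

    # Step 3: test statistic calculated from the collected data.
    observed = abs(mean_difference(sum(group_a)))

    # Step 4: null distribution from every way of splitting the pooled values.
    as_extreme = 0
    n_arrangements = 0
    for indices in combinations(range(len(pooled)), n_a):
        n_arrangements += 1
        sum_a = sum(pooled[i] for i in indices)
        if abs(mean_difference(sum_a)) >= observed - 1e-12:
            as_extreme += 1

    # Steps 5 and 6: compare p with alpha.
    p_value = as_extreme / n_arrangements
    return p_value, p_value < alpha

# Steps 1 and 2: H0 is "no difference in means", alpha = 0.05.
p_value, reject_h0 = exact_permutation_test(
    [10, 11, 12, 13, 14, 15], [20, 21, 22, 23, 24, 25]
)
print(p_value, reject_h0)
```

Enumerating every split is only practical for small samples; with larger groups a random subset of arrangements is normally used.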
What are degrees of freedom (df)?
Number of values that are free to vary once a statistic has been estimated; e.g., df = n - 1 for a sample variance.
What is a Type I error?
False positive – rejecting H₀ when it's actually true.
What is a Type II error?
False negative – failing to detect an effect that actually exists.
What is β (beta)?
The probability of making a Type II error.
What is statistical power?
Probability of detecting a true effect (Power = 1 - β); conventionally should be at least 0.80.
Name 5 factors that influence statistical power.
Effect size
Sample size
Variance
Significance level (α)
Statistical test used
What is effect size?
The magnitude of the difference or effect you're trying to detect.
Why is power analysis done a priori?
To determine the sample size needed to detect a meaningful effect before data collection.
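An a priori power calculation can be sketched with a normal approximation for comparing two group means. This is an approximation under stated assumptions (equal group sizes, known common SD, two-sided test); the function names and the example effect size are hypothetical, and dedicated power software would use the t distribution.

```python
from statistics import NormalDist

def approx_power(effect, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison of means."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected difference in units of its standard error.
    shift = effect / (sd * (2 / n_per_group) ** 0.5)
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

def required_n(effect, sd, target_power=0.80, alpha=0.05, max_n=100_000):
    """Smallest per-group sample size whose approximate power reaches the target."""
    for n in range(2, max_n + 1):
        if approx_power(effect, sd, n, alpha) >= target_power:
            return n
    raise ValueError("target power not reached within max_n")

# Hypothetical planning values: detect a difference of 0.5 SD.
n_needed = required_n(effect=0.5, sd=1.0)
print(n_needed, approx_power(0.5, 1.0, n_needed))
```

The same function shows how the listed factors interact: a larger effect, a smaller SD, a larger n, or a larger α each raise the returned power.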
Why is post hoc power analysis done?
To explain why non-significant results occurred, based on known sample size and variance.
What does traditional hypothesis testing focus on?
Testing H₀ using p-values; Hₐ supported by default if p is small.
What do information-theoretic (AIC) or Bayesian approaches focus on?
Comparing multiple models (hypotheses) and estimating strength of evidence for each.
When is correlation used?
To test relationships between two continuous variables (no predictor/response distinction).
What is the Pearson correlation coefficient (rₚ)?
Measures strength/direction of a linear relationship (parametric).
What are some possible values for rₚ?
+1: perfect positive
0: no relationship
-1: perfect negative
What is the non-parametric equivalent of Pearson correlation?
Spearman rank correlation (rₛ)
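Both coefficients can be computed from first principles. In this sketch the Spearman coefficient is taken as the Pearson coefficient of the ranks, with tied values given their average rank; the function names and data are assumptions for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical data: y rises with x, but not along a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(pearson(x, y), spearman(x, y))
```

Here the rank correlation is perfect while the Pearson coefficient falls slightly short of 1, because the relationship is monotonic but curved.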
When is simple linear regression used?
When one variable predicts another; defines a line using the least squares method.
What are residuals in regression?
The differences between observed and predicted y-values.
What are 3 aims of regression?
Test linear relationship
Quantify variation in y explained by x
Predict new y-values (less common in ecology)
What is R² and how is it interpreted?
Proportion of variation in y explained by x.
< 0.10: trivial
≥ 0.10: weak
≥ 0.30: moderate
≥ 0.50: strong
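The least squares line, its residuals, and R² can be sketched in a few lines. The function name `fit_line` and the data are hypothetical.

```python
def fit_line(x, y):
    """Least squares fit of y = intercept + slope * x, with residuals and R²."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual = observed y minus predicted y.
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    ss_residual = sum(r ** 2 for r in residuals)
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - ss_residual / ss_total
    return slope, intercept, residuals, r_squared

# Hypothetical data.
slope, intercept, residuals, r_squared = fit_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(slope, intercept, r_squared)
```

Plotting `residuals` against the predicted values is a common way to check the equal-variance and linearity assumptions.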
What assumptions does regression make?
Normality
Equal variances
No outliers
Independence
What is Cook’s Distance?
Measures influence of a point based on leverage and residuals.
When is multiple linear regression used?
When there are two or more continuous predictor variables.
What is Partial Eta Squared?
% of variation in y uniquely explained by a predictor, controlling for other predictors.
What is collinearity and why is it a problem?
When predictor variables are highly correlated with each other; this inflates the variance of the coefficient estimates and makes them unstable.
How to test for collinearity?
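Tolerance (1 - R² of a predictor regressed on the other predictors): values > 0.1 are acceptable
VIF (1 / tolerance): values < 10 are acceptable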
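With only two predictors, the R² of one predictor regressed on the other equals their squared Pearson correlation, so tolerance and VIF can be sketched directly. The function names and data are hypothetical, and the two-predictor shortcut is a simplifying assumption; with more predictors each R² comes from a multiple regression.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def tolerance_and_vif(x1, x2):
    """Tolerance and VIF for a model with exactly two predictors."""
    r_squared = pearson(x1, x2) ** 2
    tolerance = 1 - r_squared
    vif = 1 / tolerance
    return tolerance, vif

# Hypothetical predictors with a correlation of 0.8.
tolerance, vif = tolerance_and_vif([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
print(tolerance, vif)
```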
How to fix collinearity?
Drop one predictor
Center variables (subtract their means)
What are interaction terms in regression?
Combine predictors to see if their joint effect differs from individual effects (e.g., X₁ * X₂).
When is polynomial regression used?
When the relationship between x and y is non-linear.
What does polynomial regression often involve?
Adding terms like X² or X³ to the model.
What is a Chi-Square test used for?
Testing frequencies of categorical variables against expected distributions.
What are contingency tables?
Tables showing frequencies for combinations of two or more categorical variables.
Assumptions of Chi-Square tests?
No more than 20% of expected frequencies < 5
Observations are independent
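A sketch of the chi-square statistic for a contingency table, including a check of the expected-frequency assumption. The function name and the 2 x 2 counts are hypothetical, and the p-value lookup is left out because the standard library has no chi-square distribution.

```python
def chi_square(table):
    """Chi-square statistic, df, expected counts, and share of expected counts below 5."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    cells = [exp for row in expected for exp in row]
    share_small = sum(exp < 5 for exp in cells) / len(cells)
    return statistic, df, expected, share_small

# Hypothetical counts: two habitats (rows) by presence/absence (columns).
statistic, df, expected, share_small = chi_square([[10, 20], [20, 10]])
print(statistic, df, share_small)
```

If `share_small` exceeds 0.20, the expected-frequency assumption above is violated and the test result should not be trusted.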
Why are contingency tables limited?
They cannot test interactions; a GLM (e.g., Poisson regression) or an ANOVA on percentages is a better option.
When is One-Way ANOVA used?
When testing the effect of 1 categorical predictor (2+ groups) on 1 continuous response variable.
What are the two goals of One-Way ANOVA?
Assess how much variation is explained by group differences.
Test whether group means are equal (H₀: all means equal).
What does the ANOVA table partition?
Total sum of squares (SStotal) = explained + residual variation
Between-group variation: df = number of groups - 1
Within-group (residual) variation: unexplained
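The partition in the ANOVA table can be sketched by hand. The function name and the three small groups are hypothetical; the p-value step is omitted because it needs an F distribution, which the standard library does not provide.

```python
def one_way_anova(groups):
    """Partition the total sum of squares and return the F ratio."""
    all_values = [v for group in groups for v in group]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(group) / len(group) for group in groups]

    # Between-group (explained) variation.
    ss_between = sum(
        len(group) * (m - grand_mean) ** 2 for group, m in zip(groups, group_means)
    )
    # Within-group (residual) variation.
    ss_within = sum(
        (v - m) ** 2 for group, m in zip(groups, group_means) for v in group
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return {
        "ss_between": ss_between, "ss_within": ss_within,
        "ss_total": ss_between + ss_within,
        "df_between": df_between, "df_within": df_within, "f": f_ratio,
    }

# Hypothetical data: three groups with three replicates each.
anova = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(anova)
```

The ratio `ss_between / ss_total` gives the proportion of variation explained by group differences, which addresses the first of the two goals above.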
What affects statistical power in One-Way ANOVA?
Increases with total sample size
Decreases as the number of groups increases (if sample size is fixed)
More replicates per group = more power
What are Post Hoc Tests (e.g., Tukey HSD) used for?
To determine which group means differ after a significant overall ANOVA.
What are the assumptions of One-Way ANOVA?
Normality
Homogeneity of variances (very important!)
No outliers
Independence