BILD 5 FINAL EXAM

Last updated 7:54 PM on 3/13/26

38 Terms

1

P-Value vs Effect Size

P-values indicate whether a difference is statistically significant, but don’t describe how large that difference is

  • don’t capture whether difference is biologically meaningful

Effect size: quantity that captures magnitude of the effect

  • significant result could have tiny effect size

  • non-significant result could have large effect size

2

Type I vs Type II errors

Type I: false positive, erroneously conclude there is a difference when in reality, none exists

  • reject the null when null is actually true

Type II: false negative, fail to detect a difference when one does exist

  • fail to reject the null when null is false

3

Alpha and Type I error

The significance level (alpha) sets the rejection threshold:

  • when alpha = 0.05, we only reject the null in the most extreme 5% of t-scores

  • when alpha = 0.01, we only reject the null in the most extreme 1% of t-scores

Lower alpha → stricter standard for rejecting the null

  • as we lower alpha, more cautious about rejecting null; reduces likelihood of type I error

Type I error rate = alpha!

  • alpha = probability of incorrectly rejecting the null
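The claim that the Type I error rate equals alpha can be checked with a quick simulation. This sketch is not from the course (which works in R): we repeatedly sample from a population where the null is TRUE and count how often a two-sided z-test with known sd rejects at alpha = 0.05. All numbers are illustrative.

```python
import random
from statistics import NormalDist, mean

random.seed(1)
alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # ~1.96 for alpha = 0.05
n, sims = 30, 4000

false_positives = 0
for _ in range(sims):
    sample = [random.gauss(0, 1) for _ in range(n)]  # null is true: mean = 0
    z = mean(sample) / (1 / n ** 0.5)                # known-sd z-test
    if abs(z) > z_crit:
        false_positives += 1                         # a Type I error

print(false_positives / sims)   # should land close to 0.05
```

The rejection rate hovers near alpha, as the card states.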

4

Trade-off between Type I and Type II error

Lowering alpha:

  • Good: more cautious about rejecting null → reduced type I error rate

  • Bad: might end up missing effect that DOES exist → increased type II error rate

With lower alpha, becomes harder to reject the null

  • less likely to reject null by mistake (reduce type I error)

  • makes it harder to reject null when you should (increase type II error)

Attempting to reduce one will increase the other, so total error can never equal 0; Type I and Type II errors are at odds with each other

5

6

Type II error and power + power analysis

Type II error (ß): if alternative is true, what’s the probability we miss the effect?

Power: if alternative is true, what’s the probability we detect the effect?

  • we want high power

  • study with low probability of type II error = high statistical power

    • power is opposite of type II error

Common target for statistical power is 80%:

  • if alternative is true (an effect DOES exist), we correctly detect it 80% of the time

Before conducting experiment, researchers perform power analysis:

  • before you start, you usually have some idea of expected effect size and SD

  • “what sample size do I need to achieve a desired power?”
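The power-analysis question above can be sketched with the usual normal-approximation formula for a two-group comparison. The effect size and SD below are placeholder assumptions; a real study plugs in its own estimates before data collection.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate n per group to detect a mean difference `delta`
    (normal approximation to the two-sample t-test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # ~0.84 for 80% power
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)

# Assumed effect of 0.5 with sd of 1.0 (i.e. a medium effect size):
print(sample_size_per_group(delta=0.5, sd=1.0))    # 63 per group
```

Note how the required n explodes as the assumed effect size shrinks.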

7

Factors affecting statistical power

Sample size

  • the larger the sample size, the higher the power

  • more likely we are to detect a difference if one exists

Significance level

  • higher the significance level, higher the power

  • if we increase alpha, don’t need as extreme of a t-score to reject the null

  • if alternative is true (truly is a difference), it is easier to discover it

    • Caveat: increasing alpha increases type I error

Effect size

  • higher the effect size, higher the power

  • easier to detect big difference compared to small difference

Standard deviation

  • higher the standard deviation, lower the power

    • large standard deviation: hard to detect an effect if it does exist

    • small standard deviation: easier to detect difference

8

The problem of multiple comparisons + Bonferroni correction

The more t-tests you do, the greater the chance of making at least one false positive

  • if significance level = 5%, each individual test has 5% chance of being a false positive (type I error)

Bonferroni Correction: divide alpha (e.g. 5%) by the number of comparisons to determine a Bonferroni-corrected alpha

  • ex. if you perform 10 t-tests, new alpha threshold is 0.05/10 = 0.005

    • for any individual comparison to be significant, p-value must be < 0.005
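The arithmetic on this card is easy to sketch: the familywise false-positive probability grows with the number of tests, and Bonferroni shrinks the per-test alpha to compensate. The numbers mirror the card's 10-test example.

```python
alpha, m = 0.05, 10   # 10 t-tests at the usual 5% level

# Chance of at least one false positive across m independent tests:
familywise = 1 - (1 - alpha) ** m
print(round(familywise, 3))   # ~0.401: a 40% chance of >= 1 Type I error

# Bonferroni-corrected per-comparison threshold:
bonferroni_alpha = alpha / m
print(bonferroni_alpha)       # 0.005, matching the card's 0.05 / 10
```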

9

P-hacking and Preregistration

P-hacking: repeatedly re-analyzing your data until you find a statistically significant result

  • undermines reproducibility of scientific research

  • perform more comparisons = significantly increase chance of at least one type I error (false positive)

Preregistration: specifying your study design, hypotheses, and analysis plans before collecting data

  • holds yourself accountable to original research objectives, reduces temptation to p-hack

  • avoids manipulating data to fit the narrative you want

10

Analysis of Variance (ANOVA)

Compares all the means simultaneously in one test (used for 3+ groups):

  • Null: the mean is the SAME across all groups

  • Alternative: the mean is NOT the same

To perform an ANOVA, R calculates:

  • an F statistic

  • a p-value, which tells us if F statistic large enough to be statistically significant

    • large F statistic = big deviation from null = low p-value

11

Variance and the F-statistic

Variance within each group: members of each group will differ due to random variation

Variance between groups: how different are sample means from each other?

F statistic = variance between groups / variance within groups

  • if F-statistic much greater than 1:

    • reject the null!

    • variance between groups is high and variance within groups is small; within each group, people are relatively similar

  • larger the F statistic, more likely we are to reject the null

  • smaller F statistic suggests our data is more consistent with the null
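The course computes this with R's aov, but the F ratio itself is simple enough to sketch in pure Python. The three groups below are invented toy data with well-separated means, so F lands far above 1.

```python
from statistics import mean

groups = [[4.1, 4.3, 3.9], [6.0, 6.2, 5.8], [5.0, 5.1, 4.9]]  # toy data
k = len(groups)
N = sum(len(g) for g in groups)
grand = mean(x for g in groups for x in g)

# Variance between groups (how far each group mean sits from the grand mean):
ms_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups) / (k - 1)
# Variance within groups (random variation among members of each group):
ms_within = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups) / (N - k)

F = ms_between / ms_within
print(F)   # much greater than 1 here, consistent with rejecting the null
```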

12

Analysis of Variance, R outputs

If the p-value is 1.99e-12:

  • probability of getting an F-statistic as large as ours, IF NULL IS TRUE

  • since p-value is < 0.05, we reject the null: at least one group is statistically different from the others

However, ANOVA by itself doesn’t tell you which group(s) is/are different

13

Tukey’s post-hoc Test and Interpretation

Post-hoc test: statistical test performed after initial test (ie, ANOVA) to find which pairs of groups are significantly different

Tukey’s post-hoc test: most common after ANOVA, if you reject the null

  • first, run ANOVA; if reject null:

  • run post-hoc test

    • TukeyHSD(result)

    • HSD stands for honestly significant difference

If the p-value for a pair of groups is < 0.05, there is a significant difference!

  • draw significance bars to indicate statistical significance between groups

14

Verifying normality in ANOVA

For ANOVA, we try to verify the data meet the assumptions using AS FEW tests as possible

  • within each group, determine the residuals

    • Residuals: difference between each data point and its group mean

    • analyzing residuals can show non-normality

Steps:

  • 1) use aov function and save output to object

  • 2) extract residuals as vector

  • 3) make plot of residuals

15

Non-parametric tests

Sometimes, your data may never meet the assumptions of a t-test/ANOVA (parametric test) regardless of transforming

  • can use non-parametric tests, which do not assume normality

    • tests other attributes of the data, e.g. median, rank order

    • generally have lower power

Paired t-test → Wilcoxon signed rank test

2-sample t-test → Mann-Whitney U test

ANOVA with Tukey’s post-hoc test → Kruskal-Wallis test with Dunn’s post-hoc test
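As a sketch of the rank-based idea, here is the U statistic behind the Mann-Whitney U test computed by hand (the course runs these tests in R; the data are invented, and no p-value is computed here).

```python
def mann_whitney_u(xs, ys):
    """Count, over all pairs, how often xs beats ys (ties count half).
    No assumption of normality: only the ordering of values matters."""
    u = 0.0
    for x in xs:
        for y in ys:
            if x > y:
                u += 1
            elif x == y:
                u += 0.5
    return u

a, b = [1, 3, 5], [2, 4, 6]
print(mann_whitney_u(a, b))   # 3.0
```

A handy sanity check: U(a, b) + U(b, a) always equals len(a) * len(b).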

16

Covariance

After finding (x - mean)(y - mean) for every data point, we calculate the average

  • more points in positive quadrants, positive covariance = positive association

  • more points in negative quadrants, negative covariance = negative association

  • equally many points in both positive and negative quadrants = zero association

Only sign of covariance is meaningful:

  • same data, different units → different covariance

  • covariance changes even though underlying relationship between variables did not change

17

Pearson correlation coefficient

Normalize covariance by using r, called Pearson correlation coefficient

  • is not sensitive to units

  • r tells us if two variables rise and/or fall together

Does NOT imply a causal relationship between two variables!

  • r only tells us whether two variables rise and/or fall together
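A small sketch of the last two cards: rescaling x (changing its units) scales the covariance but leaves r untouched. The data are invented.

```python
from statistics import mean, pstdev

def covariance(xs, ys):
    mx, my = mean(xs), mean(ys)
    return mean((x - mx) * (y - my) for x, y in zip(xs, ys))

def pearson_r(xs, ys):
    # Normalize covariance by both standard deviations -> unit-free r
    return covariance(xs, ys) / (pstdev(xs) * pstdev(ys))

x = [1, 2, 3, 4, 5]
y = [2.0, 4.1, 5.9, 8.2, 9.8]       # roughly y = 2x
x_cm = [xi * 100 for xi in x]        # same data, different units

print(covariance(x, y), covariance(x_cm, y))   # covariance scales by 100
print(pearson_r(x, y), pearson_r(x_cm, y))     # r is identical
```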

18

Correlation vs Regression

Correlation: goal is to quantify the association between two numerical variables

  • NOT trying to fit a line to the data

  • JUST seeing if two variables rise/fall together

Regression: goal is to predict the value of one variable from another

  • clearly defined independent and dependent variable

  • we ARE trying to fit a straight line to the data

19

Linear model/regression model

Goal: predict the dependent variable from the independent variable

  • draws straight line through the data, creates equation y = mx + b

Sample statistics (b1 and b0): estimate the population parameters (β1 and β0)

class 20

20

Least Squares Approach

Residual: difference between each y-value and model’s prediction

  • across all data points, calculate sum of squared residuals

  • better the model fits our data, the smaller the sum of squared residuals

    • y-value should be closer to model’s prediction

Best fit line (aka regression line): one that minimizes sum of squared residuals

  • say this line was obtained using the “least squares approach”
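The least-squares fit can be sketched directly: the slope that minimizes the sum of squared residuals is cov(x, y) / var(x), and the line passes through (mean x, mean y). The numbers below are invented stand-ins, not the course data.

```python
from statistics import mean

x = [1.0, 2.0, 3.0, 4.0, 5.0]   # e.g. an independent variable
y = [2.9, 5.2, 6.8, 9.1, 11.0]  # e.g. a dependent variable

mx, my = mean(x), mean(y)
b1 = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
      / sum((xi - mx) ** 2 for xi in x))   # least-squares slope
b0 = my - b1 * mx                          # intercept via the means

# Residuals: difference between each y-value and the model's prediction
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
ss_res = sum(r ** 2 for r in residuals)    # sum of squared residuals
print(b1, b0, ss_res)
```

Any other slope/intercept pair would give a larger ss_res; that is what "least squares" means.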

21

Slope of the regression line

Slope is the biology!

  • ex. plant mass = 0.92 x nitrogen levels + 10.7

  • for every 1 ppm increase in nitrogen, plant mass increases by 0.92g

Y-intercept has no biological significance

  • you should avoid extrapolating too far beyond the data

The better our model, the lower SS(model) is compared to SS(mean)

  • SS measures error of the model

  • Lower SS(model) → our model’s predictions are closer to the actual data

class 20

22

R² and its meaning

R²: proportion of the total variance of y that is explained by x

The sum of squares went down from 16.05 → 8.58, a 47% decrease!

  • there is a 47% decrease in error when we take nitrogen levels into account

  • 47% of the total variance in plant mass is explained by the soil nitrogen content

SS(mean) is often called SS(total)

  • represents the total variability in the y variable

SS(model) is often called SS(residuals)

  • represents the sum of squared residuals of the final model
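Using the card's own numbers, the R² arithmetic is:

```python
ss_total = 16.05   # SS(mean): error when predicting every y with the mean
ss_model = 8.58    # SS(model): error left after fitting the regression line

# R² = proportion of total variance in y explained by x
r_squared = 1 - ss_model / ss_total
print(round(r_squared, 2))   # 0.47 -> 47% of the variance explained
```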

23

Null distribution of b1 values + comparing t-score with the null

Theoretical distribution of all possible b1 values IF THE NULL IS TRUE

  • due to random sampling, even when the null is true, the slope of our sample (b1) won’t be exactly zero

  • the larger the magnitude of the t-score, the more likely we are to reject the null

Comparing our t-score with the null

  • if null is true, we expect the t-score to be close to zero

  • p-value: probability of getting OUR slope / OUR t-score or a more extreme value, IF the null is true

  • if p < alpha, REJECT the null

24

Interpreting linear regression in R

Only care about the slope!

Since the p-value of the bite slope (11.677) is 0.0393:

  • We reject the null, there is a relationship that exists between the variables

  • IF THE NULL WERE TRUE, we have a 0.039 chance of getting a t-score (or a slope) as extreme as ours

  • p < alpha

25

Overfitting

When the model tries TOO hard to match the data

  • even if the model perfectly describes the data we have, it will perform poorly on new, unseen data

  • we never expect R² to be 1

  • if you draw a squiggly line, unlikely that it is the true relationship

26

Comparing Residuals

Residuals: difference between each measurement and model

  • when relationship is truly linear, some points are above and some are below the regression line

    • there is no pattern to the residuals; are normally distributed

  • when relationship is non-linear, there is a pattern to the residuals

    • residuals are NOT normal

27

Transforming residuals for regression

1) Plot the residuals and run a KS test on the residuals

  • if normal, linear regression is valid

  • if non-normal:

2) make a histogram of the x and y variables

  • is one variable clearly skewed?

  • is one variable mostly normal but has a clear outlier?

    • transform skewed variable or remove the outlier

3) run a KS test on the new model residuals

28

Chi-squared goodness-of-fit test

Used to perform hypothesis tests on categorical data:

Step 1: calculate a chi-squared (χ²) test statistic

Step 2: predict the null distribution of χ²

Step 3: compare our value of χ² to the null distribution

Basically, compare observed frequency of category to expected frequency of category
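Step 1 can be sketched in a few lines. The observed/expected counts below are invented (a hypothetical 3:1 expected ratio over 100 offspring), not the course data.

```python
observed = [84, 16]
expected = [75, 25]   # expected under a 3:1 ratio of 100 offspring

# Chi-squared statistic: sum of (observed - expected)^2 / expected
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(chi_sq)   # (84-75)^2/75 + (16-25)^2/25 = 1.08 + 3.24 = 4.32
```

A larger χ² means the observed frequencies deviate more from the expected ones.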

29

Statistical Tests Summary

One sample t-test: 1 numerical

Paired t-test: two categories; 1 numerical, 1 categorical

Two-sample t-test: two categories; 1 numerical, 1 categorical

ANOVA: 3+ categories; 1 numerical, 1 categorical

Linear regression: 2 numerical

Chi-squared goodness of fit: 1 categorical

Correlation is also appropriate for quantifying association – it does NOT attempt to predict Y from X

30

Manipulative vs Natural Experiments

Manipulative: experimenter actively controls the independent variable

  • more directly establishes causality between independent and dependent variables

  • however, might not be practically and/or ethically feasible

Natural/Observational: experimenter relies on pre-existing differences in independent variable

  • challenging to implement a control group, random assignment of subjects to treatment vs control groups, and blinding

31

Random vs Systematic Error

Random: fluctuations in our data that occur just by chance

  • ie natural variation in blood pressure between individuals

Systematic: error that consistently skews our data in one direction

  • ie meditation group starts to eat more healthily and exercise more

  • includes confounding variables, non-random assignment, experimental bias

32

Minimizing systematic error

Control groups: isolate the effect of the treatment (IV) on DV

  • to minimize confounding variables, keep control group maximally similar to treatment group

Random assignment: each experimental subject should have equal chance of being assigned to treatment or control group

Blinding: if participants or researcher are aware of treatment, may consciously or subconsciously influence their behavior

  • Single-blind: participants unaware if they have been assigned to treatment or control

  • Double-blind: BOTH participants AND researchers are unaware who is assigned to treatment or control

33

Sample Size and Random Error

If we only give each cream to one person, exposes us to LOTS of random error

  • any observed difference could be significantly affected by natural variability between the two people

When you increase sample size, you give the skin creams to more people

  • increasing sample size reduces the influence of random error

    • averages out random variation across more people

  • any one individual now exerts a weaker pull on the overall mean

When increasing sample size, standard error of the mean decreases!

  • smaller n → larger SE

  • larger n → smaller SE

    • standard error is a measure of the random error!
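The relationship above can be made concrete with SE = sd / √n (sd = 10 is an arbitrary illustration):

```python
from math import sqrt

sd = 10.0   # assumed population standard deviation
for n in [4, 25, 100]:
    # Standard error of the mean shrinks as sample size grows
    print(n, sd / sqrt(n))   # SE: 5.0, 2.0, 1.0
```

Quadrupling n halves the SE: random error falls off with the square root of sample size, not linearly.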

34

Experimental Unit + Between-subjects vs. within-subjects

An individual or group that is assigned a treatment independently of every other unit

  • Between-subjects / between-groups: each experimental unit experiences EITHER one condition OR the other

    • unpaired

    • independent groups design

  • Within-subjects / within-groups: each experimental unit experiences BOTH conditions

    • paired

    • repeated measures design

35

Sample size and replicates (biological vs technical)

Sample Size: total number of experimental units across entire experiment

Biological replicates: number of experimental units that experience each condition

  • captures biological variability

Technical replicates: number of measurements you take per experimental unit

  • increases precision of measurement

  • does NOT capture any additional biological variation

36

Pseudoreplication and how to avoid it

Pseudoreplication: when you erroneously report the sample size as being higher than it actually is

  • ex. if you take 4 measurements per plant, you think sample size = 6 plants x 4 measurements = 24

    • not correct! 4 measurements are technical replicates

To avoid pseudoreplication:

  • Ask yourself: how do I increase sample size?

    • add more plants!

      • adding more experimental units / biological replicates

    • DO NOT take more measurements from each plant!

      • adding more technical replicates!

  • increasing sample size = capturing more biological variability

37

Measurement Validity

Does my experimental system actually address the research question that I’m asking?

To validate, you need additional control groups:

  • Negative control: checks if experimental system can detect a lack of change when we expect it to

    • ie mice receive an injection of DMSO solvent WITHOUT a supplement

    • establish baseline comparison WITHOUT any treatment

  • Positive control: checks if experimental system can detect a change when we expect it to

    • ie mice receive an injection of a supplement KNOWN to raise blood insulin levels

    • verifies our system CAN detect a positive result

For both positive and negative controls, we KNOW what to expect

  • check experimental system is behaving as it should

38

Comparing positive and negative controls

If there is no difference between the positive and negative controls, this undermines the validity of our data.

  • injection was faulty?

  • technique for measuring insulin is broken or not sensitive enough?

  • incorrect dosage – all supplements are delivered at too low a concentration?
