Biostatistics Exam 3

66 Terms

New cards

Two-sample design

Two groups
Each treatment group composed of independent, random sample units
Wild type vs. control, drug vs. placebo, where treatments are applied to separate and independent samples

New cards

Paired design

Two groups
Each sampled unit receives both treatments
More powerful because it controls for variation among sampling units
Not as common
Two measurements from the same sampling unit are converted to a single measurement by taking the difference between them
Examples: patient weight before and after hospitalization; effects of sunscreen on one arm vs. placebo on the other arm; effects of environment on identical twins raised under different socioeconomic conditions

New cards

Estimating mean difference

Estimated from the sample of paired differences d_i
- d_i = after - before for each sampling unit

New cards

Paired t-test

Used to test the null hypothesis that the mean difference of paired measurements equals a specific value
H₀: mean change in antibody production after testosterone implants was 0 (mu_d=0)
H_A: mean change in antibody production after testosterone implants was not 0

New cards

Paired t-test statistic

Once paired samples are reduced to a single measurement (d), the calculation is the same as for a one-sample t-test
Can determine P with computer or statistical table
Testosterone example: fail to reject the null hypothesis that the mean change in antibody production after the testosterone implant is 0
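
A minimal sketch of the paired t-statistic, assuming the usual one-sample formula applied to the differences: t = (mean of d - mu_d0) / (s_d / sqrt(n)), with df = n - 1. The function name and the before/after numbers are made up for illustration and are not from the testosterone study.

```python
import math
import statistics

def paired_t_statistic(before, after, mu_d0=0.0):
    """Return (mean difference, standard error, t, df) for paired data."""
    if len(before) != len(after):
        raise ValueError("paired data need one 'before' and one 'after' per unit")
    # Reduce each pair to a single measurement: d = after - before
    d = [a - b for a, b in zip(after, before)]
    n = len(d)
    d_bar = statistics.mean(d)
    se = statistics.stdev(d) / math.sqrt(n)  # standard error of the mean difference
    t = (d_bar - mu_d0) / se                 # same form as a one-sample t-test
    return d_bar, se, t, n - 1

# Hypothetical measurements on four sampling units
before = [10.0, 12.0, 9.0, 11.0]
after = [12.0, 13.0, 12.0, 13.0]
d_bar, se, t, df = paired_t_statistic(before, after)
print(f"mean d = {d_bar:.2f}, t = {t:.3f}, df = {df}")
```

The P-value would then come from a t table or software using the returned t and df.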

New cards

Assumptions of paired t-test

Sampling units are randomly sampled from the population
Paired differences have a normal distribution in the population

New cards

Formal tests of normality

H₀: sample comes from a population with a normal distribution
H_A: sample comes from a population that does not have a normal distribution
Should be used with caution
- Small sample sizes lack power to reject a false null (Type II error)
- Large sample sizes can reject null when the departure from normality is minimal and would not affect methods that assume normality

New cards

Shapiro-Wilk test

Evaluates the goodness of fit of a normal distribution to a set of data randomly sampled from a population
Most commonly used formal test of normality
Estimates mean and standard deviation using sample data
Tests goodness-of-fit between sample data and normal distribution (with mean, sd of the sample)

New cards

Pooled sample variance (s²p)

The average of the variances of the samples weighted by their degrees of freedom
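
The definition above can be written as s²p = (df₁·s₁² + df₂·s₂²) / (df₁ + df₂), with df = n - 1 for each sample. A short sketch, with a made-up function name and made-up data:

```python
import statistics

def pooled_variance(sample1, sample2):
    """Average of the two sample variances, weighted by degrees of freedom."""
    df1 = len(sample1) - 1
    df2 = len(sample2) - 1
    s1_sq = statistics.variance(sample1)  # sample variance (n - 1 denominator)
    s2_sq = statistics.variance(sample2)
    return (df1 * s1_sq + df2 * s2_sq) / (df1 + df2)

# Hypothetical samples: variances 1.0 (df = 2) and 20/3 (df = 3)
group1 = [1.0, 2.0, 3.0]
group2 = [2.0, 4.0, 6.0, 8.0]
sp2 = pooled_variance(group1, group2)
print(f"pooled variance = {sp2:.2f}")
```

Because group2 has more degrees of freedom, its variance pulls the pooled value further than group1's does.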

New cards

Two-sample t-test

Simplest test to compare the means of a numerical variable between two independent groups
Most commonly…
- H₀: mu₁=mu₂
- H_A: mu₁ does not equal mu₂
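
A sketch of the two-sample t-statistic, assuming the standard pooled-variance form: SE = sqrt(s²p·(1/n₁ + 1/n₂)), t = (mean₁ - mean₂) / SE, df = n₁ + n₂ - 2. The function name and the data are hypothetical.

```python
import math
import statistics

def two_sample_t(sample1, sample2):
    """Return (t, df) for the standard two-sample t-test with pooled variance."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    # Pooled variance: sample variances weighted by their degrees of freedom
    sp2 = ((n1 - 1) * statistics.variance(sample1)
           + (n2 - 1) * statistics.variance(sample2)) / df
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))  # SE of the difference in means
    t = (statistics.mean(sample1) - statistics.mean(sample2)) / se
    return t, df

# Hypothetical independent samples
wild_type = [5.0, 6.0, 7.0]
control = [1.0, 2.0, 3.0]
t_two, df_two = two_sample_t(wild_type, control)
print(f"t = {t_two:.3f}, df = {df_two}")
```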

New cards

Assumptions of two-sample t-test

Each of two samples is a random sample from its population
Numerical variable is normally distributed in each population
- Robust to minor deviations from normality
- Need to run Shapiro-Wilk test on both samples
Standard deviation (and variance) of the numerical variable is the same in both populations
Robust to some deviation from this if sample sizes of two groups are approximately equal

New cards

Formal tests of equal variance

An F-test is sometimes used, but it is highly sensitive to departures from the assumption that the measurements are normally distributed in the population
Levene’s test performs better and is recommended
- H₀: variances of the two groups are equal
- H_A: variances of the two groups are not equal
- Can be extended to more than two groups

New cards

What if variances in two groups are not equal?

Standard t-test works well if both sample sizes are greater than 30 and there is less than a 3-fold difference in standard deviations
Welch’s t-test compares the means of two groups and can be used even when the variances of the two groups are not equal
- Slightly less power compared to standard t-test
- Should be used when the sample standard deviations are substantially different
- Formulae are different from those of the standard two-sample t-test
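
To show how the formulae differ, here is a sketch of Welch's statistic, assuming the usual form: each group keeps its own variance, SE = sqrt(s₁²/n₁ + s₂²/n₂), and the degrees of freedom come from the Welch-Satterthwaite approximation (usually not a whole number). Names and data are hypothetical.

```python
import math
import statistics

def welch_t(sample1, sample2):
    """Return (t, approximate df) for Welch's t-test (no pooled variance)."""
    n1, n2 = len(sample1), len(sample2)
    v1 = statistics.variance(sample1) / n1  # squared SE contribution, group 1
    v2 = statistics.variance(sample2) / n2  # squared SE contribution, group 2
    t = (statistics.mean(sample1) - statistics.mean(sample2)) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Hypothetical samples with clearly different spreads
low_spread = [1.0, 2.0, 3.0]
high_spread = [2.0, 4.0, 6.0, 8.0]
t_welch, df_welch = welch_t(low_spread, high_spread)
print(f"t = {t_welch:.3f}, df = {df_welch:.2f}")
```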

New cards

Correct sampling units

When comparing the means of two groups, an assumption is that the samples being analyzed are random samples
Often, repeated measurements are taken on each sampling unit
Makes the identification of independent units more challenging

New cards

Fallacy of indirect comparison

Compare each group mean to hypothesized value rather than comparing group means to each other
Comparisons between two groups should always be made directly, not indirectly by comparing both to the same hypothesized value
Ex. of the fallacy: concluding that because group 1 is significantly different from zero but group 2 is not, groups 1 and 2 must be significantly different from each other

New cards

Interpreting overlap in confidence intervals

Papers often report means and confidence intervals for two or more groups without running a two-sample t-test

New cards

t-distribution

Similar to standard normal distribution (Z), but with fatter tails
As the sample size increases the t distribution becomes more like the standard normal distribution
Has critical values like Z
Get values with computer or table

New cards

One-sample t-test

Compares the mean of a random sample from a normal population with the population mean proposed in a null hypothesis
H₀: the true mean equals mu₀
H_A: the true mean does not equal mu₀

New cards

Interpreting t-statistic

Compute P-value: probability of this t-statistic (or more extreme) given the null hypothesis is true
Using a stats table
- Look up critical t-value
- If the observed value is within the range of -/+ the critical value, the data are consistent with a true null
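
A sketch of the table-lookup approach, assuming a two-sided test. The critical value is passed in by the reader after looking it up; 3.182 is the standard two-sided table value for alpha = 0.05 with df = 3. The function name and the data are made up.

```python
import math
import statistics

def one_sample_t(sample, mu_0, critical_t):
    """Return (t, df, reject) using a critical value looked up in a t table."""
    n = len(sample)
    se = statistics.stdev(sample) / math.sqrt(n)  # standard error of the mean
    t = (statistics.mean(sample) - mu_0) / se
    reject = abs(t) > critical_t                  # outside -/+ critical value?
    return t, n - 1, reject

# Hypothetical sample of four measurements, null mean mu_0 = 5
sample = [4.0, 6.0, 5.0, 7.0]
t_one, df_one, reject = one_sample_t(sample, mu_0=5.0, critical_t=3.182)
print(f"t = {t_one:.3f}, df = {df_one}, reject H0: {reject}")
```

Here t falls inside the -/+ critical range, so the data are consistent with the null.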

New cards

Increasing sample size

Increasing sample size reduces standard error of mean
- Uncertainty of estimate of mean
Larger sample sizes increase probability of rejecting a false null hypothesis (power)
If this null is really false, then the sample of 25 failed to detect a false null (Type II error)
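
A quick numeric illustration of why larger samples help, using SE = s / sqrt(n). The standard deviation of 10 is an arbitrary value chosen for the example.

```python
import math

def standard_error(sd, n):
    """Standard error of the mean for sample standard deviation sd and size n."""
    return sd / math.sqrt(n)

# Quadrupling the sample size halves the standard error
se_by_n = {n: standard_error(10.0, n) for n in (25, 100, 400)}
for n, se in se_by_n.items():
    print(f"n = {n:3d}  SE = {se:.2f}")
```

A smaller SE makes the t-statistic larger for the same true difference, which is where the gain in power comes from.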

New cards

Assumptions of one-sample t-test

Data are a random sample from the population
Variable is normally distributed in the population
- Few variables in biology are exact match to normality
- But in many cases the test is robust to departures from normality

New cards

Estimating other statistics

Emphasis on estimating the mean of a normal population
Spread of the sample distribution (standard deviation or variance)
Confidence limits for variance are based on the χ² distribution

New cards

Assumptions for calculating confidence intervals for variance

Random sample from the population
Variable must have normal distribution
- Formulas are not robust to departures from normality

New cards

Normal quantile plot

Compares each observation in the sample with its quantile expected from the standard normal distribution. Points fall roughly along a straight line if the data come from a normal distribution

New cards

Ignoring violations of normality

t-tests assume data are drawn from a population that has a normal distribution
But can sometimes be used when data are not normal
- Central limit theorem
When sample sizes are large the sampling distribution of means behaves roughly as assumed by t-distribution
How large a sample is needed depends on the shape of the distribution
- If distributions of two groups being compared are skewed in different directions, then avoid t-tests even for large samples
- If distributions are similarly skewed then there is more leeway

New cards

Ignoring violations of equal standard deviations

Two-sample t-tests assume equal standard deviations in the two populations
If sample sizes are >30 in each group AND sample sizes in the two groups are about equal, then even up to a 3x difference in standard deviations can be ok
Otherwise, use Welch’s t-test

New cards

Data transformations

A data transformation changes each measurement by the same mathematical formula
Can make standard deviations more similar and improve fit of the normal distribution to the data
All observations must be transformed
If two samples then they both must be transformed in same way
If used then usually best to back transform confidence intervals

New cards

Log transformation

Data are transformed by taking the natural log (ln) or sometimes log base-10 of each measurement
Common uses:
- Measurements are ratios or products
- Frequency distribution skewed to the right
- Group having larger mean also has larger standard deviation
- Data span several orders of magnitude
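
A small sketch of a log transformation and back-transformation, assuming natural logs and strictly positive data. The measurements are made up and span two orders of magnitude. Back-transforming the mean of the logs gives the geometric mean, not the arithmetic mean; confidence limits computed on the log scale would be back-transformed with exp() in the same way.

```python
import math

def mean_on_log_scale(values):
    """Return (mean of ln(values), back-transformed mean). Needs values > 0."""
    logs = [math.log(v) for v in values]  # transform every observation
    mean_log = sum(logs) / len(logs)
    return mean_log, math.exp(mean_log)   # exp() undoes the natural log

# Hypothetical right-skewed measurements
measurements = [1.0, 10.0, 100.0]
mean_log, back_transformed = mean_on_log_scale(measurements)
arithmetic_mean = sum(measurements) / len(measurements)
print(f"mean of logs = {mean_log:.3f}")
print(f"back-transformed = {back_transformed:.1f}, arithmetic mean = {arithmetic_mean:.1f}")
```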

New cards

Other transformations

Arcsine (best use: proportions)
Square-root (counts)
Square (skewed left)
Antilog (skewed left)
Reciprocal (skewed right)

New cards

Nonparametric alternatives

A nonparametric method makes fewer assumptions than standard parametric methods do about the distributions of the variables
- Can be used when deviations from normality should not be ignored, and sample remains non-normal even after transformation
Do not rely on parametric statistics like mean, standard deviation, variance
Usually based on ranks of the data points rather than the actual values

New cards

Sign test

Compares the median of a sample to a constant specified in the null hypothesis. It makes no assumptions about the distribution of the measurement in the population
Each measurement is characterized as above (+) or below (-) the value specified in the null hypothesis
If the null is true, then you expect half the measurements to be + and half to be -
Uses the binomial distribution to test whether the proportion of measurements above the null value is p = 0.5
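
A sketch of the sign test as a two-sided binomial test with p = 0.5. One assumption not stated in the card: measurements exactly equal to the null median are dropped, which is a common convention. The function name and the data are hypothetical.

```python
from math import comb

def sign_test_p_value(values, null_median):
    """Two-sided sign test P-value from the binomial distribution (p = 0.5)."""
    plus = sum(v > null_median for v in values)   # measurements scored +
    minus = sum(v < null_median for v in values)  # measurements scored -
    n = plus + minus                              # ties with the null are dropped
    k = min(plus, minus)
    # Probability of k or fewer of the rarer sign when p = 0.5
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)                     # double for a two-sided test

# Hypothetical data: 7 of 8 measurements fall above a null median of 10
data = [12.0, 15.0, 11.0, 9.0, 14.0, 13.0, 16.0, 18.0]
p_sign = sign_test_p_value(data, null_median=10.0)
print(f"P = {p_sign:.4f}")
```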

New cards

Wilcoxon signed-rank test

More power than the standard sign test because it uses information about how far each data point is from the null value
But test assumes that population is symmetric around the median (i.e., no skew)
Nearly as restrictive as normality assumption, thus not recommended

New cards

Mann-Whitney U-Test

Nonparametric test for two samples
Compares the distributions of two groups. It does not require as many assumptions as the two-sample t-test
What if you have tied ranks?
- Assign all instances of the same measurement the average of ranks that the tied points would have received
Mann-Whitney U-test further explained from office hours:
Used when a two-sample t-test cannot be done because the data are not normally distributed
E.g., group 1 and group 2 are both right skewed and fail the Shapiro-Wilk test, but group 2’s distribution looks a bit different
Takes all data points and ranks them from low to high
- Null is that the distribution of ranks is equal in group 1 and group 2
  - Visually, the black and green points would be sprinkled evenly along the line
E.g., now imagine a right skew for group 1 and a left skew for group 2
- The lowest ranks are only black, then some green, and the highest ranks are mostly or all green with little black
- This would support the alternative hypothesis that the distribution of ranks is not equal
  - p<0.05
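
A sketch of the ranking step, including average ranks for ties, and of the U statistics. It assumes the common textbook form U₁ = n₁n₂ + n₁(n₁+1)/2 - R₁ and U₂ = n₁n₂ - U₁, where R₁ is the sum of ranks in group 1. Function names and data are hypothetical; the P-value would still come from a table or software.

```python
def average_ranks(values):
    """Rank values from low to high; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_u(group1, group2):
    """Return (U1, U2) computed from the pooled ranks of both groups."""
    n1, n2 = len(group1), len(group2)
    ranks = average_ranks(list(group1) + list(group2))
    r1 = sum(ranks[:n1])  # rank sum for group 1
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    return u1, n1 * n2 - u1

# Hypothetical groups with a three-way tie at the value 2
group_black = [1, 2, 2]
group_green = [2, 3, 4]
pooled_ranks = average_ranks(group_black + group_green)
u1, u2 = mann_whitney_u(group_black, group_green)
print(pooled_ranks, u1, u2)
```

The three tied 2s would have received ranks 2, 3 and 4, so each is assigned 3.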

New cards

Assumptions of nonparametric tests

Still assume that both samples are random samples from their populations
Wilcoxon signed-rank test assumes distributions are symmetrical (big limitation, not recommended)
Rejecting the null hypothesis of the Mann-Whitney U-test means the two groups have different distributions of ranks, but does not necessarily imply that the means or medians of the groups differ
To make this inference there is an assumption that the shapes of the distributions are similar

New cards

ANOVA

Analysis of variance (ANOVA) compares the means of multiple groups simultaneously in a single analysis
- Tests for variation of means among groups
H₀: mu₁ = mu₂ = mu₃ = … = mu_k
H_A: mean of at least one group is different from at least one other group
Null assumption that all groups have the same true mean is equivalent to saying that each group sample is drawn from the same population
But each group sample is bound to have a different mean due to sampling error
ANOVA determines if there is more variance among sample means than we would expect by sampling error alone
- Two measures of variation
Test statistic is a ratio:
- True null: MS_groups/MS_error = 1
- False null: MS_groups/MS_error > 1

New cards

Group mean square (MS_groups)

Proportional to the observed amount of variance among group sample means
- Variation among groups

New cards

Error mean square (MS_error)

Estimates the variance among subjects that belong to each group
- Variation within groups

New cards

ANOVA calculations

Sums of squares (SS) quantify two sources of variation (among and within groups)
Grand mean
- Equal to a constant
Group mean square (MS_groups) is variation among groups
Error mean square (MS_error) is variation among individuals in the same group
F-ratio test statistic
- Has pair of degrees of freedom
  - Numerator and denominator
- Use to calculate P-value with stats table or computer

New cards

Variation explained

R² measures the fraction of variation in Y that is explained by group differences
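
A sketch that ties the ANOVA calculations together, assuming the standard one-way formulas: SS_groups = sum of n_i·(group mean - grand mean)², SS_error = sum of squared deviations from each group's own mean, MS = SS / df, F = MS_groups / MS_error, and R² = SS_groups / SS_total. The function name and the three groups are made up.

```python
def one_way_anova(groups):
    """Return F, its (numerator, denominator) df, and R² for a one-way ANOVA."""
    all_values = [y for g in groups for y in g]
    n_total, k = len(all_values), len(groups)
    grand_mean = sum(all_values) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation among groups
    ss_groups = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation within groups
    ss_error = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    ms_groups = ss_groups / (k - 1)
    ms_error = ss_error / (n_total - k)
    return {
        "F": ms_groups / ms_error,
        "df": (k - 1, n_total - k),
        "R2": ss_groups / (ss_groups + ss_error),
    }

# Hypothetical data for three groups
result = one_way_anova([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [6.0, 7.0, 8.0]])
print(result)
```

The F-ratio and its pair of degrees of freedom are what you would take to a table or software to get the P-value.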

New cards

Assumptions of ANOVA

Measurements in every group represent a random sample from the corresponding population
Variable is normally distributed in each of the k populations
- Robust to deviations, particularly when sample size is large
Variance is the same in all k populations
- Robust to departures if sample sizes are large and balanced, and there is no more than a 10-fold difference in variance among groups

New cards

Alternatives when ANOVA assumptions are violated

Test normality with Shapiro-Wilk and test equal variances with Levene’s test
Data transformations can make data more normal and variances more equal
Nonparametric alternative: Kruskal-Wallis test
- Similar principle as Mann-Whitney U-test

New cards

Planned comparisons

A planned comparison is a comparison between means planned during the design of the study, identified before the data are examined
In circadian clock follow-up study, the planned (a priori) comparison was difference in means between knee and control group

New cards

Unplanned comparisons

Comparisons are unplanned if you test for differences among all means
Problem of multiple tests (increasing probability of Type I error) should be accounted for
With the Tukey-Kramer method the probability of making at least one Type I error throughout the course of testing all pairs of means is no greater than the significance level

New cards

Tukey-Kramer method

Works like a series of two-sample t-tests, but with a higher critical value to limit the Type I error rate
- Because multiple tests are done, the adjustment makes it harder to reject the null

New cards

Kruskal-Wallis post-hoc test

Suppose that your data…
- Fail normality even after transformation
- Generate a significant Kruskal-Wallis result
So the interpretation is that the distribution of ranks differs for at least one group. But which one?
Should not use Tukey-Kramer, which is a parametric test

New cards

Dunn’s test

The appropriate post-hoc analysis of groups following a significant Kruskal-Wallis result
- Will compare all possible pairs of groups while controlling for multiple tests

New cards

Correlation

When two numerical variables are associated, they are correlated

New cards

Correlation coefficient

The correlation coefficient measures the strength and direction of the association between two numerical variables
- AKA linear correlation coefficient or Pearson’s correlation coefficient
Correlation coefficient (statistic), r
Population correlation coefficient (parameter), ρ
Ranges from -1 to 1
Possible that two variables can be strongly associated but have no correlation (r=0)
- Non-linear association
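
A sketch of Pearson's r, assuming the standard formula r = Σ(X - X̄)(Y - Ȳ) / sqrt(Σ(X - X̄)² · Σ(Y - Ȳ)²). The second call illustrates the last point above: a perfect but non-linear (quadratic) association gives r = 0. The data are made up.

```python
import math

def pearson_r(x, y):
    """Pearson's linear correlation coefficient for paired lists x and y."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

x = [-2.0, -1.0, 0.0, 1.0, 2.0]
r_linear = pearson_r(x, [2 * xi + 1 for xi in x])  # straight line, positive slope
r_curved = pearson_r(x, [xi ** 2 for xi in x])     # Y depends fully on X, but not linearly
print(f"linear: r = {r_linear:.2f}, quadratic: r = {r_curved:.2f}")
```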

New cards

Standard error of correlation coefficient

Sampling distribution of r is not normally distributed, so SE_r isn’t used in calculating the 95% CI

New cards

Approx. confidence interval of correlation coefficient

Involves converting r with Fisher’s z-transformation (which uses the natural log), calculating the interval on the z scale, and then converting back to the r scale
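
A sketch of the conversion and back conversion, assuming the usual Fisher z-transformation: z = 0.5·ln((1 + r)/(1 - r)), SE_z = sqrt(1/(n - 3)), interval z ± Z_crit·SE_z, then back to the r scale with r = (e^(2z) - 1)/(e^(2z) + 1). The function name and the values r = 0.5, n = 28 are made up.

```python
import math
from statistics import NormalDist

def correlation_ci(r, n, confidence=0.95):
    """Approximate confidence interval for rho via Fisher's z-transformation."""
    z = 0.5 * math.log((1 + r) / (1 - r))   # convert r to the z scale
    se_z = math.sqrt(1 / (n - 3))
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%

    def back(zv):
        # Back conversion from the z scale to the r scale
        return (math.exp(2 * zv) - 1) / (math.exp(2 * zv) + 1)

    return back(z - z_crit * se_z), back(z + z_crit * se_z)

# Hypothetical sample correlation and sample size
ci_low, ci_high = correlation_ci(r=0.5, n=28)
print(f"95% CI for rho: {ci_low:.3f} to {ci_high:.3f}")
```

The resulting interval is not symmetric around r, which reflects the non-normal sampling distribution of r.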

New cards

Correlation assumptions

Random sample from the population
Bivariate normal distribution
- Bell shaped in two dimensions rather than one

New cards

Deviations from bivariate normality

Transform data (both variables same way)
Nonparametric test (Spearman’s rank correlation); skipping

New cards

ANOVA as a linear model

Linear model: Y = μ + A
- Y: response
- μ: grand mean (constant)
- A: treatment
Circadian clock study: SHIFT = CONSTANT + TREATMENT
H₀: treatment means are all the same
- SHIFT = CONSTANT
H_A: treatment means are not all the same
- SHIFT = CONSTANT + TREATMENT
Even if the null hypothesis is true, however, the treatment means will be different due to sampling error
Thus, the full model (with TREATMENT) will be a better “fit” to the data
The F-ratio is used to test whether including the treatment variable in the model results in a significant improvement in the fit of the model to the data
- Compared with the fit of the null model lacking the treatment variable
ANOVA linear model: RESPONSE = CONSTANT + EXPLANATORY
Extending for multiple explanatory variables
- RESPONSE = CONSTANT + EXP1 +EXP2 + EXP1 * EXP2
- Design is called a two-way ANOVA or two-factor ANOVA

New cards

Two-factor outcomes

New cards

Assumptions of two-factor ANOVA linear models

The measurements at every combination of values for the explanatory variables are a random sample from the population of possible measurements
The measurements for every combination of values for the explanatory variables have a normal distribution in the corresponding population
The variance of the response variable is the same for all combinations of the explanatory variables

New cards

Correlation vs regression

Correlation measures the strength and direction of the linear relationship between two numerical variables
Regression is a method that predicts values of one numerical variable from values of another numerical variable
Fits a line through the data
- Used for prediction
- Measures how steeply one variable changes with the other

New cards

Linear regression

The most common type of regression
- Although there are non-linear models (e.g., quadratic, logistic)
Draws a straight line through the data to predict the response variable (Y, vertical axis) from the explanatory variable (X, horizontal axis)
Fitting the “best” line
- You want a line that gives the most accurate predictions of Y from X
- Least-squares regression: line for which the sum of all the squared deviations in Y is the smallest
Formula for the line
- Y = a + bX
  - a is the Y-intercept; b is the slope
  - The slope of a linear regression is the rate of change in Y per unit X
  - Also measures direction of prediction
    - Positive: as X increases Y increases
    - Negative: as X increases Y decreases

New cards

Calculating intercept

Once slope is calculated, getting intercept is straightforward because the least-squares regression line always goes through the point (X̄, Ȳ)
- Plug mean values into line formula: Ȳ = a + bX̄
- Rearrange to solve for intercept: a = Ȳ - bX̄
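
A sketch of the two steps, assuming the standard least-squares slope b = Σ(X - X̄)(Y - Ȳ) / Σ(X - X̄)² (a formula not written out in these cards), followed by the intercept from the mean point. The function name and the data are made up.

```python
def least_squares_line(x, y):
    """Return (intercept a, slope b) of the least-squares line Y = a + bX."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    # Slope: sum of cross-products over sum of squares of X
    b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
    # Intercept: the line passes through the point of means
    a = y_bar - b * x_bar
    return a, b

# Hypothetical data with mean X = 3 and mean Y = 4
x_vals = [1.0, 2.0, 3.0, 4.0, 5.0]
y_vals = [2.0, 4.0, 5.0, 4.0, 5.0]
a, b = least_squares_line(x_vals, y_vals)
print(f"Y = {a:.2f} + {b:.2f} X")
```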

New cards

Samples vs populations

The slope (b) and intercept (a) are estimated from a sample of measurements, hence these are estimates/statistics
The true population slope (β) and intercept (α) are parameters
Regression assumes that there is a population for every value of X, and the mean Y for each of these populations lies on the regression line

New cards

Predicting values

Now that you have the regression line you can predict values of Y for any specified value of X
Predictions are mean Y for all individuals with value X
Designated Ŷ, or “Y-hat”

New cards

How well do data fit line?

The residual of a point is the difference between its measured Y value and the value of Y predicted by the regression line
Residuals measure the scatter of points above and below the least-squares regression line
Can be positive or negative
Variance in residuals (MS_residual) quantifies the spread of the scatter
- Residual mean square
- Analogous to error mean square in ANOVA
Used to quantify the uncertainty of the slope

New cards

Two types of predictions

Predict mean Y for a given X
- E.g., what is the mean age of all male lions whose noses are 60% black?
Predict single Y for a given X
- E.g., how old is that lion over there with a 60% black nose?
Both predictions give the same value of Y, but they differ in precision
- Can predict mean with more certainty than a single value
Confidence bands measure the precision of the predicted mean Y for each value of X.
Prediction intervals measure the precision of the predicted single Y-values for each X

New cards

ANOVA (F) approach

Recall two sources of variation in ANOVA
- Among groups (MS_groups)
- Within groups (MS_error)
In regression framework:
Deviations between the predicted values Ŷ_i and the mean Ȳ
- Analogous to MS_groups
Deviations between each Y_i and its predicted value Ŷ_i
- Analogous to MS_error
Using ANOVA approach will generate the same P-value as the t-test approach
Can be used to measure R^2: the fraction of the variation in Y that is “explained” by X
R^2 = SS_regression/SS_total
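
A sketch of this partition of variation, assuming the standard definitions: SS_regression = Σ(Ŷ_i - Ȳ)², SS_residual = Σ(Y_i - Ŷ_i)², SS_total = Σ(Y_i - Ȳ)², and MS_residual = SS_residual / (n - 2). The function name and the data are hypothetical.

```python
def regression_fit_summary(x, y):
    """Fit Y = a + bX by least squares; return residuals, MS_residual and R²."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
    a = y_bar - b * x_bar
    y_hat = [a + b * xi for xi in x]                     # predicted values
    residuals = [yi - yh for yi, yh in zip(y, y_hat)]    # observed minus predicted
    ss_regression = sum((yh - y_bar) ** 2 for yh in y_hat)
    ss_residual = sum(e ** 2 for e in residuals)
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    return {
        "residuals": residuals,
        "MS_residual": ss_residual / (n - 2),
        "R2": ss_regression / ss_total,
    }

# Hypothetical data
fit = regression_fit_summary([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 4.0, 5.0, 4.0, 5.0])
print(f"R^2 = {fit['R2']:.2f}, MS_residual = {fit['MS_residual']:.2f}")
```

The same residuals are what a residual plot would show against X.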

New cards

Regression toward the mean

Regression toward the mean results when two variables measured on a sample of individuals have a correlation less than one. Individuals that are far from the mean for one of the measurements will, on average, lie closer to the mean for the other measurement
Example figure: cholesterol measurements before and after drug
- Solid line: linear regression
- Dashed line: one-to-one line with slope of 1

New cards

Assumptions of linear regression

At each value of X:
- There is a population of Y-values whose mean lies on the regression line
- The distribution of possible Y-values is normal (with same variance)
- The variance of Y-values is the same at all values of X
- The Y-measurements represent a random sample from the possible Y-values

New cards

Detecting issues

Outliers
If only one (or a low number) then it may be reasonable to report regression with and without outlier
Nonlinearity can be detected by inspecting graphs
Non-normality and unequal variances can be inspected with a residual plot
Residual plot: residual of every data point (Y_i - Ŷ_i) is plotted against X_i
If assumptions of normality and equal variances are met then there should be a roughly symmetric cloud above/below horizontal line at 0