WILD 240 Biostatistics: Final Exam

Last updated 9:22 PM on 5/4/26

161 Terms

New cards

Draw a t-distribution with a mean of 0 and 95% confidence intervals of -2 and 2. Imagine that you conducted five t-tests, and obtained t-statistics from those tests of -3, -1, 0, 1, and 3. Add these t-statistics to your figure, and clearly indicate which of these t-tests would be significant with alpha = 0.05 (i.e., a p-value of less than 0.05)?

The t-statistics of -3 and 3 would be significant because they are outside the bounds of -2 and 2.
The t-statistics of -1, 0, and 1 would not be significant because they fall within the bounds of -2 and 2.

New cards

what is a t-distribution?

the distribution of t-statistics you would expect to get if the null hypothesis were true

New cards

what is a p-value?

the probability of obtaining a t-statistic at least as extreme as the observed t-statistic if the null hypothesis were true

New cards

what is a scalar?

a single number, lowercase & unbolded

ex: n = 1

New cards

what is a vector?

a one-dimensional data structure that holds ordered data of the same type

multiple numbers & bolded

ex: n = [660, 825, 1000, 1150, 1250]

New cards

what is a matrix?

a two-dimensional data structure with rows and columns (think of a vector of vectors)

M is uppercase & bolded

ex: M =

[1 0 0

0 1 0

0 0 1]

New cards

what is an array?

can be 3-dimensional or n-dimensional

ex: M[3, 3, 3] or M =

[1 4 7

2 5 8

3 6 9]

(these are both arrays)

New cards

what does each number represent in M[2, 4, 6]?

2 = first dimension (row)

4 = second dimension (column)

6 = third dimension (page, i.e. “turning the page”)

New cards

what is M[2,2] for M =

[1 4 7

2 5 8

3 6 9]

M[2,2] = 5 because it is in the second row and second column

New cards

analogy for 2D indexing

like (x, y) coordinates, except the order is [row, column]

New cards

analogy for 3D indexing

row, column, page

New cards

what is the value of y[3], if y = [1,6,4,3,5,9,11]?

y[3] = 4, because it is in the third position
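
The indexing cards above count positions from 1 (as R does). As a rough illustration only, here is how the same lookups might be sketched in Python, which counts from 0. The helper name `at` is hypothetical and simply converts 1-based positions to Python's 0-based ones.

```python
# Sketch: the cards use 1-based positions; Python lists are 0-based.
y = [1, 6, 4, 3, 5, 9, 11]

# The matrix from the M[2,2] card, stored as a list of rows.
M = [
    [1, 4, 7],
    [2, 5, 8],
    [3, 6, 9],
]

def at(data, *positions):
    """Hypothetical helper: look up a value using 1-based positions."""
    for p in positions:
        data = data[p - 1]  # shift each 1-based position to 0-based
    return data

third_of_y = at(y, 3)  # third position of the vector
m_2_2 = at(M, 2, 2)    # second row, second column
print(third_of_y, m_2_2)
```

The first position picks the row and the second picks the column, which matches the row, column, page order described in the cards.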

New cards

what is the value of Y[2,2,1], and how many dimensions does Y have?

Y[2,2,1] = 4

Y has three dimensions because it takes three indices (row, column, page)

New cards

what is the median?

the middle of a data set (has to be in order)

ex: y = [1,2,3,4,5,6,7,8,9]

median = 5

New cards

what is the mean?

the average of a data set

(represented by ȳ)

New cards

what does it mean if the mean and median are very close or the same?

the data are likely roughly symmetric, with no strong skew or extreme outliers

New cards

what is the mode?

the most common number in a data set

ex: y = [1,1,1,3,4,4,7]

mode = 1

New cards

what is the mean of the following set of numbers? y = [3,7,16,200,4]

mean = (3 + 7 + 16 + 200 + 4) / 5 = 230 / 5 = 46

New cards

what is the median of the following set of numbers? y = [3,7,16,200,4]

y = [3,4,7,16,200]

median = 7
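
As a quick check of the mean and median cards for this vector, the following sketch uses Python's standard `statistics` module. It is only an illustration; the variable names are not from the course.

```python
import statistics

y = [3, 7, 16, 200, 4]

sorted_y = sorted(y)             # the median needs the values in order
mean_y = statistics.mean(y)      # sum of the values divided by how many there are
median_y = statistics.median(y)  # middle value of the sorted data

print(sorted_y)
print(mean_y, median_y)
```

The single large value (200) pulls the mean far above the median, which is why the two summaries disagree so strongly here.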

New cards

what do the numbers represent in y ~ Normal(400,50)?

400 represents the mean

should be in the middle of the bell curve

50 represents the standard deviation

sets the spacing along the x-axis of the bell curve: each step away from the mean is 50
ex: the axis would be labeled 300, 350, 400, 450, 500

New cards

what are the percentages for bell curves?

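about 68% of values fall within 1 standard deviation of the mean (between -1 and 1)

about 95% fall within 2 standard deviations (between -2 and 2)

about 99.7% fall within 3 standard deviations (between -3 and 3)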
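
These percentages can be checked numerically. The sketch below assumes the Normal(400, 50) example from the earlier card and uses `statistics.NormalDist` from the Python standard library; `share_within` is a hypothetical helper name.

```python
from statistics import NormalDist

dist = NormalDist(mu=400, sigma=50)

def share_within(k):
    """Share of the distribution within k standard deviations of the mean."""
    return dist.cdf(400 + k * 50) - dist.cdf(400 - k * 50)

for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
```

The same shares hold for any normal distribution, because they depend only on the number of standard deviations from the mean.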

New cards

what is an intercept?

what y equals when x=0

New cards

what does an intercept represent?

serves as the baseline or starting point of the regression line on the y-axis

New cards

what is the slope?

the amount the dependent variable will change if the covariate increases by 1 (rise over run)
the relationship between x and y on a graph

New cards

what is the equation for the lower confidence interval (LCI)?

slope - 2(standard error)

New cards

what is the equation for the upper confidence interval (UCI)?

slope + 2(standard error)

New cards

Here is an estimate of a beta parameter (i.e. a slope) with 95% confidence intervals. Is this a “significant” (p < 0.05) effect? Is the effect positive or negative?

This is a significant effect because zero is not included in the interval. It is also positive because the slope is above zero.

slope = 1

UCI = 1.3

LCI = 0.7

New cards

Researchers conducted a radio-telemetry study of three ungulate species in Banff National Park. They then compared environmental covariates at telemetry locations with random locations generated within an estimate of each individual's home range. They observed relationships between elevation and mule deer (B = -0.0025), elk (B = -0.0035), and mountain goat (B = 0.005) winter habitat selection. Which line on the inset figure of the probability of use was mule deer? Which was elk? Which was mountain goat?

A = mountain goat because it is the only positive line

B = mule deer because the slope is more positive than C

C = elk because the slope is more negative than B

New cards

Imagine that you are conducting a camera trap study of bears. Previous work has allowed you to estimate that the expected number of bears you will photograph is 2.6. If y (the outcome) is drawn from a Poisson distribution with a rate of 2.6, what is the approximate probability of observing four bears at your camera trap?

The probability of observing 4 bears with a rate of 2.6 would approximately be 0.15.
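
The value read off the figure can be compared with the Poisson probability mass function, P(y = k) = e^(-rate) × rate^k / k!. A minimal sketch, with `poisson_pmf` as a hypothetical helper name:

```python
import math

def poisson_pmf(k, rate):
    """Probability of observing exactly k events when the expected count is rate."""
    return math.exp(-rate) * rate**k / math.factorial(k)

p_four = poisson_pmf(4, 2.6)
print(round(p_four, 3))
```

The result is a little over 0.14, which agrees with the approximate answer of 0.15 on the card.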

New cards

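what is the most likely number of bears that will be photographed?

2, assuming the same Poisson rate of 2.6 as the previous card (the most probable count is the rate rounded down)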

New cards

Imagine that you have used a Poisson GLM to estimate the relationship between wolf pack size and winter ungulate abundance (right). At what value of winter ungulate abundance would you expect average wolf pack size to equal 6? At what value of winter ungulate abundance would you expect average wolf pack size to be significantly greater than 6?

Average wolf pack size is expected to equal 6 at a winter ungulate abundance of about 3.

Average wolf pack size is expected to be significantly greater than 6 at a winter ungulate abundance of about 5.

New cards

Can you compare model log-likelihoods between or among two or more different datasets? Why or why not?

No, log-likelihoods cannot be compared across different datasets because they are calculated based on the specific data used. They are only meaningful when comparing models fit to the same dataset.

New cards

Imagine that you know the mean of a distribution is 5. You also know that the standard error of the estimate of the mean is 1. What are the approximate 95% confidence intervals? If you simulated 1000 replicate datasets using those parameter values and calculated the mean for each new dataset, what proportion of those datasets would you expect to have a mean within the 95% confidence intervals?

LCI = 5 - 2(1) = 3

UCI = 5 + 2(1) = 7

If 1000 datasets were simulated, about 95% of them would have means within this interval.
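
The answer above can be illustrated with a small simulation. This is only a sketch: it assumes each replicate dataset's mean is drawn directly from a normal distribution with mean 5 and standard deviation equal to the standard error (1), and the seed value is arbitrary.

```python
import random

random.seed(240)  # arbitrary seed so the run is repeatable

mean, se = 5, 1
lci, uci = mean - 2 * se, mean + 2 * se

# Assumption: each replicate dataset's mean is drawn from Normal(mean, se).
sim_means = [random.gauss(mean, se) for _ in range(1000)]
inside = sum(lci <= m <= uci for m in sim_means) / len(sim_means)

print(lci, uci)
print(inside)
```

The proportion will wobble a little from run to run, but it should land close to 0.95.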

New cards

what is the slope and intercept of this line?

y-intercept = 0.5

slope = -0.5 → (-0.5/1)

New cards

need 16

New cards

what does a low standard error signify?

the sample mean is a precise estimate of the true population mean: if the study were repeated, the sample means would cluster tightly together rather than spread widely

New cards

what does a high(er) standard error signify?

sample means are widely spread around the true population mean, indicating lower precision and that the sample is less representative of the population → increases uncertainty

New cards

does a smaller sample size mean more or less uncertainty?

more uncertainty

New cards

does a bigger sample size mean more or less uncertainty?

less uncertainty

New cards

Imagine that you are fairly confident that there are 100 wolf packs in Montana. You have two estimates of mean pack size. The first has a mean of 5 and a standard error of 0.1. The second has a mean of 6 and a standard error of 1. The total number of wolves in packs is a function of the number of packs and the number of wolves in each pack. What impact would the use of either estimate have on your final estimate of the total number of wolves in Montana in terms of the total number itself and associated uncertainty?

The estimate with mean 5 and SE 0.1 would result in a lower total estimate (~500 wolves) with low uncertainty.
The estimate with mean 6 and SE 1 would result in a higher total (~600 wolves) but with much greater uncertainty.
Thus, the second estimate produces a less precise estimate of total population size.
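
To put rough numbers on this, the sketch below treats the 100 packs as known without error (as the card does) and uses the estimate ± 2 × SE rule from the other cards. The function name `total_wolves` is hypothetical.

```python
n_packs = 100  # treated as known without error, as in the card

def total_wolves(mean_pack_size, se_pack_size):
    """Return (total, lower, upper) using the estimate +/- 2 SE rule."""
    total = n_packs * mean_pack_size
    se_total = n_packs * se_pack_size  # multiplying by a constant scales the SE too
    return total, total - 2 * se_total, total + 2 * se_total

first = total_wolves(5, 0.1)
second = total_wolves(6, 1)
print(first)
print(second)
```

Under these assumptions the first estimate gives roughly 480 to 520 wolves, while the second gives roughly 400 to 800, a much wider range.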

New cards

Imagine that you've estimated the intercept (B₀ = 2.09) and slope (B₁ = -1.4) on the log scale from a generalized linear model. In your own words, given the figure to the right (where the solid points are plotted as a function of these two parameters), what does the intercept represent? What does the slope represent?

The intercept represents the expected number of bear visits at rural sites (the reference group), expressed on the log scale.
The slope represents the change in bear visits from rural to highway sites, with the negative value indicating that highway sites have fewer visits.
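
Because the parameters are on the log scale, exponentiating converts them back to expected counts. The sketch below assumes the covariate is coded 0 for rural sites and 1 for highway sites, which is inferred from the answer above rather than stated in the question.

```python
import math

b0 = 2.09  # intercept on the log scale
b1 = -1.4  # slope on the log scale

# Assumption: the covariate is coded 0 for rural sites and 1 for highway sites.
rural_visits = math.exp(b0 + b1 * 0)
highway_visits = math.exp(b0 + b1 * 1)
ratio = math.exp(b1)  # multiplicative change from rural to highway

print(round(rural_visits, 2), round(highway_visits, 2), round(ratio, 2))
```

With these values, the expected count is about 8 visits at rural sites and about 2 at highway sites, so highway sites get roughly a quarter as many visits.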

New cards

This expression, y_i × ln(μ_i), is part of the Poisson log-likelihood function. If μ_i = 1, solve this component of the equation.

ln(1) = 0, so y_i × 0 = 0.
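
For context, the full Poisson log-likelihood adds two more pieces to this component: for each observation it is y_i × ln(mu_i) - mu_i - ln(y_i!), where mu_i is the expected count. A minimal sketch follows; the function name and the example count of 3 are hypothetical.

```python
import math

def poisson_loglik(y, mu):
    """Sum of y_i * ln(mu_i) - mu_i - ln(y_i!) over all observations."""
    return sum(
        yi * math.log(mi) - mi - math.lgamma(yi + 1)  # lgamma(y + 1) equals ln(y!)
        for yi, mi in zip(y, mu)
    )

y_i = 3  # hypothetical observed count
first_term = y_i * math.log(1)  # the component from the card, with an expected count of 1
print(first_term)
```

Whatever the observed count is, this component is zero when the expected count is 1, because ln(1) = 0.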

New cards

Why would the mean and median of a vector (i.e., set or group) of numbers be substantially different? Give a quick real-life example.

outliers often affect the mean more than the median (pulling it up or down depending on the situation)
ex: In an analysis of incomes in the US, billionaires pull the mean well above what a typical person earns. The median would be much lower and would describe a typical income better.
if the data are skewed, the mean can be very different from the median

New cards

what is the general rule for telling if t-statistics are significant or not?

as a rule of thumb (alpha = 0.05), it is significant if the t-statistic is less than -2 or greater than 2
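
Applied to the five t-statistics from the first card (-3, -1, 0, 1, 3), the rule can be sketched as below. The cutoff of 2 is the approximate bound used throughout these cards, not an exact critical value.

```python
t_stats = [-3, -1, 0, 1, 3]
cutoff = 2  # approximate 95% bound used in these cards

significant = [t for t in t_stats if abs(t) > cutoff]
not_significant = [t for t in t_stats if abs(t) <= cutoff]

print(significant)
print(not_significant)
```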

New cards

Here is a plot of a t-distribution (df = 97). Approximately 95% of this distribution is between -1.985 and 1.985, and 99% of the distribution is between -2.627 and 2.627. Would a t-value of -2 result in a significant (p < 0.05) result?

Yes, this would be significant (p < 0.05) because -2 falls outside the 95% range.

New cards

No, a t-value of -1.5 would not be significant because -1.5 falls within the 95% range (-1.985 to 1.985).

New cards

what would the p-value be if t = 0?

p = 1 for a two-tailed test, because every possible t-statistic is at least as extreme as 0

New cards

What is the slope and intercept of this line?

slope = 6/4 = 3/2 = 1.5

y-intercept = 0 (maybe a little more)

x-intercept = not shown, but likely -2

New cards

what is a representative sample?

a subset of a larger population that accurately reflects it → acts as a mirror of the whole population’s characteristics

New cards

Describe a problem that might arise from a non-random (and non-representative) sample of a population.

ex: bear visits at heavy vs. light traffic sites (from the hw)
If this sample had only been taken from heavy traffic areas, it would not have been an accurate representation of the whole population because there are more bears in lighter traffic areas.
non-random samples leave out part of the population, so estimates from them can be biased and inaccurate

New cards

how do you know if a beta parameter (slope) with 95% confidence intervals is significant?

if 0 is in the interval → not significant (ex: -0.2 to 0.4)

if 0 is NOT in the interval → significant (ex: 0.1 to 0.8)

New cards

how do you know if a beta parameter (slope) with 95% confidence intervals is a positive or negative effect?

entire confidence interval is above 0 (e.g., 0.2 to 0.5) → positive effect
entire confidence interval is below 0 (e.g., -0.5 to -0.2) → negative effect
if interval includes both positives and negatives → not significant

New cards

Here is an estimate of a beta parameter (i.e., a slope) with 95% confidence intervals. Is this a “significant” (p < 0.05) effect? Is the effect positive or negative?

This is a significant effect because zero is not included in the interval (0.7 to 1.3).
This is a positive effect because the entire confidence interval is above zero (0.7 to 1.3).

New cards

Here is a probability mass function for the Poisson distribution with a rate of 2. What is the probability (roughly) of obtaining a 7 given a single draw from this distribution?

The probability is extremely low: close to 0% (well under 1%). The most likely results are 1 and 2, which are equally probable at a rate of 2.

New cards

Imagine that the number of bears recorded at a camera site (y) arises from a Poisson distribution with a known rate of 2 (i.e., y ~ Poisson(2)). Do you think it is likely that 8 bears will be photographed at this site? Why or why not?

It’s not likely that 8 bears will be photographed because 8 is far larger than the expected count of 2. According to the graph with a rate of 2, the probability of obtaining an outcome of 8 is close to 0%.

New cards

Here is an estimate of the relationship between pack size and ungulate winter density with 95% confidence intervals. Is this a “significant” (p < 0.05) effect? What value of ungulate winter index would be required for pack size to equal 6?

This is a significant effect because you cannot draw a flat line through the confidence intervals.
For pack size to equal 6, the ungulate winter index would have to be ~2.5.

New cards

You estimate a slope of 1 and uncertainty around the slope (se = 0.1). Is this a significant effect? Why or why not?

LCI = 1 - 2(0.1) = 0.8

UCI = 1 + 2(0.1) = 1.2

Yes, this is a significant positive effect because the 95% confidence interval (0.8 to 1.2) does not include 0.

New cards

You estimate a slope of 0.3 and uncertainty around the slope (se = 0.1). What are the approximate 95% confidence intervals for the estimate of the slope?

LCI = 0.3 - 2(0.1) = 0.1

UCI = 0.3 + 2(0.1) = 0.5

New cards

You estimate a slope of 0.3 and uncertainty around the slope (se = 0.1). Is this a significant effect? Why or why not?

This is a significant effect because the interval (0.1 to 0.5) does not include zero.

New cards

You estimate a slope of 0.3 and uncertainty around the slope (se = 0.1). Is the effect positive or negative? Why or why not?

This is a positive effect because the entire interval (0.1 to 0.5) is above zero.

New cards

how do you tell which log-likelihood model is better?

higher (less negative) log-likelihood = better model

ex: -2 will be a better model than -10

New cards

Imagine that you run two models, each using a different covariate to explain the same response variable. The log-likelihood of Model 1 is -811. The log-likelihood of Model 2 is -814. Which model is more likely? Why?

Model 1 is more likely because it has a higher (less negative) log-likelihood value (-811 vs -814), indicating a better fit to the data.
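
The same comparison can be written in terms of AIC, which is calculated as 2k - 2 × log-likelihood, where k is the number of estimated parameters. The sketch below assumes both models estimate the same number of parameters (k = 2 is a made-up value), so the difference comes entirely from the log-likelihoods.

```python
def aic(log_lik, k):
    """AIC = 2k - 2 * log-likelihood; lower values indicate more support."""
    return 2 * k - 2 * log_lik

k = 2  # assumption: both models estimate the same number of parameters
aic_1 = aic(-811, k)
aic_2 = aic(-814, k)
delta = aic_2 - aic_1  # how much worse Model 2 is than Model 1

print(aic_1, aic_2, delta)
```

A higher log-likelihood gives a lower AIC, so Model 1 comes out ahead on both measures.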

New cards

You estimate a slope of -0.2 with a standard error of 0.15. What are the approximate 95% confidence intervals for the estimate of the slope? Is it positive or negative? Is it significant? Why?

LCI = -0.2 - 2(0.15) = -0.5

UCI = -0.2 + 2(0.15) = 0.1

This is not significant because the interval (-0.5 to 0.1) includes zero.

The estimated effect is negative (slope = -0.2), but because the interval includes zero, the direction of the true effect is uncertain.
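
The three slope cards above (1 with se = 0.1, 0.3 with se = 0.1, and -0.2 with se = 0.15) all follow the same recipe, which can be sketched as a small function. The name `approx_ci` is hypothetical, and the multiplier of 2 is the approximation used in these cards.

```python
def approx_ci(slope, se):
    """Approximate 95% interval (slope +/- 2 SE) and whether it excludes zero."""
    lci, uci = slope - 2 * se, slope + 2 * se
    significant = not (lci <= 0 <= uci)  # significant only if zero is outside the interval
    return round(lci, 3), round(uci, 3), significant

for slope, se in [(1, 0.1), (0.3, 0.1), (-0.2, 0.15)]:
    print(slope, se, approx_ci(slope, se))
```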

New cards

Here is an AICc table containing ΔAICc values for a set of models using bill length and species to explain variation in bird body mass.

Model selection results for models estimating mass as a function of species, bill length, both covariates, and no covariates (i.e., intercept only) for five species of birds captured near Missoula, MT, 2024-2026.

What is the best model?

The best model is the species-only model because it has the lowest AICc (ΔAICc = 0), indicating the most support among the candidate models.

New cards

Here is an AICc table containing ΔAICc values for a set of models using bill length and species to explain variation in bird body mass.

Are the two covariates (bill length and species) “significant”? Why?

Species is definitely significant because it is the top model with the lowest AICc.
Bill length does not show strong evidence of a significant effect because every time it’s added to a model, the AICc worsens by 2 points.
The model with Bill Length alone performs very poorly as well, which can indicate that this covariate is not contributing much to the model.

New cards

Here is an AICc table containing ΔAICc values for a set of models using bill length and species to explain variation in bird body mass.

Can you tell whether they have a positive or negative effect? If so, how did you tell?

No, we can’t tell from this table.
AICc only tells us which model fits best, not the positive/negative effects.
To figure this out, we would need the estimated slopes (beta coefficients), which we could find through summary(lm).

New cards

Here is a figure showing predictions from three models examining the probability that a point is a used point (from a wolf radio collar, not a random point) as a function of elevation (in meters), centered elevation (in meters), and z-standardized elevation (in standard deviations).

1) The intercepts for these three models are 7.23 (elevation), -2.64 (centered elevation), and -2.64 (z-standardized). Why are the intercepts for the centered and z-standardized covariates the same?

Both centered and z-standardized variables are shifted so that the mean = 0
In both models, the intercept represents the predicted value at the mean elevation, resulting in the same intercept estimate.

New cards

2) The slopes for these three models are -0.005 (elevation), -0.005 (centered), and -2.079 (z-standardized). Why are the slopes for the elevation and centered covariates the same?

Centering only shifts the data; it does NOT stretch or compress it.
The relationship (slope) stays identical.
Only the intercept changes.
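
The effect of centering and z-standardizing can be seen with a small least-squares example. This is only a sketch: the data are made up, and it uses an ordinary straight-line fit rather than the model from the figure, but the same shifting and rescaling logic applies.

```python
import statistics

# Hypothetical data, made up only to illustrate the idea.
elev = [1200, 1500, 1800, 2100, 2400]
y = [9.0, 7.4, 6.1, 4.5, 3.0]

def ols(x, y):
    """Return (intercept, slope) of the least-squares line."""
    x_bar, y_bar = statistics.mean(x), statistics.mean(y)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
        (xi - x_bar) ** 2 for xi in x
    )
    return y_bar - slope * x_bar, slope

mean_elev = statistics.mean(elev)
sd_elev = statistics.stdev(elev)
centered = [e - mean_elev for e in elev]              # shift only
z_scored = [(e - mean_elev) / sd_elev for e in elev]  # shift and rescale

raw_fit = ols(elev, y)
centered_fit = ols(centered, y)
z_fit = ols(z_scored, y)
print(raw_fit, centered_fit, z_fit)
```

In this example the raw and centered fits share a slope, the centered and z-standardized fits share an intercept, and the z-standardized slope equals the raw slope multiplied by the standard deviation of elevation.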

New cards

3) Each model has the same log-likelihood and AICc. Why?

They are all using the same data and the same number of parameters, and they make the same predictions. The only thing that changes is the scale of the covariate.

New cards

what are the different kinds of hypotheses?

null [H_0] and alternative [H_a]

New cards

when do we reject the null?

when the p-value is less than 0.05

New cards

what is the equation for degrees of freedom?

df = n₁ + n₂ - 2 (for a two-sample t-test)

New cards

what does H_0 represent in a two-tailed test?

there is no difference between means

New cards

what does H_a represent in a two-tailed test?

there is a difference between means

New cards

what does H_0 represent in a one-tailed test?

the mean of Group 1 is not greater than the mean of Group 2

New cards

what does H_a represent in a one-tailed test?

the mean of Group 1 is greater than the mean of Group 2

New cards

what is the difference between one-tailed and two-tailed tests?

Use a one-tailed test when you only care if a result is higher or lower, but not both.

Use a two-tailed test when you need to detect any significant effect regardless of direction.

New cards

what is a for-loop designed for?

to do something over and over again
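
As a small illustration of repeating a step, the sketch below adds up a list of values one at a time. It is written in Python, and the pack sizes are made-up numbers.

```python
pack_sizes = [4, 6, 5, 7, 3]  # hypothetical counts

total = 0
for size in pack_sizes:  # repeat the same step for every value
    total += size

mean_size = total / len(pack_sizes)
print(total, mean_size)
```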

New cards

why is there lots of grey area in creating groups?

hard to differentiate, can be based on opinions

ex: urban, suburban, semi-rural, rural

New cards

what does the likelihood function do?

tells us how probable the observed data are under given parameter values; maximizing it gives the most likely line through the data

New cards

what does more variance correlate with?

more uncertainty

ex: if variance = 0.1, the data will be tightly clustered around the mean

if variance = 1, the data will be more spread out

New cards

what does y or x represent?

data (something we have already measured)

New cards

what do Greek letters represent?

parameters (something we need to estimate with uncertainty)

New cards

what is the dependent variable called?

the response (y)

New cards

what is the independent variable called?

covariate (x)

New cards

how does sample size affect uncertainty?

certainty increases when sample size increases

New cards

what is a binomial?

has 2 outcomes (like flipping a coin)

New cards

what is a multinomial?

has multiple outcomes (like rolling a die)

New cards

what do link functions do?

they allow us to constrain parameters (boundaries) so we don’t predict things like 150% survival rate
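
Two common link functions show how this works: the log link keeps predictions above 0 (useful for counts), and the logit link keeps them between 0 and 1 (useful for probabilities such as survival). The sketch below applies the inverse of each link to a few made-up values on the linear scale; the function names are hypothetical.

```python
import math

def inv_log(eta):
    """Inverse of the log link: always greater than 0."""
    return math.exp(eta)

def inv_logit(eta):
    """Inverse of the logit link: always between 0 and 1."""
    return 1 / (1 + math.exp(-eta))

linear_predictors = [-5, 0, 1.5, 5]  # made-up values on the link scale
survival = [inv_logit(eta) for eta in linear_predictors]
counts = [inv_log(eta) for eta in linear_predictors]

print([round(p, 3) for p in survival])
print([round(c, 3) for c in counts])
```

No matter how large or small the value on the linear scale gets, the back-transformed survival probability never reaches 0 or 1, so a prediction like 150% survival cannot occur.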

New cards

ln(N)

New cards

ln(1)

New cards

ln(e)

New cards

e^x

New cards

e⁰
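
The log and exponent cards above can be checked with Python's `math` module, where `math.log` is the natural log (ln) and `math.exp(x)` is e^x. The value of N below is made up for illustration.

```python
import math

N = 50  # hypothetical value

ln_one = math.log(1)       # ln(1)
ln_e = math.log(math.e)    # ln(e)
e_zero = math.exp(0)       # e to the power 0
round_trip = math.exp(math.log(N))  # e^x undoes ln, returning N

print(ln_one, ln_e, e_zero, round_trip)
```

The last line shows that ln and e^x are inverses: taking the log of N and then exponentiating gives N back (up to rounding).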

New cards

what are 95% confidence intervals?

a range of values calculated from sample data; if the study were repeated many times, about 95% of the intervals calculated this way would contain the true population parameter

New cards

what does the standard error measure?

how precisely a sample mean estimates the true population mean, indicating how much the mean would vary if you repeated the study with new samples

New cards

what does the null hypothesis state?

there is no difference between groups

New cards

what does the alternate hypothesis state?

there is a difference between groups

New cards

why is representative sampling important?

data will not be accurate if it only represents one aspect of the hypothesis/population


New cards

what would a linear graph look like if the null hypothesis was true?

a flat (horizontal) line with a slope of 0