BIOL 300 Midterm Lecture

flashcard set

Description and Tags

L1-L14, UBC

443 Terms

1
What are the two broad goals of statistics?
(1) Estimate the values of important parameters, (2) Test hypotheses about those parameters
2
Define a population in statistics.
The entire collection of individual units that a researcher is interested in
3
Define a parameter.
A characteristic of a population
4
Define a sample.
A subset of individuals from the population, used to estimate parameters
5
Define a variable.
A characteristic measured on individuals drawn from a population under study
6
Define data.
Measurements of one or more variables made on a collection of individuals
7
What is bias?
The tendency of a measurement process to over- or under-estimate the true population value
8
Volunteers are recruited for a study. Why might estimates from this sample be biased?
Volunteers are likely systematically different from the overall population
9
Define accuracy in statistics.
Accuracy = unbiased; estimates on average are centered on the true parameter
10
Define precision in statistics.
Precision = consistency; estimates give similar answers repeatedly
11
Can a sample be precise but inaccurate? Explain.
Yes — estimates could be tightly clustered but systematically biased away from the true parameter
12
List three properties of a good sample.
(1) Sufficiently large, (2) Equal probability of inclusion, (3) Independent selection
13
If a sample is large but all individuals come from the same neighborhood, which property is violated?
Independence (and possibly representativeness of the population)
14
What is a response variable vs an explanatory variable?
Response = outcome measured; Explanatory = variable used to predict or explain the response
15
In a study on exercise and cholesterol, which is explanatory and which is response?
Explanatory = exercise, Response = cholesterol level
16
What is independent sampling?
The chance of an individual being included in the sample does not depend on who else is sampled, and measuring one individual does not influence the measurement of another
17
Give an example of non-independence in sampling.
Catching one fish disturbs the water, affecting whether others are captured
18
Give another example of non-independence in sampling.
Collecting data from people in the same household; their responses may be similar due to shared environment
19
What is a repeated measures design?
Repeatedly measuring the same individual over time, such as tracking physiological response to a drug
20
What is a population parameter?
A fixed constant that describes the truth about the entire population
21
What is a sample estimate?
A random variable that changes depending on which sample is collected
22
What is absolute frequency?
The raw count of how many individuals fall into a category or bin
23
What is relative frequency?
The proportion or percentage of individuals in a category
24
Define sampling error.
Random difference between a sample estimate and the population parameter, caused by chance
25
Define sampling bias.
Systematic error from how a sample is collected, leading estimates away from the true population value
26
Contrast bias vs error.
Bias = systematic tilt, Error = random noise
27
How does sample size affect sampling error?
Larger samples shrink sampling error but cannot eliminate it completely
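Worked example (a rough simulation sketch, not from the lecture): the population mean and SD below are made-up values, and `mean_abs_sampling_error` is a hypothetical helper. It compares the average sampling error of the mean for a small and a large sample size.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

MU, SIGMA = 170.0, 10.0  # assumed population mean and SD (illustrative only)

def mean_abs_sampling_error(n, reps=200):
    """Average |sample mean - MU| over `reps` random samples of size n."""
    errors = []
    for _ in range(reps):
        sample = [random.gauss(MU, SIGMA) for _ in range(n)]
        errors.append(abs(statistics.fmean(sample) - MU))
    return statistics.fmean(errors)

err_small = mean_abs_sampling_error(10)
err_large = mean_abs_sampling_error(1000)
print(err_small, err_large)  # the second value is smaller, but not zero
```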
28
What are the four reasons estimates do not equal parameters?
Measurement bias, Sampling bias, Measurement error, Sampling error
29
Give an example of measurement bias.
Measuring people’s height with shoes on
30
Give an example of measurement error.
Small random slips when using a tape measure
31
Give an example of sampling bias.
Sampling too many basketball players in a student height study
32
Why are graphs important in statistics?
They reveal patterns in data and prevent misinterpretation from numbers alone
33
What are the two main types of variables?
Categorical (nominal, ordinal) and numerical (discrete, continuous)
34
What graph is best for categorical variables?
Bar chart (most common) or pie chart for proportions
35
What graph is best for numerical variables?
Histogram, CDF, or boxplot
36
What is a histogram?
A plot of numerical data grouped into bins with bars that touch
37
What are quantiles?
Cut points dividing data into equal portions
38
What are quartiles?
Q1 = 25th percentile, Q2 = median, Q3 = 75th percentile
39
What does a CDF show?
The cumulative proportion of data values ≤ a given point
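Worked example (a minimal sketch with made-up data): `ecdf` is a hypothetical helper that returns the cumulative proportion of values at or below a point, which is what one point on a CDF plot represents.

```python
def ecdf(data, x):
    """Proportion of observations less than or equal to x."""
    return sum(1 for value in data if value <= x) / len(data)

values = [1, 2, 2, 3, 5]  # made-up data

print(ecdf(values, 2))  # 3 of the 5 values are <= 2, so 0.6
```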
40
What is a contingency table used for?
Showing associations between two categorical variables
41
What is a mosaic plot?
Graphical representation of a contingency table where rectangle areas correspond to frequencies
42
What graph is best for comparing numerical variables between categories?
Boxplot, violin plot, or multiple histograms
43
What graph is best for association between two numerical variables?
Scatter plot
44
What graph is best for trends over time?
Line graph
45
What are the two pillars of describing numerical data?
Location (central tendency) and width (spread)
46
What are the three common measures of location?
Mean, median, mode
47
What is the sample mean formula?
Ȳ = (ΣYi) / n
48
What is the population mean symbol?
μ
49
Define median.
The middle value when data are ordered; splits data into two equal halves
50
When is the mean preferred over the median?
For symmetric, bell-shaped data without extreme outliers
51
When is the median preferred over the mean?
For skewed data or when outliers are present
52
What is mode?
The most frequent value; a distribution can be unimodal, bimodal, or multimodal
53
How do mean, median, and mode relate in a symmetric distribution?
Mean ≈ Median ≈ Mode
54
How do they relate in a right-skewed distribution?
Mode < Median < Mean
55
How do they relate in a left-skewed distribution?
Mean < Median < Mode
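Worked example (a small sketch with made-up data): one data set with a long right tail and its mirror image with a long left tail, to check the orderings of mean, median, and mode stated on the skewness cards. The orderings are typical of skewed data, not guaranteed for every data set.

```python
import statistics

right_skewed = [1, 2, 2, 3, 4, 5, 20]      # made-up data, long right tail
left_skewed = [-y for y in right_skewed]   # mirror image, long left tail

def summarize(data):
    """Return (mode, median, mean) for a list of numbers."""
    return statistics.mode(data), statistics.median(data), statistics.mean(data)

mode_r, median_r, mean_r = summarize(right_skewed)
mode_l, median_l, mean_l = summarize(left_skewed)

print(mode_r, median_r, mean_r)   # mode < median < mean
print(mean_l, median_l, mode_l)   # mean < median < mode
```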
56
What are the main measures of spread?
Range, variance, standard deviation, interquartile range, coefficient of variation
57
Define range.
Max – Min
58
Define interquartile range (IQR).
Q3 – Q1, the span covered by the middle 50% of the data
59
When is a value often considered an outlier using IQR?
If it lies more than 1.5 × IQR below Q1 or above Q3
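Worked example (a minimal sketch with made-up data): quartiles, IQR, and the 1.5 × IQR rule. The `method="inclusive"` setting is one of several quartile conventions; other conventions, including the one used in the course, may give slightly different cut points.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 100]  # made-up data with one extreme value

# Quartiles under Python's "inclusive" convention (an assumption, see above)
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [y for y in data if y < lower_fence or y > upper_fence]

print(q1, q2, q3, iqr, outliers)
```

Here Q1 = 3, Q3 = 7, and IQR = 4, so the fences sit at -3 and 13 and only the value 100 is flagged.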
60
Define variance.
The average squared deviation from the mean
61
What is the difference between population and sample variance formulas?
Population divides by N, sample divides by n–1
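Worked example (a minimal sketch with made-up data): the same eight numbers treated first as a whole population (divide by N) and then as a sample (divide by n – 1), using the standard library's `pvariance` and `variance`.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data; mean is 5, squared deviations sum to 32

pop_var = statistics.pvariance(data)   # divides by N = 8
samp_var = statistics.variance(data)   # divides by n - 1 = 7

print(pop_var, samp_var)  # the sample variance is slightly larger
```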
62
Define standard deviation.
The square root of variance; spread in the same units as data
63
What does the empirical rule say for a bell-shaped distribution?
~68% within 1 SD, ~95% within 2 SDs
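Worked example (a rough simulation sketch): draws from a normal distribution with assumed mean 0 and SD 1, then the share of draws within 1 and 2 SDs of the mean. `share_within` is a hypothetical helper.

```python
import random

random.seed(2)  # fixed seed so the run is repeatable

MU, SIGMA = 0.0, 1.0  # assumed parameters of a bell-shaped population
draws = [random.gauss(MU, SIGMA) for _ in range(20000)]

def share_within(k):
    """Proportion of draws within k standard deviations of the mean."""
    return sum(1 for y in draws if abs(y - MU) <= k * SIGMA) / len(draws)

within_1 = share_within(1)
within_2 = share_within(2)
print(within_1, within_2)  # roughly 0.68 and 0.95
```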
64
Define coefficient of variation (CV).
SD divided by mean, expressed as a percentage
65
Why is CV useful?
It allows comparison of variability across datasets with different scales
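Worked example (a minimal sketch with made-up heights): the CV of the same measurements in centimetres and in metres. `cv_percent` is a hypothetical helper, and using the sample SD in it is an assumption.

```python
import statistics

def cv_percent(data):
    """Coefficient of variation: sample SD divided by mean, as a percentage."""
    return 100 * statistics.stdev(data) / statistics.mean(data)

heights_cm = [150, 160, 170, 180, 190]       # made-up data
heights_m = [h / 100 for h in heights_cm]    # same data, different units

cv_cm = cv_percent(heights_cm)
cv_m = cv_percent(heights_m)
print(cv_cm, cv_m)  # both about 9.3%
```

The SD changes with the units (about 15.8 cm versus 0.158 m), but the CV does not, which is what makes it comparable across scales.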
66
What happens to the mean if a constant is added to all values?
The mean shifts by that constant
67
What happens to variance if a constant is added?
Variance does not change
68
What happens to the mean if all values are multiplied by c?
The mean is multiplied by c
69
What happens to the variance if all values are multiplied by c?
Variance is multiplied by c²
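Worked example (a minimal sketch with made-up data): checking the four rules for adding a constant and multiplying by a constant. The shift of 10 and the factor c = 3 are arbitrary choices.

```python
import statistics

data = [1, 2, 3, 4, 5]  # made-up data: mean 3, sample variance 2.5
c = 3                   # arbitrary multiplier

shifted = [y + 10 for y in data]  # add a constant to every value
scaled = [c * y for y in data]    # multiply every value by c

print(statistics.mean(shifted), statistics.variance(shifted))  # mean shifts, variance unchanged
print(statistics.mean(scaled), statistics.variance(scaled))    # mean times c, variance times c squared
```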
70
What two factors determine how precise a sample mean is?
Sample size (n) and population spread (σ)
71
How does sample size affect precision of the mean?
Larger n → less variation in sample means
72
How does population spread affect precision of the mean?
Larger σ → more variation in sample means
73
What is the most precise sampling distribution?
Large sample size and small population spread
74
What is the least precise sampling distribution?
Small sample size and large population spread
75
What is the difference between standard deviation and standard error?
SD = spread of raw data; SE = spread of sample means
76
Define standard error of the mean.
The standard deviation of the sampling distribution of the sample mean
77
What does the standard error predict?
The typical size of the sampling error of the estimate
78
Define population distribution, sample mean, and sampling distribution.
Population distribution = distribution of the variable across all individuals in the population; Sample mean = average of one sample; Sampling distribution = distribution of means from repeated samples
79
How does the mean of sample means relate to the population mean?
The mean of sample means equals the population mean (unbiased)
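Worked example (a rough simulation sketch): many samples of size 25 from a normal population with assumed mean 100 and SD 15. The mean of the sample means should land near the population mean, and their SD near σ/√n = 3.

```python
import random
import statistics

random.seed(3)  # fixed seed so the run is repeatable

MU, SIGMA, N = 100.0, 15.0, 25  # assumed population parameters and sample size

sample_means = []
for _ in range(2000):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    sample_means.append(statistics.fmean(sample))

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
print(mean_of_means, sd_of_means)  # close to 100 and 3
```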
80
What is the formula for the estimated SE of the mean?
SE = s / √n
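Worked example (a minimal sketch with made-up data): the estimated standard error of the mean computed directly from one sample.

```python
import math
import statistics

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data

s = statistics.stdev(sample)   # sample standard deviation (n - 1 in the denominator)
n = len(sample)
se = s / math.sqrt(n)          # estimated standard error of the mean

print(s, se)  # about 2.14 and 0.76
```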
81
Define a 95% confidence interval.
A range of plausible values for the true population mean, built so that about 95% of such intervals from repeated samples would contain it
82
Shortcut method for 95% CI?
Sample mean ± 2 SE
83
How does increasing n affect confidence intervals?
Bigger n → narrower CI → more precise estimate
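Worked example (a minimal sketch with made-up summary numbers): the 2 SE shortcut interval for the same sample mean and SD at two sample sizes. `approx_ci95` is a hypothetical helper.

```python
import math

def approx_ci95(mean, s, n):
    """Approximate 95% CI using the shortcut: mean plus or minus 2 SE."""
    se = s / math.sqrt(n)
    return mean - 2 * se, mean + 2 * se

# Same sample mean (50) and SD (10), two sample sizes
lo_25, hi_25 = approx_ci95(50.0, 10.0, 25)      # SE = 2
lo_100, hi_100 = approx_ci95(50.0, 10.0, 100)   # SE = 1

print((lo_25, hi_25), (lo_100, hi_100))  # (46.0, 54.0) (48.0, 52.0)
```

Quadrupling n halves the SE, so the interval is half as wide.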
84
Why can’t you say “there is a 95% probability the mean is in this CI”?
µ is fixed; probability doesn’t apply. Instead, 95% of CIs from repeated samples will capture µ
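Worked example (a rough simulation sketch): repeated samples from a normal population with assumed mean 50 and SD 10, each producing a 2 SE shortcut interval. The fraction of intervals that capture the fixed mean is the long-run coverage.

```python
import math
import random
import statistics

random.seed(4)  # fixed seed so the run is repeatable

MU, SIGMA, N, REPS = 50.0, 10.0, 30, 1000  # assumed values for illustration

hits = 0
for _ in range(REPS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    mean = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(N)
    # MU never changes; the interval is what varies from sample to sample
    if mean - 2 * se <= MU <= mean + 2 * se:
        hits += 1

coverage = hits / REPS
print(coverage)  # close to 0.95
```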
85
Define pseudoreplication.
Treating non-independent measurements as if they were independent
86
Why is pseudoreplication a problem?
It underestimates SE, leading to misleading CIs and hypothesis tests
87
Example of pseudoreplication.
Measuring 10 cells from each of 5 mice and treating them as n=50 instead of n=5
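Worked example (a minimal sketch with fabricated illustration values, not real measurements): 10 cells from each of 5 mice, where cells within a mouse are similar but mice differ. The SE is computed once treating all 50 cells as independent and once using one mean per mouse.

```python
import math
import statistics

# Fabricated values for illustration only
mouse_baselines = [10.0, 12.0, 14.0, 16.0, 18.0]
cell_offsets = [-0.45, -0.35, -0.25, -0.15, -0.05, 0.05, 0.15, 0.25, 0.35, 0.45]

cells_by_mouse = [[b + d for d in cell_offsets] for b in mouse_baselines]
all_cells = [y for cells in cells_by_mouse for y in cells]

# Pseudoreplicated: pretends the 50 cells are independent (n = 50)
se_naive = statistics.stdev(all_cells) / math.sqrt(len(all_cells))

# One value per independent unit: the mouse mean (n = 5)
mouse_means = [statistics.fmean(cells) for cells in cells_by_mouse]
se_by_mouse = statistics.stdev(mouse_means) / math.sqrt(len(mouse_means))

print(se_naive, se_by_mouse)  # about 0.41 versus about 1.41
```

The naive SE is far too small because it counts cells, not mice, as the independent units.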
88
Define a proportion.
Number with attribute / total (n), range 0–1
89
Define probability.
The long-run relative frequency of an event if the random trial were repeated infinitely many times
90
What is the difference between probability and proportion?
Probability = true population value; Proportion = estimate from a sample
91
Define mutually exclusive events.
Events that cannot both occur; P(A and B) = 0
92
Define independent events.
Events where occurrence of one gives no information about the other
93
What is a probability distribution?
Describes the relative frequency of all possible outcomes; probabilities sum to 1
94
Formula for complement rule.
P(Not A) = 1 – P(A)
95
How are discrete and continuous probability distributions different?
Discrete = probability at individual outcomes; Continuous = probability only defined over ranges (single values have probability 0)
96
In probability, what does “or” mean?
Inclusive: A, B, or both
97
Formula for general addition rule.
P(A or B) = P(A) + P(B) – P(A and B)
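Worked example (a minimal sketch): one roll of a fair six-sided die, with A = "even" and B = "greater than 3" chosen as illustrative events. The rule is checked against a direct count of the outcomes in A or B.

```python
from fractions import Fraction

outcomes = {1, 2, 3, 4, 5, 6}                 # one roll of a fair die
A = {y for y in outcomes if y % 2 == 0}       # even: {2, 4, 6}
B = {y for y in outcomes if y > 3}            # greater than 3: {4, 5, 6}

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(outcomes))

p_or_rule = prob(A) + prob(B) - prob(A & B)   # general addition rule
p_or_direct = prob(A | B)                     # count outcomes in A or B directly

print(p_or_rule, p_or_direct)  # 2/3 2/3
```

Subtracting P(A and B) removes the outcomes 4 and 6, which would otherwise be counted twice.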
98
Addition rule if A and B are mutually exclusive.
P(A or B) = P(A) + P(B)
99
Formula for multiplication rule if A and B are independent.
P(A and B) = P(A) × P(B)
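Worked example (a minimal sketch): a fair coin flip and a fair die roll, assumed independent, with A = "heads" and B = "six" as illustrative events. The product rule is checked by enumerating all 12 equally likely outcomes.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product("HT", range(1, 7)))  # 12 equally likely (coin, die) pairs

def prob(event):
    """Probability that `event` (a test on one outcome) is true."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

p_heads = prob(lambda o: o[0] == "H")
p_six = prob(lambda o: o[1] == 6)
p_both = prob(lambda o: o[0] == "H" and o[1] == 6)

print(p_heads, p_six, p_both)  # 1/2 1/6 1/12
```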
100
What tool can visualize conditional probability calculations?
Probability trees