What are the three main questions statistical inferences address?
1) Is a given result valid? 2) How significant is the result? 3) How close is the observed result to an expected result?
What is a statistical population?
A large body of data containing all the data points of interest
What is a statistical sample?
A smaller collection of data points drawn from the much larger population
What do we use samples for?
To infer or estimate things about the population
What is relative frequency?
The proportion of a particular observation relative to the total number of observations
What formula represents relative frequency?
Relative frequency = (frequency of specific observation) / (total number of observations)
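The relative frequency formula can be sketched in a few lines of Python. The genotype labels and counts below are made up for illustration and are not from the card set.

```python
from collections import Counter

# Hypothetical sample of observed genotypes (made up for illustration).
observations = ["AA", "Aa", "Aa", "aa", "AA", "Aa", "Aa", "aa", "Aa", "AA"]

counts = Counter(observations)   # absolute frequency of each observation
total = len(observations)        # total number of observations

# Relative frequency = frequency of specific observation / total number of observations
relative_frequency = {genotype: n / total for genotype, n in counts.items()}

print(relative_frequency)
```

The relative frequencies of all observations in a sample always sum to 1.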
How do scientists make inferences about populations?
Scientists make inferences about populations based on information obtained from smaller samples
What can be calculated from samples?
Event frequencies (and relative frequencies) can be calculated from samples
How can frequency distributions of samples be used?
Frequency distributions of samples can be used to predict the frequency distributions of populations
What happens to frequency distributions as sample size increases?
As the sample size becomes larger, the frequency distribution is more likely to approximate a bell-shaped curve
What type of distribution approximates many frequency distributions in biology?
The normal distribution
Where is the normal distribution centered?
Around the mean/average value (x̄)
What percentage of data is on each side of the mean in a normal distribution?
50% right, 50% left
What does standard deviation measure?
A measure of the typical deviation of data points from the sample mean, denoted s
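A minimal sketch of computing x̄ and s with Python's standard library. It assumes s is the usual sample standard deviation (n - 1 in the denominator), which the cards do not state explicitly; the measurements are hypothetical.

```python
import statistics

# Hypothetical measurements (values are made up for illustration).
sample = [12.0, 15.0, 11.0, 14.0, 13.0]

x_bar = statistics.mean(sample)   # sample mean
s = statistics.stdev(sample)      # sample standard deviation (n - 1 denominator)

print(f"mean = {x_bar}, s = {s:.3f}")
```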
What percentage of values lie within x̄ ± s in a normal distribution?
~68% of all values
What percentage of values lie within x̄ ± 2s in a normal distribution?
~95% of all values
What percentage of values lie within x̄ ± 3s in a normal distribution?
Nearly all of the measurements
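The three percentages above can be checked numerically. This sketch uses `statistics.NormalDist` from the Python standard library on a standard normal curve (mean 0, standard deviation 1); the helper name `fraction_within` is made up for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

def fraction_within(k):
    """Fraction of a normal distribution lying within mean ± k standard deviations."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within ±{k}s: {fraction_within(k):.4f}")
```

The printed fractions are roughly 0.68, 0.95, and 0.997, matching the ~68%, ~95%, and "nearly all" figures on the cards.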
How can area under the normal distribution curve be interpreted?
Area under the curve can be treated as probabilities
What is the total area under a normal distribution curve?
p = 1.0 (total area equals 1)
What is the normal distribution?
A type of probability distribution that applies to many events and phenomena in nature
What can standard deviation be used to analyze?
The numerical spread of a dataset, and to set specific boundary values underneath a curve
How can area underneath relative frequency distribution curves be interpreted?
As probabilities
What is the basic difference between statistical sample and statistical population?
Statistical population is a large body of data containing all data points of interest; statistical sample is a smaller portion of the total statistical population
What are frequency histograms?
Plots of data points that have been placed into bins of equal size; histograms can show either absolute or relative frequencies
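As a rough sketch of how equal-sized bins produce absolute and relative frequencies, the following counts hypothetical measurements into three bins. The data, bin start, and bin width are all assumptions chosen for illustration.

```python
# Hypothetical measurements; the values and the bin layout are made up.
data = [2.1, 3.4, 3.9, 4.2, 4.8, 5.0, 5.5, 6.1, 6.7, 7.9]
bin_start, bin_width, n_bins = 2.0, 2.0, 3   # bins: [2, 4), [4, 6), [6, 8)

absolute = [0] * n_bins
for value in data:
    # Index of the equal-width bin this value falls into.
    # Assumes every value lies inside [bin_start, bin_start + n_bins * bin_width).
    index = int((value - bin_start) // bin_width)
    absolute[index] += 1

# Relative frequency of each bin = bin count / total number of observations.
relative = [count / len(data) for count in absolute]

print(absolute)
print(relative)
```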
What do statistical tests involve the use of?
Probability distributions (one example is a normal distribution)
What type of distribution do many natural phenomena follow?
Normal distribution: a mound-shaped distribution for continuous variables
What does the interval x̄ ± s contain?
~68% of measurements
What does the interval x̄ ± 2s contain?
~95% of measurements
What does the interval x̄ ± 3s contain?
Nearly all of the measurements
What does area under the normal distribution curve represent?
Probability, where the total area = 1
What is a null hypothesis (H₀)?
A stated hypothesis which says there is no difference or no effect
What is an alternative hypothesis (Hₐ)?
A hypothesis which says that there is a difference
What will a statistical test do with the null hypothesis?
Either reject, or fail to reject, the stated null hypothesis
What is a goodness of fit statistical test?
A particular type of test that allows us to determine whether or not an observed result is consistent with an expected result
What is a specific example of a goodness of fit test?
Chi-square goodness of fit test
What is the chi-square goodness of fit test used for?
To detect the goodness of fit for data that fit into several predefined 'categories' such as different genotypes
What is the chi-square distribution also known as?
A probability distribution function
What determines the shape of the chi-square distribution function?
The number of categories 'k' being considered, through the degrees of freedom (k - 1)
How should area under any chi-square curve be treated?
As a probability, where total area equals a probability of 1
What p-value is typically used to reject the null hypothesis?
p = 0.05
What must the chi-square statistic be to reject the null hypothesis?
At least as high as the critical value at p = 0.05
What determines the number of degrees of freedom?
Number of categories 'k'
How are degrees of freedom calculated?
k - 1 (number of categories minus 1)
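The cards do not spell out the test statistic itself. The standard chi-square goodness of fit statistic is the sum of (O - E)² / E over all categories, and a minimal sketch of it together with the k - 1 rule might look like this. The counts and the 3:1 expectation are hypothetical.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts for k = 2 categories (made up for illustration).
observed = [290, 110]
expected = [300, 100]   # e.g. a 3:1 expectation applied to 400 organisms

chi_square = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1   # k - 1

print(f"chi-square = {chi_square:.3f}, df = {degrees_of_freedom}")
```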
How many degrees of freedom are used for Hardy-Weinberg questions?
Always use 1 degree of freedom (row one of the chi-square critical value table)
For two alleles and 3 different genotypes, how many degrees of freedom are used?
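One degree of freedom (3 genotype categories minus 1, minus 1 more because an allele frequency is estimated from the observed data)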
What alpha value corresponds to the chi-square threshold value?
0.05 or less
When can you reject the null hypothesis in chi-square tests?
If the calculated chi-square value is larger than the threshold (critical) value
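The decision rule can be sketched for the 1 degree of freedom case. The threshold 3.841 is the standard table value for 1 degree of freedom at alpha = 0.05. The p-value helper relies on the fact that a chi-square variable with 1 degree of freedom is a squared standard normal, so it is only valid for that case. The function names and the example statistics are made up for illustration.

```python
import math

CRITICAL_VALUE_DF1 = 3.841   # standard table value for 1 degree of freedom at alpha = 0.05

def p_value_df1(chi_square):
    """Upper-tail area under the chi-square curve with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2.0))

def reject_null(chi_square, critical_value=CRITICAL_VALUE_DF1):
    """True when the calculated statistic exceeds the threshold value."""
    return chi_square > critical_value

for value in (1.33, 5.0):   # hypothetical calculated chi-square values
    print(value, round(p_value_df1(value), 4), reject_null(value))
```

A statistic above the threshold corresponds to a tail area (p-value) below 0.05, which is why the two ways of stating the rule agree.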
In chi-square goodness of fit tests for Hardy-Weinberg questions, what do the observed (O) and expected (E) values refer to?
The numbers of organisms, NOT the genotype frequencies
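A worked sketch of a Hardy-Weinberg goodness of fit calculation, using counts of organisms rather than frequencies. The genotype counts are hypothetical, and the threshold 3.841 is the standard table value for 1 degree of freedom at alpha = 0.05.

```python
# Hypothetical observed numbers of organisms per genotype (made up for illustration).
observed = {"AA": 50, "Aa": 30, "aa": 20}
n = sum(observed.values())                       # total number of organisms

# Estimate allele frequencies from the observed counts.
p = (2 * observed["AA"] + observed["Aa"]) / (2 * n)   # frequency of allele A
q = 1 - p                                             # frequency of allele a

# Expected NUMBERS of organisms under Hardy-Weinberg: p^2 N, 2pq N, q^2 N.
expected = {"AA": p * p * n, "Aa": 2 * p * q * n, "aa": q * q * n}

# Chi-square statistic: sum of (O - E)^2 / E over the genotype categories.
chi_square = sum((observed[g] - expected[g]) ** 2 / expected[g] for g in observed)

CRITICAL_VALUE_DF1 = 3.841   # 1 degree of freedom, alpha = 0.05
print(f"chi-square = {chi_square:.2f}")
print("reject H0" if chi_square > CRITICAL_VALUE_DF1 else "fail to reject H0")
```

With these made-up counts the statistic is well above the threshold, so the null hypothesis that the population is in Hardy-Weinberg proportions would be rejected. Plugging genotype frequencies in place of counts for O and E would give a statistic that is too small by a factor of N.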
What shape does the normal distribution have?
Bell-shaped or mound-shaped curve
What makes a sample representative for statistical inference?
Being a smaller collection drawn from the larger population that can be used to estimate characteristics of that population
What is the relationship between sample size and normal distribution approximation?
Larger sample sizes are more likely to approximate a bell-shaped curve
Why is standard deviation important in normal distributions?
It allows you to determine what percentage of data falls within specific ranges from the mean
What does it mean when we say area under curves represents probability?
The proportion of area under a curve segment corresponds to the probability of values falling in that range
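To make the area-as-probability idea concrete, this sketch computes the area under a normal curve between two values using `statistics.NormalDist`. The mean, standard deviation, and range are hypothetical.

```python
from statistics import NormalDist

# Hypothetical normally distributed trait: mean 170, standard deviation 10.
trait = NormalDist(mu=170.0, sigma=10.0)

# Area under the curve between 175 and 190 = probability a value falls in that range.
probability = trait.cdf(190.0) - trait.cdf(175.0)

print(f"P(175 <= X <= 190) = {probability:.4f}")
```

The cumulative distribution function gives the area to the left of a value, so subtracting two of them leaves the area of the segment between the values.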