Define and give an example of a Null Hypothesis
The null hypothesis reflects the “status quo” or “nothing of interest” and is usually expressed in its simplest form. It states that there is no effect or no difference between groups, such as "a new drug has no effect on blood pressure" or "the average salary of male and female factory workers is the same".
What is the difference between a Type I and a Type II error?
Type I: rejecting the null hypothesis when it is actually true (a false positive)
Type II: failing to reject the null hypothesis when it is actually false (a false negative)
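The two error types can be laid out as a small decision table. This is a minimal sketch; the function name `classify_decision` is hypothetical, and in practice the truth of the null hypothesis is not known.

```python
def classify_decision(null_is_true: bool, rejected_null: bool) -> str:
    """Label the outcome of a test, given the (normally unknown) truth."""
    if null_is_true and rejected_null:
        return "Type I error"      # false positive
    if not null_is_true and not rejected_null:
        return "Type II error"     # false negative
    return "correct decision"


# Print all four combinations of truth and decision.
for truth in (True, False):
    for rejected in (True, False):
        label = classify_decision(truth, rejected)
        print(f"null true={truth}, rejected={rejected}: {label}")
```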
Define Population and Sample
Population: the entire group of individuals that could be sampled
Sample: the subset of that population that is actually observed
List the four ways to sample and explain one of them
Simple random sampling: individuals are drawn at random from the whole group, so each has an equal chance of being selected
Cluster sampling
Systematic sampling
Stratified random sampling
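The four sampling schemes could be sketched as below. The notes only explain simple random sampling; the other three follow their usual textbook definitions (take every k-th individual; sample within each subgroup; pick whole groups at random), and all function names and seeds here are illustrative assumptions.

```python
import random


def simple_random_sample(population, n, seed=0):
    """Draw n individuals at random from the whole population."""
    return random.Random(seed).sample(population, n)


def systematic_sample(population, step, start=0):
    """Take every step-th individual, beginning at index start."""
    return population[start::step]


def stratified_sample(strata, n_per_stratum, seed=0):
    """Draw a random sample from every stratum (subgroup)."""
    rng = random.Random(seed)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}


def cluster_sample(clusters, n_clusters, seed=0):
    """Pick whole clusters at random and keep everyone in them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}


population = list(range(20))
print(simple_random_sample(population, 5))
print(systematic_sample(population, 5))
```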
Define Variable
A characteristic whose value can change from one member of the population to another
Be able to identify independent and dependent variables from a scenario. What is the difference between explanatory (independent) and response (dependent) variables?
Explanatory (independent) variables: variables that explain or predict the variation in the response variable
Response (dependent, or target) variables: variables that are the main focus of the story, e.g. number of trees, plant height, water quality, number of eggs or fledglings
What are the five metrics that make up the five-number summary?
Minimum value, first quartile, second quartile (the median), third quartile, maximum value
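A five-number summary can be computed with Python's standard `statistics` module. This is a sketch; quartile conventions differ between textbooks and software, and the `"inclusive"` method used here is just one of them.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    data = sorted(values)
    # quantiles(..., n=4) returns the three cut points Q1, Q2, Q3.
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return (data[0], q1, q2, q3, data[-1])


print(five_number_summary([1, 2, 3, 4, 5, 6, 7, 8, 9]))
```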
What is the difference between Sample variance and Sample Standard Deviation?
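Sample variance is the average squared deviation of the observations from the sample mean (dividing by n - 1), so it is in squared units. Sample standard deviation is the square root of the variance, so it measures spread around the mean in the same units as the data.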
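The relationship between the two can be checked numerically. A minimal sketch with made-up data, comparing a hand-written formula against the standard library:

```python
import math
import statistics


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    m = sum(values) / len(values)
    return sum((x - m) ** 2 for x in values) / (len(values) - 1)


def sample_std_dev(values):
    """Square root of the sample variance (same units as the data)."""
    return math.sqrt(sample_variance(values))


data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values only
print(sample_variance(data), statistics.variance(data))
print(sample_std_dev(data), statistics.stdev(data))
```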
Define mean, median and mode
Mean: the average (sum of the values divided by the number of values)
Median: the middle value when the data are sorted
Mode: the value that occurs most frequently
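All three are available in Python's standard `statistics` module. The data below are made up; note how the single large value pulls the mean above the median.

```python
import statistics

data = [1, 2, 2, 3, 10]  # illustrative values only

mean_value = statistics.mean(data)      # sum divided by count
median_value = statistics.median(data)  # middle of the sorted values
mode_value = statistics.mode(data)      # most frequent value

print(mean_value, median_value, mode_value)
```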
List and describe the two assumptions for performing most statistical procedures
The data are normally distributed (normality), and the variances of the groups are homogeneous (homogeneity of variance)
Name two ways you can do a visual check for normality
Histograms, Q-Q plots
Name the test to formally check for heteroscedasticity
Levene’s test
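As a rough illustration of what Levene's test computes, the sketch below calculates the W statistic in its original form (absolute deviations from each group's mean) using only the standard library. The function name is hypothetical, and no p-value is produced; in practice a statistics package would compare W against an F distribution with k - 1 and N - k degrees of freedom.

```python
from statistics import mean


def levene_statistic(groups):
    """Levene's W: a one-way ANOVA F statistic on absolute deviations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviation of every value from its own group mean.
    z = []
    for g in groups:
        g_mean = mean(g)
        z.append([abs(x - g_mean) for x in g])
    z_group_means = [mean(zi) for zi in z]
    z_grand_mean = sum(sum(zi) for zi in z) / n_total
    between = sum(len(zi) * (zm - z_grand_mean) ** 2
                  for zi, zm in zip(z, z_group_means))
    within = sum((v - zm) ** 2
                 for zi, zm in zip(z, z_group_means) for v in zi)
    return (n_total - k) / (k - 1) * between / within


# Two small made-up groups; the second is more spread out.
print(levene_statistic([[1, 2, 3], [0, 2, 4]]))
```

A larger W suggests the group variances differ more than chance alone would explain.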
What is the difference between the normal and log-normal distribution?
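A log-normal variable can only take positive real numbers (no negatives or zero) and is right-skewed, with most values bunched near zero and a long upper tail; its logarithm is normally distributed. A normal variable is symmetric and can take any real value.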
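The link between the two distributions can be sketched by exponentiating normal draws: by definition, a variable is log-normal when its logarithm is normal. The seed and sample size below are arbitrary choices for illustration.

```python
import math
import random

rng = random.Random(42)

# Draws from a standard normal distribution: can be negative or positive.
normal_draws = [rng.gauss(0.0, 1.0) for _ in range(1000)]

# Exponentiating each draw gives a log-normal sample: always positive.
lognormal_draws = [math.exp(z) for z in normal_draws]

print(min(normal_draws), min(lognormal_draws))
```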
What is the difference between a one-sided and a two-sided hypothesis?
A one-sided hypothesis predicts an effect in a single direction (e.g., greater than or less than), while a two-sided hypothesis checks for a difference in either direction (e.g., not equal to)
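How do we interpret p-values as they relate to the null hypothesis?
The p-value is the probability of seeing data at least as extreme as those observed if the null hypothesis were true. If it is less than the chosen significance level (commonly 0.05), reject the null hypothesis. If it is greater, fail to reject the null hypothesis.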
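One way to see where a p-value comes from is an exact permutation test for a difference in means. This technique is not part of the notes; it is swapped in here because it needs no distribution tables. The function names and the data are illustrative assumptions.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    total = sum(pooled)
    extreme = 0
    count = 0
    # Try every way of relabelling n_a of the pooled values as "group A".
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:  # at least as extreme as observed
            extreme += 1
        count += 1
    return extreme / count


def decide(p_value, alpha=0.05):
    """Apply the usual decision rule at significance level alpha."""
    if p_value < alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"


p = permutation_p_value([1, 2, 3, 4, 5], [11, 12, 13, 14, 15])
print(p, decide(p))
```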
Define Degrees of freedom and the impact they have on an analysis
Degrees of freedom (df) are the number of independent values in a calculation that are free to vary. They are crucial in data analysis because they influence the shape of probability distributions, like the t-distribution, which in turn affects the critical values used in hypothesis testing and the accuracy of statistical conclusions
Describe Residuals and the impact they have on an analysis
A residual is the difference between an observed value and the value predicted by the model. Smaller residuals indicate a better fit, so the goal is to keep them small.
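Residuals can be computed by fitting a least-squares line and subtracting the predictions from the observations. A minimal sketch with made-up data; the helper name `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]  # illustrative values only

intercept, slope = fit_line(xs, ys)
predicted = [intercept + slope * x for x in xs]
# Residual = observed value minus predicted value.
residuals = [y - y_hat for y, y_hat in zip(ys, predicted)]
print(residuals)
```

With an intercept in the model, least-squares residuals sum to zero, so they are usually judged by their squared size rather than their total.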
Identify whether a line graph shows a positive, negative, or no correlation.
A positive correlation goes up from left to right, a negative correlation goes down from left to right, and no correlation appears as a flat, horizontal line.
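The direction of a relationship can also be read from the sign of Pearson's correlation coefficient r. In this sketch the cut-off of 0.1 for calling something "no correlation" is an arbitrary assumption for illustration, not a standard rule.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def describe_correlation(r, threshold=0.1):
    """Label r as positive, negative, or no correlation (threshold assumed)."""
    if r > threshold:
        return "positive"
    if r < -threshold:
        return "negative"
    return "no correlation"


print(describe_correlation(pearson_r([1, 2, 3, 4], [2, 4, 6, 8])))
print(describe_correlation(pearson_r([1, 2, 3, 4], [8, 6, 4, 2])))
print(describe_correlation(pearson_r([1, 2, 3, 4], [1, -1, -1, 1])))
```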
What’s the difference between a two-sample t-test and a paired t-test?
A two-sample t-test compares two independent, unrelated groups (e.g. burned or not burned). A paired t-test is used when each observation in one group is matched to an observation in the other (e.g. the same subjects measured twice).
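The difference shows up in how the t statistic is built. The sketch below computes both by hand on the same made-up numbers; the two-sample version uses Welch's form (no equal-variance assumption), which is a choice made here and not something the notes specify.

```python
import math
from statistics import mean, stdev, variance


def welch_t(a, b):
    """Two-sample (Welch) t statistic: groups treated as independent."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(b) - mean(a)) / se


def paired_t(before, after):
    """Paired t statistic: works on the within-pair differences."""
    diffs = [y - x for x, y in zip(before, after)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


before = [10, 12, 14, 16, 18]    # illustrative values only
after = [11, 13, 15, 17, 19.5]

print(welch_t(before, after))
print(paired_t(before, after))
```

Because pairing removes the variation between subjects, the paired statistic is much larger here even though the data are identical.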
What does it mean when the 95% CIs for group means overlap on a graph?
It suggests no significant difference between the groups, although overlapping intervals alone do not guarantee it.
Define a model
Some representation of the natural world that ideally represents some hypothesis about how our system works.
Describe Occam’s Razor as it relates to models
“Entities should not be multiplied without necessity.” For models, this means preferring the simplest model that adequately explains the data.
Give at least one reason why you would use either AICc or QAIC
AICc is good for small sample sizes; QAIC is good for overdispersed data
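The standard formulas make the two adjustments concrete. In this sketch `log_lik` is the model's maximised log-likelihood, `k` the number of estimated parameters, `n` the sample size, and `c_hat` the estimated overdispersion factor; all numeric inputs are made up, and some references also count `c_hat` as an extra parameter in `k`.

```python
def aic(log_lik, k):
    """AIC = 2k - 2 ln(L)."""
    return 2 * k - 2 * log_lik


def aicc(log_lik, k, n):
    """AIC plus a small-sample correction; requires n > k + 1."""
    return aic(log_lik, k) + (2 * k * (k + 1)) / (n - k - 1)


def qaic(log_lik, k, c_hat):
    """Quasi-AIC: the log-likelihood term is scaled by overdispersion c_hat."""
    return -2 * log_lik / c_hat + 2 * k


print(aic(-100.0, 3))
print(aicc(-100.0, 3, 20))
print(qaic(-100.0, 3, 2.0))
```

The AICc correction shrinks towards zero as n grows, so AICc and AIC agree for large samples.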
Identify the type of response and predictor variables for regression analysis
In regression analysis, the response variable is the outcome you want to predict (also called the dependent or y-variable), while the predictor variables are the factors you use to make the prediction (also called independent or x-variables). The response variable is typically a continuous measurement, but the predictor variables can be continuous or categorical.
Describe the relationship between AIC and information loss
AIC estimates the relative amount of information lost by a given model
Information loss: the lower the loss, the higher the quality of the model
When comparing models with AIC, how do you know which model is the best model?
The model with the lowest AIC
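Model selection by AIC amounts to ranking candidates and reporting each one's difference from the best (often called delta AIC). The model names and AIC values below are invented purely for illustration.

```python
def rank_by_aic(aic_by_model):
    """Return (model, AIC, delta AIC) tuples sorted from best to worst."""
    best = min(aic_by_model.values())
    ranked = sorted(aic_by_model.items(), key=lambda item: item[1])
    return [(name, value, value - best) for name, value in ranked]


# Hypothetical candidate models and their AIC scores.
candidates = {"null": 210.0, "rainfall": 206.0, "rainfall + fire": 207.5}

ranking = rank_by_aic(candidates)
for name, value, delta in ranking:
    print(f"{name}: AIC={value}, delta={delta}")
```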
What are the type of response and predictor variables for Analysis of Variance (ANOVA)?
Response: continuous; predictors: categorical
What’s the difference between a one-way and a two-way ANOVA?
A one-way ANOVA has one factor; a two-way ANOVA has two factors
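For the one-factor case, the F statistic compares variation between group means with variation within groups. A minimal sketch on made-up data; it returns only F, since a p-value would need an F distribution from a statistics package.

```python
def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA over a list of groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Three levels of a single factor, illustrative values only.
print(one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]]))
```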
What is a post hoc test?
A statistical test used after a significant result from a one-way ANOVA (or its nonparametric equivalent) to identify which specific groups differ significantly from one another