When should a Bar Plot be used?
- The data is normally distributed
- Comparing means
When should a Means Plot be used?
- Data is ranked or quantitative
- Comparing means
When should a Box Plot be used?
- Data is skewed or has outliers
- Comparing medians
When should a Scatter Plot be used?
- Relationship between quantitative variables
- Checking assumptions for linear models
Frequency Distribution
A list of values in a dataset and how often they occur
- Grouped into bins for a histogram
- Smooth density curve
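A minimal sketch of a frequency distribution and its binned (histogram) form. The data values, the bin width, and the helper name `bin_counts` are made up for illustration.

```python
from collections import Counter

# Hypothetical dataset, for illustration only.
values = [2, 3, 3, 5, 5, 5, 8]

# Frequency distribution: each value mapped to how often it occurs.
freq = Counter(values)

def bin_counts(data, width):
    """Count values per bin [start, start + width), keyed by bin start."""
    bins = Counter((x // width) * width for x in data)
    return dict(sorted(bins.items()))

# Histogram-style counts with bins [0, 3), [3, 6), [6, 9).
hist = bin_counts(values, 3)
```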
Density Curve
- The area under the curve represents probability; the curve itself may be called a probability density function.
- Enables us to calculate the probability of a certain range of data values.
Normal distribution
Can be described by mu (population mean) and sigma (standard deviation)
Central Limit Theorem (CLT)
The averages calculated from independent, random samples will approach a normal distribution as sample size increases, regardless of the population distribution from which the samples are taken
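One way to see the CLT is a small simulation. This sketch assumes an exponential (right-skewed) population with mean 1; the function name `sample_means`, the seed, and the sample sizes are arbitrary choices.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

def sample_means(n, reps=2000):
    """Means of `reps` independent random samples of size n
    drawn from a skewed exponential population (mean 1)."""
    return [
        statistics.fmean(random.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]

small = sample_means(2)   # sampling distribution is still clearly skewed
large = sample_means(50)  # sampling distribution is close to normal

# A gap between mean and median is a rough sign of skew;
# it shrinks as the sample size grows.
gap_small = abs(statistics.fmean(small) - statistics.median(small))
gap_large = abs(statistics.fmean(large) - statistics.median(large))
```

With larger n the sample means also cluster more tightly around the population mean, which connects to the standard error of the mean.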
Standard error of the mean
Measures the precision of a sample mean as an estimate of the true population mean, representing the standard deviation of the sampling distribution of the mean
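A short sketch of the usual estimate, SE = s / sqrt(n), where s is the sample standard deviation. The function name `standard_error` and the data are illustrative.

```python
import math
import statistics

def standard_error(data):
    """Estimated standard error of the mean: sample SD divided by sqrt(n)."""
    return statistics.stdev(data) / math.sqrt(len(data))

# Hypothetical sample, for illustration only.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
se = standard_error(sample)
```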
Kurtosis
Quantifies tailedness of a distribution
Standard normal Distribution
Symmetric, bell-shaped probability distribution with a mean of 0 and a standard deviation of 1. Also called a z-distribution.
Characteristics of a standard normal distribution
Mean = 0, SD = 1, skew = 0, kurtosis = 3, and total area under the curve = 1
What percentage is within 1 SD?
About 68%
What percentage is within 2 SD?
About 95%
What percentage is within 3 SD?
About 99.7%
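These percentages can be checked as areas under the standard normal curve. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `within` is an illustrative choice.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, SD 1

def within(k):
    """Area under the standard normal curve between -k and +k SD."""
    return z.cdf(k) - z.cdf(-k)

p1 = within(1)  # about 0.68
p2 = within(2)  # about 0.95
p3 = within(3)  # about 0.997
```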
Advantages of standard normal distribution
1) Easier to calculate probabilities
2) Can directly compare different distributions
What is a z-score?
Measures a value's relationship to the mean of a group in terms of standard deviations
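As a worked sketch, z = (value - mean) / SD. The function name `z_score` and the numbers are made up for illustration.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# A value of 85 in a group with mean 70 and SD 10 sits 1.5 SD above the mean.
z = z_score(85, 70, 10)
```

Because z-scores are unitless, values from different distributions can be compared directly once each is converted.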
Confidence Interval
Calculates the margin of error around a statistic
Standard deviation
Communicates the spread of a symmetrical dataset; not influenced by sample size and good for illustrating data structure
Standard error
Communicates how precisely the sample mean estimates the population mean; error bars of ±1 SE cover roughly a 68% range; gets smaller as n increases
Confidence Interval
Communicates a range likely to contain the population mean; bars indicate a 90%, 95%, or 99% confidence level
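A rough sketch of a confidence interval for a mean, computed as mean ± (critical value × standard error). It assumes a z critical value is appropriate (large sample or known population SD); for small samples a t critical value would normally be used instead. The function name `mean_ci` and the data are illustrative.

```python
import math
import statistics
from statistics import NormalDist

def mean_ci(data, level=0.95):
    """Approximate confidence interval for the mean using a z critical value."""
    n = len(data)
    mean = statistics.fmean(data)
    se = statistics.stdev(data) / math.sqrt(n)       # standard error of the mean
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for 95%
    margin = z_crit * se                             # margin of error
    return mean - margin, mean + margin

# Hypothetical sample, for illustration only.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
low95, high95 = mean_ci(sample, 0.95)
low99, high99 = mean_ci(sample, 0.99)  # higher confidence gives a wider interval
```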
Margin of error
Degree of uncertainty in an estimate; half the width of a confidence interval
Null Hypothesis
The hypothesis tested when performing a significance test
Alternate Hypothesis
One or two sided hypotheses that differ from the null hypothesis
Which hypothesis uses =?
Null
Which hypothesis uses ≠?
Two-sided alternative hypothesis
Which hypothesis uses < or >?
One-sided alternative hypothesis
Steps in a Z-test
1) State Ho & Ha
2) Calculate a test statistic that shows how "compatible" the collected data are with Ho
3) Based on how "compatible" the data are with Ho, form a conclusion about the null hypothesis
4) Clearly state conclusion
p > alpha
Fail to reject Ho
p <= alpha
Reject Ho
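The decision rule can be sketched as a two-sided one-sample z-test. This assumes the population SD (`sigma`) is known; the function name `z_test` and all numbers are hypothetical.

```python
import math
from statistics import NormalDist

def z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-sided one-sample z-test; returns (z, p, decision)."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))  # test statistic
    p = 2 * (1 - NormalDist().cdf(abs(z)))            # two-sided p-value
    decision = "reject Ho" if p <= alpha else "fail to reject Ho"
    return z, p, decision

# z = 2.0, p is about 0.046, so p <= alpha
z1, p1, d1 = z_test(sample_mean=52, mu0=50, sigma=5, n=25)
# z = 1.0, p is about 0.32, so p > alpha
z2, p2, d2 = z_test(sample_mean=51, mu0=50, sigma=5, n=25)
```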
What does a conclusion look like?
We have evidence to ________ the null hypothesis
Biological Hypothesis
A clearly stated, proposed description of how a system works that includes:
1) Biological Mechanism
2) Testable statement of cause and effect
Biological Prediction
Follows logically from hypothesis
1) Explain what will happen under conditions set
2) Measurable dependent variable
Biological vs. Statistical hypothesis
The biological hypothesis is the intellectual driver behind the research, while the statistical hypothesis is the tool that provides a logical framework for data analysis
Statistical hypothesis
Testable claims about a population parameter that are evaluated using sample data and involve two mutually exclusive statements (Ho & Ha)
Types of error
Type 1: Rejecting an Ho that is true
- Probability set by alpha
Type 2: Failing to reject an Ho that is false
- Probability given by beta
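To illustrate that alpha sets the Type 1 error rate, this sketch repeatedly runs a two-sided z-test on data for which Ho is actually true and counts how often Ho is rejected anyway. The population (normal, mean 0, SD 1), the seed, the function name `false_positive_rate`, and the trial counts are all assumptions made for the simulation.

```python
import math
import random
from statistics import NormalDist, fmean

random.seed(1)  # fixed seed so the run is repeatable

def false_positive_rate(alpha=0.05, n=30, trials=4000):
    """Fraction of tests that reject Ho (mean = 0) when Ho is true."""
    std_normal = NormalDist()
    rejections = 0
    for _ in range(trials):
        sample = [random.gauss(0, 1) for _ in range(n)]  # Ho holds here
        z = fmean(sample) / (1 / math.sqrt(n))           # known sigma = 1
        p = 2 * (1 - std_normal.cdf(abs(z)))
        if p <= alpha:
            rejections += 1                              # a Type 1 error
    return rejections / trials

rate = false_positive_rate()  # expected to land close to alpha
```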
Why do we use statistics?
To help us sort among hypotheses to find the best explanations of what we are observing
t vs z distribution
t is used when the population standard deviation is unknown
What is a one-sample t-test
1) Compares a sample mean to an expected mean
2) population standard deviation is unknown
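A minimal sketch of the test statistic, t = (sample mean - expected mean) / (s / sqrt(n)) with n - 1 degrees of freedom. The function name `one_sample_t` and the data are illustrative. Turning t into a p-value needs the t-distribution, which the Python standard library does not provide; a statistics package such as SciPy (`scipy.stats.ttest_1samp`) is one option.

```python
import math
import statistics

def one_sample_t(data, mu0):
    """Return (t statistic, degrees of freedom) for a one-sample t-test."""
    n = len(data)
    se = statistics.stdev(data) / math.sqrt(n)  # SD estimated from the sample
    t = (statistics.fmean(data) - mu0) / se
    return t, n - 1

# Hypothetical sample compared against an expected mean of 4.
t_stat, df = one_sample_t([2, 4, 4, 4, 5, 5, 7, 9], mu0=4)
```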
Assumptions of a one-sample t-test
1) Data are independent
2) Data are randomly sampled
3) Data are normally distributed (check with a histogram and Q-Q plot)
t-test qualifications for sample size
n < 15: distribution must be even (approximately normal)
15 <= n < 40: slight skew is acceptable
n >= 40: acceptable as long as there are no outliers
Non-parametric testing
Shapiro-Wilk test checks normality; the closer the test statistic (W) is to 1, the more normal the data