What are the population parameter and sample statistic for the mean?
mu (Greek letter μ) for the population and x-bar (x̄) for the sample
What are the population parameter and sample statistic for the standard deviation and variance?
Population parameters: sigma² (variance) and sigma (standard deviation)
Sample statistics: s² (variance) and s (standard deviation)
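A quick sketch of the difference in practice, using Python's built-in `statistics` module. The data values are made up for illustration; the only point is that the population formulas divide by N while the sample formulas divide by n - 1.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values for illustration

# Treating the data as the whole population: divide by N
pop_var = statistics.pvariance(data)  # sigma²
pop_sd = statistics.pstdev(data)      # sigma

# Treating the data as a sample from a larger population: divide by n - 1
samp_var = statistics.variance(data)  # s²
samp_sd = statistics.stdev(data)      # s

print(pop_var, pop_sd, samp_var, samp_sd)
```

The sample versions come out slightly larger because of the smaller divisor.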
What is the difference between descriptive and inferential statistics?
Descriptive statistics summarise and describe the characteristics of a data set, while inferential statistics use sample data to make inferences or predictions about a population.
What are the two main types of quantitative and qualitative data?
The two main types of quantitative data are discrete and continuous, while qualitative data can be classified as nominal and ordinal.
What is the difference between discrete and continuous data?
Discrete data consists of distinct, countable values (usually whole numbers), while continuous data is measured and can take any of the infinitely many values within a given range, including fractions and decimals.
What is the difference between cross-sectional, time-series, and longitudinal data?
Cross-sectional = one point in time, multiple units of interest
Time-series = one unit of interest collected over successive points in time
Longitudinal = repeated observations of the same subjects over a long time
What happens to the mean, median and mode in the following graphs: right-skewed, left-skewed, bell-curve, symmetric uniform?
Right-skewed - mean > median (typically mean > median > mode)
Left-skewed - mean < median (typically mean < median < mode)
Bell-curve - mean = median = mode
Uniform - mean = median (all values equally frequent, so no single mode)
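A small check of the skew rules, as a sketch with made-up data. One large value creates a right tail and pulls the mean above the median; mirroring the data flips the relationship.

```python
import statistics

# Made-up data with a long right tail (the 20)
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]
# Mirror image, so the long tail is on the left
left_skewed = [-x for x in right_skewed]

right_mean = statistics.mean(right_skewed)
right_median = statistics.median(right_skewed)
right_mode = statistics.mode(right_skewed)

left_mean = statistics.mean(left_skewed)
left_median = statistics.median(left_skewed)

print(right_mean, right_median, right_mode)
print(left_mean, left_median)
```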
Percentiles calculate what?
The percentage of values AT OR BELOW a given value
The 50th percentile is interchangeable with what other two descriptive statistics?
The 2nd quartile and the median
What is the IQR and how do you get it?
A measure of the spread of the middle 50% of data
= Q3 (75th percentile) - Q1 (25th percentile)
How are the whiskers of a boxplot calculated?
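Upper limit = Q3 + 1.5 × IQR, lower limit = Q1 - 1.5 × IQR. The whiskers extend to the most extreme data points inside these limits; anything beyond them is plotted as an outlier.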
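The IQR and the 1.5 × IQR limits can be sketched as follows. The data is made up, and quartile conventions differ between textbooks and software, so `method="inclusive"` is only one possible choice and may not match a given formula sheet.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 30]  # made-up values; 30 is a deliberate outlier

# statistics.quantiles with n=4 returns the three quartile cut points
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr

# Points outside the limits are outliers
outliers = [x for x in data if x < lower_limit or x > upper_limit]
# The whiskers stop at the most extreme points still inside the limits
inside = [x for x in data if lower_limit <= x <= upper_limit]
whisker_low, whisker_high = min(inside), max(inside)

print(q1, q3, iqr, lower_limit, upper_limit, outliers)
```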
What are the 3 main types of statistics?
Measures of central tendency (mean, median, mode)
Measures of relative standing (percentiles, quartiles, IQR)
Measures of variability (range, variance, standard deviation, CV)
What is the coefficient of variation (CV)? (aka what does it show)
the ratio of the standard deviation to the mean (as a percentage) - on formula sheet x
shows the spread of data relative to the mean, allowing variability to be compared across data sets with different units or scales!!
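A sketch of why the CV is useful, with made-up heights (cm) and weights (kg). The helper name `coefficient_of_variation` is just for illustration, and it uses the sample standard deviation.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return statistics.stdev(values) / statistics.mean(values) * 100

heights_cm = [160, 170, 180]  # made-up values
weights_kg = [50, 70, 90]     # made-up values

cv_heights = coefficient_of_variation(heights_cm)
cv_weights = coefficient_of_variation(weights_kg)

print(f"heights: {cv_heights:.2f}%  weights: {cv_weights:.2f}%")
```

The raw standard deviations (10 cm and 20 kg) cannot be compared directly, but the percentages can: the weights vary more relative to their mean.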
What’s the difference between covariance and the correlation coefficient?
Covariance measures only the direction of the association between two variables (its size depends on the units), whereas the correlation coefficient is a standardised, unitless version that measures both the direction and the strength of the linear relationship.
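A minimal sketch of the difference, with made-up data and hand-written helper functions (`sample_cov` and `correlation` are illustrative names, not from any library). Rescaling x changes the covariance but leaves the correlation untouched.

```python
from math import sqrt

def sample_cov(xs, ys):
    """Sample covariance: sum of products of deviations, divided by n - 1."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    return sample_cov(xs, ys) / sqrt(sample_cov(xs, xs) * sample_cov(ys, ys))

x = [1, 2, 3, 4, 5]          # made-up values
y = [2, 4, 5, 4, 5]          # made-up values
x_scaled = [100 * v for v in x]  # same data in different units

cov_xy = sample_cov(x, y)
cov_scaled = sample_cov(x_scaled, y)
r_xy = correlation(x, y)
r_scaled = correlation(x_scaled, y)

print(cov_xy, cov_scaled, round(r_xy, 4), round(r_scaled, 4))
```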
What’s the difference between nominal and ordinal data?
nominal = labelling, ordinal = rank order
How do you test if two events are independent?
Check whether their joint probability equals the product of their marginal probabilities: P(A and B) = P(A) × P(B)
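The independence check can be sketched with a single fair die. The events and helper names below are made up for illustration; exact fractions are used so the equality test is not affected by rounding.

```python
from fractions import Fraction

outcomes = range(1, 7)  # faces of one fair die

def prob(event):
    """Probability of an event, given as a true/false function of the outcome."""
    return Fraction(sum(1 for o in outcomes if event(o)), 6)

def independent(a, b):
    """True when P(A and B) equals P(A) * P(B)."""
    joint = prob(lambda o: a(o) and b(o))
    return joint == prob(a) * prob(b)

def is_even(o):
    return o % 2 == 0

def at_most_4(o):
    return o <= 4

def at_most_3(o):
    return o <= 3

print(independent(is_even, at_most_4))  # 1/3 == 1/2 * 2/3
print(independent(is_even, at_most_3))  # 1/6 != 1/2 * 1/2
```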
What are the units of standard deviation and variance?
variance = squared units of x
standard deviation = same units as x
What’s the difference between bivariate and binomial probability distributions?
Bivariate = two random variables happening at the same time - joint and marginal probability tables, calculating conditional from there etc
Binomial = models independent trials that have only two outcomes - success or failure
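For the binomial case, the probability of exactly k successes in n independent trials can be sketched with the standard formula C(n, k) × p^k × (1 - p)^(n - k). The values n = 10 and p = 0.5 are made up for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.5  # e.g. 10 fair coin flips

prob_five = binomial_pmf(5, n, p)  # exactly 5 successes
total = sum(binomial_pmf(k, n, p) for k in range(n + 1))  # all outcomes

print(prob_five, total)
```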
What happens to conditional probability P(Y = y | X = x) and E(Y | X = x) if X and Y are independent?
P(Y = y | X = x) = P(Y = y) and E(Y | X = x) = E(Y), i.e. knowing X tells you nothing about Y
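A sketch of this with a small made-up joint table. The marginals `p_x` and `p_y` are invented, and the joint table is built as their product, which is exactly what independence means.

```python
from fractions import Fraction as F

# Made-up marginal distributions
p_x = {0: F(1, 4), 1: F(3, 4)}
p_y = {1: F(1, 2), 2: F(1, 3), 3: F(1, 6)}

# Under independence, each joint cell is the product of its marginals
joint = {(x, y): p_x[x] * p_y[y] for x in p_x for y in p_y}

def cond_prob(y, x):
    """P(Y = y | X = x) = P(X = x, Y = y) / P(X = x)."""
    return joint[(x, y)] / p_x[x]

def cond_mean(x):
    """E(Y | X = x) = sum of y * P(Y = y | X = x)."""
    return sum(y * cond_prob(y, x) for y in p_y)

mean_y = sum(y * p for y, p in p_y.items())  # unconditional E(Y)

print(cond_prob(2, 0), p_y[2])
print(cond_mean(0), cond_mean(1), mean_y)
```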
What is the definition of a normal distribution? (aka by what two parameters is it defined and how do those impact the distribution?)
X ~ N(mu, sigma²)
Mean → location, shifts left and right
Variance → spread, changes the shape, the higher Var the wider and flatter the curve
How much probability mass lies within mu ± 1, 2, and 3 standard deviations according to the empirical (68-95-99.7) rule?
mu ± sigma = 68.27%
mu ± 2sigma = 95.45%
mu ± 3sigma = 99.73%
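These percentages can be recomputed from the standard normal distribution. A minimal sketch, using the identity P(|Z| ≤ k) = erf(k / √2); the function name `coverage` is made up for illustration.

```python
import math

def coverage(k):
    """Probability that a normal variable falls within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"mu ± {k} sigma: {coverage(k):.2%}")
# prints 68.27%, 95.45%, 99.73%
```

The result does not depend on the particular mu and sigma, because standardising any normal variable gives the same Z.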
What happens to the t distribution as the degrees of freedom changes?
As the degrees of freedom increase, the tails get thinner (extreme values become less likely) and the distribution approaches the standard normal. There is therefore also a higher concentration around the mean.
What makes a sample “good” (i.e. one that avoids systematic bias)?
1) Representative
2) Random
3) Independent
4) Adequate Size