weeks 5-9
Probability and Statistical inference
Probability matters in stats - probability provides a way of quantifying uncertainty and is the foundation of statistical inference
Why does probability matter?
Sample results vary because observations depend on which individuals are included
Probability provides a framework for understanding and modelling this variability
Statistical inference uses probability to link sample data to population conclusions
Parameters, statistics and uncertainty
All sample statistics involve uncertainty because of sampling variation
Sampling distributions
A sampling distribution describes how a statistic varies across repeated random samples from the same population
it is a distribution of statistics not raw data
it is usually theoretical or simulated
The sampling distribution of the sample mean: how is it formed?
take a large number of random samples of the same size (n) from the same population
calculate the mean for each sample
plot all of those sample means
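The three steps above can be sketched as a small simulation. This is a minimal illustration, not a prescribed method: the population (10,000 skewed scores), the sample size of 30, the number of samples (2,000) and the function name simulate_sample_means are all assumptions made for the example, and the sample means are summarised with numbers rather than plotted.

```python
import random
import statistics

def simulate_sample_means(population, n, num_samples, seed=0):
    """Draw num_samples random samples of size n and return each sample's mean."""
    rng = random.Random(seed)
    means = []
    for _ in range(num_samples):
        sample = rng.sample(population, n)      # one random sample of size n
        means.append(statistics.mean(sample))   # the statistic for that sample
    return means

# Hypothetical population: 10,000 scores from a skewed (exponential) distribution
pop_rng = random.Random(42)
population = [pop_rng.expovariate(1 / 50) for _ in range(10_000)]

sample_means = simulate_sample_means(population, n=30, num_samples=2_000)

# Summarise the simulated sampling distribution instead of plotting it
print("population mean:     ", round(statistics.mean(population), 2))
print("mean of sample means:", round(statistics.mean(sample_means), 2))
print("SD of raw scores:    ", round(statistics.pstdev(population), 2))
print("SD of sample means:  ", round(statistics.pstdev(sample_means), 2))
```

The list sample_means is a distribution of statistics, not raw data: its centre should sit close to the population mean, and its spread should be much smaller than the spread of the raw scores.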
Key properties of sampling distribution
Centre: the mean of the sampling distribution equals the population mean
Spread: the variability of the sampling distribution is smaller than the variability of individual scores
Shape: as sample size increases, the sampling distribution becomes approximately normal
Standard error: measuring uncertainty in estimates (SEM)
The standard error quantifies the variability of a statistic across repeated samples
the standard error of the mean (SEM) describes how much sample means typically differ from the population mean
what is the SEM formula?
SEM = SD / √n
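As a worked example of the formula, the sketch below computes the SEM for a small set of scores. The scores and the function name standard_error_of_mean are hypothetical, and the sample SD (n - 1 denominator) is assumed, as is usual when estimating from a sample.

```python
import math
import statistics

def standard_error_of_mean(scores):
    """Return SEM = SD / sqrt(n), using the sample SD (n - 1 denominator)."""
    n = len(scores)
    sd = statistics.stdev(scores)
    return sd / math.sqrt(n)

scores = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical scores, n = 8
sem = standard_error_of_mean(scores)

print("SD: ", round(statistics.stdev(scores), 3))
print("SEM:", round(sem, 3))
```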
Key ideas
SD = describes variability in individual scores
SEM = describes variability in sample means - decreases as sample size increases
larger SEM = less certainty in the estimate
smaller SEM = more precise estimate
Sample size and precision
larger samples reduce sampling variation
larger samples lead to smaller standard errors
larger samples produce more precise estimates of population parameters
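A quick calculation shows how precision improves with sample size. The SD of 15 and the three sample sizes are assumed values chosen only for illustration.

```python
import math

sd = 15.0   # hypothetical SD of individual scores
sems = {}
for n in (25, 100, 400):
    sems[n] = sd / math.sqrt(n)   # SEM = SD / sqrt(n)
    print(f"n={n}: SEM={sems[n]:.2f}")

# n=25: SEM=3.00
# n=100: SEM=1.50
# n=400: SEM=0.75
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.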
why estimation requires probability
model how samples behave when drawn at random from a population
quantify uncertainty in our estimates
make statements about plausible population values
what is correlational research?
correlational research examines the relationship between two or more measured variables, focusing on how they co-vary
association claims and causal claims
most correlational studies make association claims, not causal claims
3 criteria for causation
covariation
temporal precedence
elimination of alternative explanations
correlation coefficient (r)
the correlation coefficient describes the strength and direction of the relationship between two variables
Key properties
Range: r ranges from -1 to +1
Direction: positive = as one variable increases, the other increases, negative = as one variable increases, the other decreases
approximate benchmarks:
small: r = .10
medium: r = .30
large: r = .50
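To make the properties above concrete, here is a minimal sketch that computes Pearson's r from scratch. The variable names and data (hours studied and test score for five students) are hypothetical, and the helper pearson_r is written for illustration rather than taken from any library.

```python
import math

def pearson_r(x, y):
    """Pearson correlation: co-variation of x and y scaled by their variability."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

hours = [1, 2, 3, 4, 5]   # hypothetical hours studied
score = [2, 4, 5, 4, 5]   # hypothetical test scores

r = pearson_r(hours, score)
r_reversed = pearson_r(hours, [-s for s in score])  # flip one variable's direction

print(round(r, 2))           # 0.77  (positive, above the "large" benchmark)
print(round(r_reversed, 2))  # -0.77 (same strength, negative direction)
```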
what do scatterplots show?
direction of the relationship
strength of the relationship
shape of the relationship
presence of outliers
threats to statistical validity
outliers
restriction of range
curvilinear relationship
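As an illustration of the first threat, the sketch below shows how a single outlier can change r. The data are invented for the example, and the pearson_r helper is a hypothetical from-scratch implementation.

```python
import math

def pearson_r(x, y):
    """Pearson correlation computed from deviations around the means."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

x = [1, 2, 3, 4, 5]   # hypothetical data with a weak relationship
y = [3, 1, 4, 2, 3]

r_without = pearson_r(x, y)
r_with = pearson_r(x + [15], y + [15])   # add one extreme point at (15, 15)

print("r without outlier:", round(r_without, 2))
print("r with outlier:   ", round(r_with, 2))
```

In this invented data set, one extreme point moves r from weak to very strong, which is why a scatterplot should be checked before interpreting r.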
confidence intervals for correlations
confidence intervals (CIs) provide information about the precision of an estimated correlation
a 95% confidence interval for a correlation
indicates a range of plausible population correlation values
reflects uncertainty due to sampling variability
interpretation of CIs
narrow CI = more precise estimate
wide CI = less precise estimate
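The notes do not say how the interval is calculated. One common approach, assumed here, is Fisher's z-transformation with an approximate standard error of 1 / √(n - 3). The sketch compares the same r = .30 at two hypothetical sample sizes; the function name correlation_ci is illustrative.

```python
import math
from statistics import NormalDist

def correlation_ci(r, n, confidence=0.95):
    """Approximate CI for a population correlation (Fisher z method, assumed here)."""
    z = math.atanh(r)                                   # transform r to the z scale
    se = 1 / math.sqrt(n - 3)                           # approximate SE on the z scale
    crit = NormalDist().inv_cdf(0.5 + confidence / 2)   # about 1.96 for 95%
    lower, upper = z - crit * se, z + crit * se
    return math.tanh(lower), math.tanh(upper)           # back-transform to the r scale

small_lo, small_hi = correlation_ci(0.30, n=50)
large_lo, large_hi = correlation_ci(0.30, n=500)

print(f"n=50:  [{small_lo:.2f}, {small_hi:.2f}]")
print(f"n=500: [{large_lo:.2f}, {large_hi:.2f}]")
```

The larger sample gives the narrower interval, so the same r is estimated more precisely.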
Association claims
association claims should be evaluated using multiple forms of validity
construct validity
concerned with how well the variables are measured and defined
statistical validity
concerned with whether the statistical conclusions are accurate and reasonable
external validity
concerned with whether the findings generalise beyond the study
internal validity
concerned with whether a causal conclusion can be made
key features of simple linear regression
describes the relationship using an equation
predicts values of one variable from another
identifies the line of best fit
what are the intercept (b0) and slope (b1)?
intercept: predicted value of Y when X = 0
slope: expected change in Y for a one-unit increase in X
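A minimal sketch of fitting the line Y = b0 + b1 * X, assuming that "line of best fit" means ordinary least squares. The data (hours studied and test score) and the function name fit_line are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares estimates of intercept (b0) and slope (b1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    b1 = sum_xy / sum_xx          # slope: change in Y per one-unit increase in X
    b0 = mean_y - b1 * mean_x     # intercept: predicted Y when X = 0
    return b0, b1

hours = [1, 2, 3, 4, 5]   # hypothetical predictor (X)
score = [2, 4, 5, 4, 5]   # hypothetical outcome (Y)

b0, b1 = fit_line(hours, score)
predicted = b0 + b1 * 6   # predicted score for X = 6 hours

print(f"b0 = {b0:.1f}, b1 = {b1:.1f}, prediction at X=6: {predicted:.1f}")
# b0 = 2.2, b1 = 0.6, prediction at X=6: 5.8
```

With these numbers, each extra hour is associated with an expected increase of 0.6 points, and the predicted score at 0 hours is 2.2.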