Random trial
A process that has multiple outcomes where the result on any particular trial is unknown
Sample space
Set of all possible outcomes. Shown with {a,b,v,c}
Event
The outcome we’re interested in. Shown as E = {tail} or E = {2,3,4}
Discrete vs continuous variables
The outcome of a random trial can be either a discrete or a continuous variable
Random trials and sampling
A random trial is the act of selecting a sampling unit and taking a measurement. Sample space is {more than 5 mins, less than 5 mins}… E = {more than 5 mins}
Probability of an event
proportion of times that an event would occur if a random trial was repeated a large number of times
Probability and law of large numbers
With a small number of trials, the observed proportion varies a lot, but with more trials it converges on a single proportion (variation between trials becomes less significant)
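A minimal sketch of this convergence, assuming a fair coin as the random trial (the coin, the seed, and the function name `proportion_of_heads` are illustrative assumptions, not part of the notes). The proportion of heads should wander for small numbers of trials and settle near 0.5 as the number of trials grows.

```python
import random


def proportion_of_heads(n_trials, seed=0):
    """Simulate n_trials fair coin flips and return the proportion of heads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(n_trials))
    return heads / n_trials


# Proportion of heads for an increasing number of trials.
for n in (10, 100, 10_000, 100_000):
    print(n, proportion_of_heads(n))
```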
Probability distributions
functions that describe probability over a range of events. Show the probability of observing an outcome within a range of events as the area under the curve
describe probability for entire sample space
the area under the curve always equals one
can be used to describe continuous and discrete random variables
Probability distributions for discrete variables
Shown as series of vertical bars, no space between them. Each event gets a separate bar, area of the bar = probability of that event. Vertical axis = probability mass
Probability distribution for continuous variables
a single curve, the area under a part of the curve is the probability for events within a specific range. Vertical axis = probability density. If the range is 0, the probability is also 0. “What’s the probability of someone having waited exactly 5 minutes?” Zero.
Calculating probability from a distribution
Area under the distribution curve for a given range
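Written as a formula, assuming a continuous variable X with probability density f(x) and a range from a to b (these symbols are introduced here for illustration and do not appear in the notes):

```latex
P(a \le X \le b) = \int_a^b f(x)\,dx,
\qquad
\int_{-\infty}^{\infty} f(x)\,dx = 1,
\qquad
P(X = a) = \int_a^a f(x)\,dx = 0
```

The last term is why the probability of waiting exactly 5 minutes is zero: a range of width 0 has no area under the curve.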
Range from a distribution
What range contains p=____? (eg. “what daily minimum depth of snow do we expect 50% of the time during the winter”)
Standard Normal Distribution
a normal distribution with a mean of 0, standard deviation of 1, and the x axis is the z score (how many standard deviations the value is from the mean)
to convert a value to the standard normal distribution:
z score = (value - mean) / standard deviation
look at the z-score distribution to find the probability of the event range
If you’re trying to find a range based on a probability:
find the z score that corresponds to that probability on the standard normal distribution (mean 0, standard deviation 1)
set that equal to (value - mean) / standard deviation for your variable and solve for the threshold value
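Both directions can be sketched with Python's standard-library `statistics.NormalDist`. The mean of 10 and standard deviation of 2 are made-up values for illustration, and the helper names are hypothetical.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def z_score(value, mean, sd):
    """How many standard deviations the value is from the mean."""
    return (value - mean) / sd


def prob_below(value, mean, sd):
    """P(X < value): area under the standard normal curve left of the z score."""
    return standard_normal.cdf(z_score(value, mean, sd))


def threshold_for(prob, mean, sd):
    """Value below which a fraction `prob` of outcomes fall.

    Finds the z score for the probability, then solves
    z = (value - mean) / sd for value.
    """
    z = standard_normal.inv_cdf(prob)
    return mean + z * sd


# Hypothetical variable with mean 10 and standard deviation 2.
mean, sd = 10.0, 2.0
print(prob_below(12.0, mean, sd))                            # area left of z = 1
print(prob_below(12.0, mean, sd) - prob_below(8.0, mean, sd))  # area between z = -1 and z = 1
print(threshold_for(0.5, mean, sd))                          # value that splits the area in half
```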
Descriptive statistics
describe the characteristics of a specific sample for each measurement variable = you can make statements that apply to just your sample.
each measurement variable has its own set of descriptive statistics
descriptive statistics are any quantifiable characteristic of a sample
they’re labeled using latin alphabet (p,m,s)
Population parameters
describe sampling population… quantifiable characteristic of a statistical population (eg. average of entire statistical population)
population parameter = fixed value
each measurement variable has its own set of population parameters
descriptive statistics = not fixed (2 different samples have different mean values)
Estimation
descriptive statistics provide an estimate of the population parameter.
estimating =
Sampling distribution
probability distribution of a descriptive statistic (like mean) that would emerge if a statistical population was sampled repeatedly a large number of times
can do it for any descriptive statistic
Shape of the sampling distribution is independent of the shape of the statistical population as long as the sample is big enough (will be a bell even if the population distribution is spiky or whatever)
As sample size increases, variation between sample means decreases (they are more likely to be concentrated around the mean of the statistical population)
Central limit theorem
sampling distribution tends towards normal distribution as sample size gets larger
the mean of the sampling distribution is the same as the mean of the statistical population
Has a standard error (the standard deviation of the sampling distribution) equal to standard deviation of statistical population / root(sample size)
BUT this assumes we know the statistical population perfectly… in practice we have to estimate its standard deviation from the sample
Standard error
Standard deviation of a sampling distribution (SE = standard deviation of statistical population / root(sample size))
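A rough simulation to check the formula, assuming a skewed (exponential) statistical population with mean 1 and standard deviation 1. The population, sample size, number of samples, and seed are all illustrative choices, not values from the notes.

```python
import math
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable


def draw_sample(n):
    """Draw one sample of size n from an exponential population (mean 1, sd 1)."""
    return [rng.expovariate(1.0) for _ in range(n)]


def sample_means(sample_size, n_samples=5000):
    """Repeatedly sample the population and record each sample's mean."""
    return [statistics.fmean(draw_sample(sample_size)) for _ in range(n_samples)]


sample_size = 50
means = sample_means(sample_size)

# Standard deviation of the simulated sampling distribution of the mean.
observed_se = statistics.stdev(means)
# SE predicted by the formula: population sd / root(sample size).
predicted_se = 1.0 / math.sqrt(sample_size)

print(observed_se, predicted_se)
print(statistics.fmean(means))  # should sit close to the population mean of 1
```

Even though the population is skewed, the two standard errors should come out close, and a histogram of `means` would look roughly bell-shaped.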
Chain of inference
statistical population and sampling distribution are not directly observed, only the sample.
observation of sample > infer about statistical population parameters > estimate sampling distribution
BUT this assumes we know the statistical population perfectly… in practice we have to estimate its standard deviation from the sample
Student’s T distribution
looks like a normal distribution, but with fatter tails to account for larger uncertainty
as sample size increases, looks more like normal distribution
basically to accommodate the chain of inference: we don’t know the standard deviation of the statistical population, so we use our sample to estimate it, but that estimate could be wrong, meaning our standard error could be wrong too.
Confidence intervals
describe range over x-axis of a sampling distribution that brackets a certain probability of where new samples may be found with ___ amount of certainty.
estimate of how much uncertainty there is due to variation from sampling error
can show with dots, and the lines through them = confidence interval
start at the middle, move the lines apart until you get the probability/% of certainty you’re looking for
find interval (left and right t scores) that brackets that probability
convert t scale back to raw scale (t = (x - mean) / standard error, so x = mean + t × standard error)
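Put together, the steps give the usual interval for a mean. The symbols are assumptions introduced here: x̄ is the sample mean, s the sample standard deviation, n the sample size, and t* the t score (from a t distribution with n − 1 degrees of freedom) that brackets the chosen probability.

```latex
\mathrm{SE} = \frac{s}{\sqrt{n}},
\qquad
\text{lower} = \bar{x} - t^{*}\,\mathrm{SE},
\qquad
\text{upper} = \bar{x} + t^{*}\,\mathrm{SE}
```

Both limits come from solving t = (x − x̄)/SE for x at the left (−t*) and right (+t*) t scores.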