Stats midterm- Units 5-6


Random trial

A process that has multiple outcomes where the result on any particular trial is unknown


Sample space

Set of all possible outcomes. Written in set notation, e.g. {a, b, c}


Event

The outcome (or set of outcomes) we’re interested in. Written as E = {tail} or E = {2, 3, 4}


Discrete vs continuous variables

Random trials can involve either discrete or continuous variables


Random trials and sampling

In sampling, the random trial is the act of selecting a sampling unit and taking a measurement. E.g. for a wait time, the sample space is {more than 5 mins, less than 5 mins} and the event is E = {more than 5 mins}


Probability of an event

Proportion of times that an event would occur if a random trial were repeated a large number of times


Probability and law of large numbers

With a small number of trials, the observed proportion varies a lot from one run to the next, but with more trials it converges on a single proportion, the probability (variation between trials becomes less significant)
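
A quick way to see the law of large numbers is to simulate it. This is only a sketch: the function name `event_proportion`, the event probability of 0.5 (a fair coin) and the fixed seed are assumptions for illustration, not part of the course notes.

```python
import random

def event_proportion(n_trials, p_event=0.5, seed=1):
    """Run n_trials random trials and return the proportion where the event occurred."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(n_trials) if rng.random() < p_event)
    return hits / n_trials

# The proportion wanders with few trials and settles near 0.5 with many.
for n in (10, 100, 10_000, 100_000):
    print(n, event_proportion(n))
```

Changing the seed changes the small-n results noticeably, while the large-n results barely move.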


Probability distributions

Functions that describe probability over a range of events. They show the probability of observing an outcome within a range of events as the area under the curve

describe probability for entire sample space
the area under the curve always equals one
can be used to describe continuous and discrete random variables


Probability distributions for discrete variables

Shown as a series of vertical bars with no space between them. Each event gets a separate bar, and the area of the bar = probability of that event. Vertical axis = probability mass


Probability distributions for continuous variables

A single curve; the area under a part of the curve is the probability for events within a specific range. Vertical axis = probability density. If the range is 0, the probability is also 0. “What’s the probability of someone having waited exactly 5 minutes?” Zero.


Calculating probability from a distribution

Area under the distribution curve for a given range
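
As a rough illustration, the area for a range can be computed as the difference between two cumulative probabilities. The sketch below assumes wait times follow a normal distribution with mean 5 and standard deviation 2 minutes; those numbers and the helper name `prob_between` are made up for the example.

```python
from statistics import NormalDist

def prob_between(dist, low, high):
    """Area under the curve between low and high = P(low < X < high)."""
    return dist.cdf(high) - dist.cdf(low)

# Hypothetical wait-time distribution (assumed normal, in minutes).
wait = NormalDist(mu=5, sigma=2)

print(prob_between(wait, 3, 7))  # within one standard deviation, roughly 0.68
print(prob_between(wait, 5, 5))  # a range of width 0 has probability 0.0
```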


Range from a distribution

What range contains p = ____? (e.g. “what daily minimum depth of snow do we expect 50% of the time during the winter?”)


Standard Normal Distribution

A normal distribution with a mean of 0 and a standard deviation of 1, where the x axis is the z score (how many standard deviations the value is from the mean)

To convert a value to the standard normal distribution:
- (value - mean)/standard deviation = z score
- look at the z-score distribution to find the probability of the event range
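
The two steps above can be sketched in code. The mean of 5, standard deviation of 2 and value of 8 are invented numbers, and `z_score` is just an illustrative helper name.

```python
from statistics import NormalDist

def z_score(value, mean, sd):
    """How many standard deviations the value sits from the mean."""
    return (value - mean) / sd

standard_normal = NormalDist(mu=0, sigma=1)

z = z_score(8, mean=5, sd=2)        # (8 - 5) / 2 = 1.5
p_below = standard_normal.cdf(z)    # area to the left of z, about 0.93
p_above = 1 - p_below               # area to the right of z

print(z, p_below, p_above)
```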


If you’re trying to find a range based on a probability

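Find the z-score that corresponds to the probability on the standard normal distribution, where z = (x - 0)/1
Set that equal to (value - mean)/standard deviation for your variable and solve for the value to find the threshold number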
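
Going from a probability back to a threshold can be sketched with the inverse of the cumulative distribution. The function name `threshold_for_probability` and the snow-depth numbers (mean 20, standard deviation 6) are assumptions for illustration only.

```python
from statistics import NormalDist

def threshold_for_probability(p, mean, sd):
    """Value below which a proportion p of a normal distribution falls."""
    z = NormalDist().inv_cdf(p)   # z-score with area p to its left
    return mean + z * sd          # solve z = (value - mean) / sd for value

# Hypothetical daily snow depth: mean 20 cm, standard deviation 6 cm.
print(threshold_for_probability(0.50, mean=20, sd=6))  # the mean itself
print(threshold_for_probability(0.90, mean=20, sd=6))  # a higher threshold
```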


Descriptive statistics

describe the characteristics of a specific sample for each measurement variable = you can make statements that apply to just your sample.

each measurement variable has its own set of descriptive statistics
descriptive statistics are any quantifiable characteristic of a sample
they’re labeled using the Latin alphabet (p, m, s)


Population parameters

describe sampling population… quantifiable characteristic of a statistical population (eg. average of entire statistical population)

population parameter = fixed value
each measurement variable has its own set of population parameters
descriptive statistics = not fixed (2 different samples have different mean values)


Estimation

descriptive statistics provide an estimate of the population parameter.

estimating =


Sampling distribution

Probability distribution of a descriptive statistic (like the mean) that would emerge if a statistical population was sampled repeatedly a large number of times

can do it for any descriptive statistic
Shape of the sampling distribution is independent of the shape of the statistical population as long as the sample is big enough (will be a bell even if the population distribution is spiky or whatever)
As sample size increases, variance between samples decreases (sample means are more likely to be concentrated around the mean of the statistical population)
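
A small simulation can make this concrete. The sketch below assumes a skewed (exponential) statistical population with mean 1, draws many samples, and records each sample's mean; the helper name `sample_means`, the number of samples and the seed are all illustrative choices.

```python
import random
from statistics import mean, stdev

def sample_means(sample_size, n_samples=2000, seed=1):
    """Repeatedly sample a skewed population and return the mean of each sample."""
    rng = random.Random(seed)
    return [
        mean([rng.expovariate(1.0) for _ in range(sample_size)])
        for _ in range(n_samples)
    ]

# Centre stays near the population mean (1.0); spread shrinks as n grows.
for n in (2, 10, 50):
    means = sample_means(n)
    print(n, round(mean(means), 2), round(stdev(means), 2))
```

Plotting a histogram of `means` for each sample size would show the shape moving from skewed toward a bell.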


Central limit theorem

sampling distribution tends towards a normal distribution as sample size gets larger
the mean of the sampling distribution is the same as the mean of the statistical population
Has a standard error (standard deviation of the sampling distribution) equal to the standard deviation of the statistical population / square root of sample size
BUT assumes we know the statistical population perfectly… but we have to estimate its standard deviation


Standard error

Standard deviation of a sampling distribution (SE = standard deviation of statistical population / square root of sample size)
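
The formula is short enough to write out directly. In this sketch the sample values are made up, and the sample standard deviation stands in for the unknown population standard deviation, which is an estimate rather than the true value.

```python
from math import sqrt
from statistics import stdev

def standard_error(sd, sample_size):
    """SE = standard deviation / square root of sample size."""
    return sd / sqrt(sample_size)

# With a known population standard deviation of 10:
print(standard_error(10, 25))    # 2.0
print(standard_error(10, 100))   # 1.0, four times the sample size halves the SE

# In practice the population value is unknown, so estimate it from a sample.
sample = [4.1, 5.3, 6.0, 4.8, 5.5, 5.9, 4.4, 5.0]  # made-up measurements
se_estimate = standard_error(stdev(sample), len(sample))
print(se_estimate)
```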


Chain of inference

statistical population and sampling distribution are not directly observed, only the sample.
observation of sample > infer about statistical population parameters > estimate sampling distribution

BUT assumes we know the statistical population perfectly… but we have to estimate its standard deviation


Student’s T distribution

looks like a normal distribution, but with fatter tails to account for larger uncertainty

as sample size increases, looks more like normal distribution
basically to accommodate the chain of inference: we don’t know the standard deviation of the statistical population, so we use our sample to estimate it, but that estimate could be wrong, meaning our standard error could be wrong too.


Confidence intervals

Describe the range over the x-axis of a sampling distribution that brackets a chosen probability, i.e. where new samples are expected to be found with ___ amount of certainty.

estimate of how much uncertainty there is due to variation from sampling error

can show with dots, and the lines through them = confidence interval

start at the middle, move the lines apart until you get the probability/% of certainty you’re looking for
find the interval (left and right t scores) that brackets that probability
convert the t scale back to the raw scale (t = (x - mean)/standard error, so x = mean + t × standard error)
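
The steps above can be sketched as a small function. This is a hedged example: the sample values are invented, `confidence_interval` is an illustrative name, and the t score is passed in by hand (2.262 is the usual table value for 95% with 9 degrees of freedom) rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

def confidence_interval(sample, t_critical):
    """Return (low, high) = sample mean -/+ t_critical * standard error."""
    m = mean(sample)
    se = stdev(sample) / sqrt(len(sample))  # standard error estimated from the sample
    # Convert the left and right t scores back to the raw scale: x = mean + t * SE
    return m - t_critical * se, m + t_critical * se

# Made-up sample of n = 10, so degrees of freedom = 9.
sample = [4.1, 5.3, 6.0, 4.8, 5.5, 5.9, 4.4, 5.0, 5.2, 4.8]
low, high = confidence_interval(sample, t_critical=2.262)
print(low, high)
```

A larger t score (more certainty) or a smaller sample widens the interval.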