data
observations gathered for analysis; numerical or non-numerical
cases
subjects we obtain data about
when making comparisons, we would like to
determine association and determine causation
statistical inference
the process of using data from a sample to gain information about the population
sampling bias
occurs when the method of selecting a sample causes the sample to differ from the population in some relevant way
we try to obtain a sample that is (?) of the population
representative
simple random sampling
every group of n units in the population has the same chance of being chosen as the sample
is a voluntary sample a good sampling method?
no
association
values of one variable tend to be related to values of the other variable
causation
changing the value of the explanatory variable influences the value of the response variable
T/F: association automatically means causation
false; association does not imply causation
confounding variable
third variable that is associated with both the explanatory and response variable
observational study
a study in which the researcher does not actively control the value of any variable, but simply observes the values as they naturally exist
experiment
a study in which the researcher actively controls one or more of the explanatory variables; aka randomized experiment
which study can find causation
experiment
do confounding variables exist in a randomized experiment?
no
three explanations for why association may be observed in sample data
there is a causal relationship or association
there is an association, but it is due to confounding variables
there is no association; it is random chance
how do you avoid confounding variables
use random assignment
randomized comparative experiment
randomly assigning cases to different treatment groups and then comparing the groups on the response variable
matched pairs experiment
each case gets both treatments in random order, and individual differences are examined
random sampling vs random assignment
each unit has same chance of being chosen vs placing units into groups by chance
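The distinction on this card can be sketched in Python; the population and group sizes below are hypothetical:

```python
import random

random.seed(0)
population = list(range(1, 101))   # hypothetical population of 100 units

# Random sampling: every unit has the same chance of being chosen
sample = random.sample(population, 10)

# Random assignment: place the chosen units into two groups by chance
assigned = sample[:]
random.shuffle(assigned)
treatment, control = assigned[:5], assigned[5:]
```

Sampling decides who is studied; assignment decides which treatment each studied unit gets.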
random selection allows us to
make generalizations about the population
random assignment allows us to
make conclusions about causality
frequency
number of times a value is observed in a data set
relative frequency
number of times a value is observed divided by the total number of observations
the sum of all relative frequencies is
1
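A minimal check of the frequency definitions above, using a hypothetical categorical data set:

```python
from collections import Counter

data = ["A", "B", "A", "C", "A", "B"]  # hypothetical categorical data
freq = Counter(data)                   # frequency of each value
rel_freq = {k: v / len(data) for k, v in freq.items()}  # relative frequencies
total = sum(rel_freq.values())         # the relative frequencies sum to 1
```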
one categorical variable summary statistics
frequency table, proportion
one categorical variable visualization
bar or pie chart
two categorical variable summary statistics
two-way table, difference in proportions
two categorical variable visualization
segmented or side-by-side bar chart
mode
the category that occurs most frequently
segmented bar chart
the height of each bar represents the frequency of one categorical variable and the segmented colors split each bar by the other categorical variable
side by side bar chart
separate bar charts are given for each group of one of the categorical variables
visualizing one quantitative variable
dot plot or histogram
shapes
symmetric or skewed
measures of center
mean or median
right skewed
tail of distribution extends out to the right
left skewed
tail of distribution extends out to the left
resistance
we say a statistic is resistant if it is unaffected by extreme values
which measure of center is impacted by outliers
mean
which measure of center is not impacted by outliers
median
left skewed mean vs median
mean < median
right skewed mean vs median
mean > median
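The mean/median relationship on these cards can be verified numerically; the data set below is a hypothetical right-skewed example:

```python
from statistics import mean, median

# Hypothetical data with a long right tail (one large value)
right_skewed = [1, 2, 2, 3, 3, 4, 20]

m = mean(right_skewed)     # pulled toward the tail
md = median(right_skewed)  # resistant to the extreme value
# mean > median, as expected for right skew
```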
standard deviation
a number that measures how far away the typical observation is from the mean
a larger standard deviation means
the data values are more spread out and have more variability
T/F: standard deviation is not affected by outliers and skewness
false
IQR
Q3-Q1
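The IQR formula can be sketched with the standard library; the data are hypothetical, and note that different software uses slightly different quartile conventions:

```python
from statistics import quantiles

data = [1, 2, 3, 4, 5, 6, 7, 8]    # hypothetical quantitative data
q1, q2, q3 = quantiles(data, n=4)  # quartiles (default "exclusive" method)
iqr = q3 - q1                      # IQR = Q3 - Q1
```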
only use the standard deviation as the measure of spread when you are using the (?) as the measure of center
mean
use this measure of center and this measure of spread when skewed
median and IQR
boxplot
graphical representation of the five-number summary
left skewed box plot
median line toward the right side of the box; left whisker longer
right skewed box plot
median line toward the left side of the box; right whisker longer
standard error
the standard deviation of the sample statistics
a low standard error means
statistics vary little from sample to sample
as the sample size increases, the variability of sample statistics tends to (?) and sample statistics tend to be (?) to the true value of the population parameter
decrease; closer
does the shape of the population affect the center of each sampling distribution?
no
confidence interval
an interval, computed from a sample, that captures the population parameter for a specified proportion of all samples
confidence interval formula
sample statistic ± critical value*(SE)
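Plugging hypothetical numbers into this formula (statistic 50, SE 2, and the common 95% critical value of about 1.96):

```python
# Hypothetical values: sample statistic 50, SE 2, 95% critical value ~ 1.96
statistic = 50.0
critical_value = 1.96
se = 2.0

margin = critical_value * se                     # margin of error
ci = (statistic - margin, statistic + margin)    # (46.08, 53.92)
```

The 95% rule below approximates the critical value 1.96 with 2.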
confidence interval interpretation
we are 95% confident that an interval captures the true population parameter
95% rule
if a distribution is approximately symmetric and bell-shaped, about 95% of the data should fall within 2 standard deviations of the mean
95% rule formula
statistic ± 2(SE)
bootstrapping
technique for simulating a sampling distribution when you do not have a population from which to sample
bootstrap sample
sample with replacement from the original sample using the same sample size
bootstrap distribution shape
bell shaped and symmetric
bootstrap distribution center
centered at sample statistic value
bootstrap confidence interval
when symmetric and bell-shaped, statistic ± 2(SE)
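The bootstrap procedure on these cards, sketched end to end with a hypothetical original sample:

```python
import random
from statistics import mean, stdev

random.seed(1)
sample = [12, 15, 14, 10, 18, 16, 11, 13, 17, 14]  # hypothetical original sample

# Bootstrap: resample with replacement, same size as the original sample
boot_means = [mean(random.choices(sample, k=len(sample))) for _ in range(5000)]

se = stdev(boot_means)               # SE = sd of the bootstrap statistics
stat = mean(sample)                  # original sample statistic
ci = (stat - 2 * se, stat + 2 * se)  # statistic ± 2·SE (bell-shaped case)
```

The bootstrap distribution (`boot_means`) is centered at the original sample statistic, not at the population parameter.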
as the confidence level increases, the width of the confidence interval…
increases
as the sample size increases, the width of the confidence interval…
decreases
statistical test
used to determine whether results from a sample are convincing enough to allow us to conclude something about the population
goal of a hypothesis test
assess evidence provided by the sample data to test a claim made about a population parameter
null hypothesis
H0 ("H-naught")
alternative hypothesis
HA
null hypothesis meaning
no change or no difference; the parameter equals the null value (often zero)
alternative hypothesis meaning
claim for which we seek evidence; the parameter differs from the null value
hypothesis tests are always written in (?) notation
population parameter
two-sided hypothesis
H0: p1 = p2
HA: p1 does not equal p2
left-sided hypothesis
H0: p1 = p2
HA: p1 < p2
right-sided hypothesis
H0: p1 = p2
HA: p1 > p2
the null hypothesis is assumed to be (?) throughout the hypothesis test
true
how do you determine the p-value by hand?
count how many simulated statistics are as extreme as (or more extreme than) the observed statistic, then divide by the total number of simulated samples
p-value
proportion of samples that would give a statistic as extreme as the observed sample result when the null hypothesis is true
two-tailed p-value
when the alternative hypothesis contains a does not equal sign, the p-value is twice the proportion of the smallest tail
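The p-value cards above can be sketched as a randomization test; the two groups below are hypothetical, and a right-tailed alternative is used for illustration:

```python
import random
from statistics import mean

random.seed(2)
# Hypothetical two-group data; null hypothesis: no difference in means
group_a = [5, 7, 6, 8, 9]
group_b = [4, 5, 6, 5, 4]
observed = mean(group_a) - mean(group_b)

pooled = group_a + group_b
reps = 10_000
count = 0
for _ in range(reps):
    random.shuffle(pooled)                     # re-randomize group labels
    diff = mean(pooled[:5]) - mean(pooled[5:])
    if diff >= observed:                       # right-tailed: as or more extreme
        count += 1

p_value = count / reps  # divide by the number of simulated samples
```

For a two-tailed alternative, double the proportion in the smaller tail.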
p-value < a
reject null hypothesis
p-value > a
do not reject the null hypothesis
smaller p-values mean the sample results are
more statistically significant (stronger evidence against the null hypothesis)
formal hypothesis test has only two possible conclusions
reject or do not reject the null hypothesis
possible significance levels
0.10, 0.05, 0.01
conclusions of hypothesis tests
conclude in terms of H-A in context of the question
type I error
occurs when we reject a true null hypothesis
type II error
occurs when we do not reject a false null hypothesis
if a = 0.05, there is a (?)% chance of a type I error
5
as the sample size increases, statistics in the randomization distribution will be more closely concentrated around the…
null value
a larger sample size (?) the chance of making a type II error
decreases
two methods of statistical inference
confidence intervals and hypothesis tests
sampling distribution
shows distribution of sample statistics obtained from a population, centered at true value of population parameter
bootstrap distribution
simulates a distribution of sample statistics for the population, centered at value of original sample statistic
randomization distribution
simulates a distribution of sample statistics for a population in which the null hypothesis is true, centered at value stated in null hypothesis
(-L, -U): both endpoints negative
does not capture 0; reject the null hypothesis
(L, U): both endpoints positive
does not capture 0; reject the null hypothesis
(-L, U): lower endpoint negative, upper endpoint positive
captures 0; do not reject the null hypothesis