Stats Exam 1


vocab and formulas + examples


New cards

Treatment group

the group/subject that received the treatment

ex: sleep drug study; this group receives the real drug

New cards

Control group

the group/subject that did not receive the treatment

ex: sleep drug study; this group receives a placebo (sugar pill) instead of the real drug

New cards

Randomization

randomly assigning participants to treatment or control groups
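A minimal sketch of random assignment, assuming ten hypothetical participant labels (`P1` to `P10`) and a fixed seed so the result is repeatable. The function name `randomly_assign` is made up for illustration.

```python
import random

def randomly_assign(participants, seed=None):
    """Shuffle participants and split them into two groups of (nearly) equal size."""
    rng = random.Random(seed)      # seeded generator so the split can be reproduced
    shuffled = list(participants)  # copy so the original list is left unchanged
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical participant labels
participants = [f"P{i}" for i in range(1, 11)]
treatment, control = randomly_assign(participants, seed=1)
print("treatment:", treatment)
print("control:", control)
```

Because chance alone decides who lands in each group, neither the researcher nor the participant can influence the split.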

New cards

Why is randomization important?

Helps reduce bias by making the treatment and control groups similar on average, so confounding factors are spread evenly between them

New cards

confounding factors

a variable that influences both the explanatory variable (what you’re testing) and the response variable (the outcome), making it hard to tell if the observed effect is real

ex: people who carry lighters have higher rates of lung cancer; CF = smoking

in short: an outside factor that offers another explanation for the results

New cards

placebo effect

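an improvement participants report because they believe they are being treated, even though they received a placebo (a fake treatment with no active ingredient)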

New cards

double blind

neither the participants nor the researchers know who is in the treatment group and who is in the control group

no one knows who got the real treatment vs the placebo until the study ends

New cards

explanatory variable

the variable that is changed/controlled in a study; often manipulated by researchers

New cards

response variable

the outcome variable that is observed/measured

New cards

experimental study

researchers control the treatment and randomly assign participants (treatment vs control) to see whether one variable affects another

ex/key words: randomly, measure, experiments, “effect of..”, “applied/given”

New cards

observational study

researchers observe and collect data without assigning any treatment

ex/key words: study, survey, observe, “association”

New cards

association

Applies to: observational studies and experiments

Def: two variables are related; one doesn’t necessarily affect the other

ex: A and B happen together, but A doesn’t necessarily make B happen; glass is broken and water is spilled

New cards

causation

Applies to: randomized, controlled experiments; CANNOT be concluded from OBSERVATIONAL studies

Def: one variable directly affects another variable

ex: A actually makes B happen (cause and effect); push glass and it’ll break

New cards

confounding variable examples

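Claim: sunscreen increases the risk of skin cancer. A and B are only associated; a confounder can explain both.

CF: genetic predisposition to skin cancer (may influence both sunscreen use and cancer risk)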

Variable A: Using sunscreen

Variable B: risk of skin cancer

New cards

population/population of interest

the entire group the study wants to draw conclusions about

study: average number of hours high school students sleep

ex: all high school students

New cards

sample

the subset of the population that is actually observed

study: average number of hours high school students sleep (150 students)

ex: the 150 students in this study

New cards

representative

a sample whose characteristics reflect the population; choosing the sample (participants in study) at random makes it more likely to be representative

New cards

Identify the population of interest, sample, explanatory and response variables, and types of variables (numerical, ordinal, categorical) in the study:

Average number of hours 150 high school students in the U.S. sleep per night

Population: All high school students in the U.S.

Sample: 150 high school students

Explanatory variable: none; the study only measures sleep and does not compare groups or treatments

Response variable: Number of hours of sleep per night

Types of variables:

Number of hours of sleep: Numerical (quantitative, continuous)

Grade level (if collected): Ordinal (9th, 10th, 11th, 12th)

Gender (if collected): Categorical (male, female, other)

New cards

simple random sampling

every member of the population has an equal chance of being selected
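A short sketch of drawing a simple random sample, assuming a hypothetical population of 1,000 student ID numbers and a sample size of 150 (matching the sleep study on the other cards). The seed is only there to make the draw repeatable.

```python
import random

# Hypothetical population: ID numbers for 1,000 students
population = list(range(1, 1001))

rng = random.Random(42)                   # seeded so the same sample comes out each run
sample = rng.sample(population, k=150)    # draws 150 distinct IDs without replacement

print(len(sample))   # 150
print(sample[:5])    # first few sampled IDs
```

`random.sample` draws without replacement, so no student can be picked twice.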

New cards

convenience sampling

selecting individuals because they are easily accessible/convenient

New cards

selection bias

when the sample (individuals collected) is not representative of the population (the whole group of interest)

New cards

In a data set, row is____, the column is _____?

row = observational unit/case

column = variable

New cards

Numerical variable: includes what two types? (Quantity)

discrete and continuous

Definition: values are numbers that measure or count something

ex: Number of hours students sleep per night — 7.5, 8 hrs

Age of people → 18, 25, 40 years

Height in cm → 160, 175, 182 cm

Test scores → 85, 92, 78

New cards

Categorical variable: consists of what two types? (Quality)

ordinal and nominal

Definition: values are categories/groups, NOT NUMBERS

ex: Gender → male, female, non-binary

Eye color → blue, brown, green

Type of pet → dog, cat, bird

Yes/No responses → yes, no

New cards

ordinal variable def/examples

categorical variable whose categories have a meaningful order

ex: Education level → High School < Bachelor’s < Master’s < PhD

Satisfaction rating → Unsatisfied < Neutral < Satisfied < Very Satisfied

New cards

nominal variable def/ex

categorical variable whose categories have no natural order

ex: Eye color → Blue, Brown, Green

Drinks → Pepsi, Sprite

Type of pet → Dog, Cat, Bird

New cards

discrete def/ex

numerical variable that takes countable, separate values

ex: Number of siblings → 0, 1, 2, 3…

Number of cars in a household → 0, 1, 2…

New cards

continuous def/ex

def: numerical variable that can take any value in a range (measured, not counted)

ex: height, weight

New cards

numerical vs categorical + subsections

Numerical: can be measured or counted (quantity)

discrete: countable values (usually whole numbers); ex: number of siblings
continuous: any value in a range, including fractions and decimals; ex: height, weight

Categorical: categories or groups; no numbers (quality)

nominal: no order; ex: eye color, gender, pet
ordinal: has order; ex: ratings (satisfied, neutral, unsatisfied), grade level (freshman, sophomore, junior, senior)

New cards

Dot plot question examples

fewest total number in data set = lowest value on the x axis; ex: if the axis runs 1-4, one is the lowest
largest total number = highest value on the x axis
most frequently observed total number = the value with the tallest stack of dots; ex: column 3 has the most votes
least frequently observed = the value with the shortest stack of dots

New cards

intervals (x axis)

intervals = the ranges (bins) that the data set is split into

500-1000

1000-1500

New cards

frequency (y axis)

number/count of observations in each interval

500-1000 has 2

1000-1500 has 5

New cards

symmetric vs right vs left vs bell/no bell

symmetric = data clustered in the middle, bell shaped, both sides mirror each other
right skewed = tail on right
left skewed = tail on left
symmetric but not bell = two peaks (two bells) that mirror each other when split in half

New cards

parameters

values calculated from the whole population

ex: average number of hours all high school students sleep

New cards

statistics

values calculated from a sample

ex: average number of hours of sleep for 150 students randomly selected from different schools

New cards

Identify population vs sample vs parameter vs stat example

population: all high school students

sample: 150 students randomly selected

parameter: average hours of sleep for all high school students

statistic: average hours of sleep for the 150 students

New cards

mean formula

x bar = (sum of all values) / (total number of values)

ex: 1, 2, 3

x bar = (1 + 2 + 3) / 3 = 2

New cards

median formula

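middle value once the data are sorted from smallest to largest

odd number of values = single middle value; ex: 1, 2, 3 → median = 2

even number of values = (sum of the two middle values) / 2; ex: 1, 2, 3, 4, 5, 6 → median = (3 + 4) / 2 = 3.5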
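The mean and median rules on these cards can be sketched in a few lines. This is an illustrative version written by hand (Python's built-in `statistics` module offers equivalents); the function names are chosen for this example only.

```python
def mean(values):
    """Sum of all values divided by the number of values."""
    return sum(values) / len(values)

def median(values):
    """Middle value of the sorted data; average of the two middle values if the count is even."""
    ordered = sorted(values)   # the data must be sorted first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:             # odd count: single middle value
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2   # even count: average the middle pair

print(mean([1, 2, 3]))                # 2.0
print(median([1, 2, 3]))              # 2
print(median([1, 2, 3, 4, 5, 6]))     # 3.5
```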

New cards

Reading the mean vs median on a histogram

symmetric histogram: mean ≈ median

right skewed: mean greater than median

left skewed: mean less than median

the mean is pulled toward the tail of the histogram
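To see the mean being pulled toward the tail, here is a small check on two made-up data sets, one with a single large value (right tail) and one with a single small value (left tail). The numbers are hypothetical and chosen only to show the pattern.

```python
import statistics

# Hypothetical data: one extreme value creates the tail
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]        # tail on the right
left_skewed = [1, 17, 18, 18, 19, 19, 19, 20]   # tail on the left

right_mean = statistics.mean(right_skewed)
right_median = statistics.median(right_skewed)
left_mean = statistics.mean(left_skewed)
left_median = statistics.median(left_skewed)

print("right skewed: mean", right_mean, "median", right_median)  # mean is larger
print("left skewed:  mean", left_mean, "median", left_median)    # mean is smaller
```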

New cards

range formula

range = max - min value

New cards

IQR Interquartile Range Formula

IQR = Q3-Q1

New cards

Percentile Ranges: 25, 50, 75

25th percentile = Q1, median of the lower half of the data

50th percentile = median (middle line on boxplot)

75th percentile = Q3, median of the upper half of the data

New cards

Finding IQR Examples

50, 51, 56, 61, 70, 71, 80, 84

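Q1 (median of lower half 50, 51, 56, 61): (51 + 56) / 2 = 53.5

Q3 (median of upper half 70, 71, 80, 84): (71 + 80) / 2 = 75.5

IQR: Q3 - Q1 = 75.5 - 53.5 = 22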
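A sketch of the IQR calculation using the "median of each half" method from these cards. Note that software packages often use other quartile conventions and may report slightly different Q1 and Q3 values; the helper names here are illustrative.

```python
def median(values):
    """Middle value of sorted data (average of the two middle values if the count is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def quartiles(values):
    """Q1 = median of the lower half, Q3 = median of the upper half.
    With an odd count, the overall median is left out of both halves."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return median(lower), median(upper)

data = [50, 51, 56, 61, 70, 71, 80, 84]
q1, q3 = quartiles(data)
iqr = q3 - q1
print(q1, q3, iqr)   # 53.5 75.5 22.0
```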

New cards

Notations for Parameters and Statistic

Parameter

Mean: μ

variance: σ²

SD: σ

proportion: p

Statistics

Mean: x bar

variance: s²

SD: s

proportion: p̂ (p hat)

New cards

deviation formula

Sample deviation: x - x bar

Population deviation: x - μ

New cards

squared deviation formula

(x - x bar)² OR (x - μ)²

New cards

variance formula

measures the average squared deviation from the mean

population variance (σ²) = sum of (x - μ)² / N, where N = population size

sample variance (s²) = sum of (x - x bar)² / (n - 1), where n = sample size

New cards

SD formula; square root variance

Population SD: σ = square root of the population variance

sample SD: s = square root of the sample variance
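The variance and SD cards can be checked with a short hand-written sketch. The only difference between the two versions is the divisor: the number of values for a population, one less than that for a sample. The data set `[1, 2, 3]` is just a small example.

```python
import math

def population_variance(values):
    """Average squared deviation from the mean, dividing by the number of values."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)

def sample_variance(values):
    """Sum of squared deviations from the sample mean, dividing by n - 1."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / (len(values) - 1)

data = [1, 2, 3]   # mean = 2, squared deviations = 1, 0, 1

pop_var = population_variance(data)   # 2 / 3
samp_var = sample_variance(data)      # 2 / 2 = 1.0
pop_sd = math.sqrt(pop_var)           # SD is the square root of the variance
samp_sd = math.sqrt(samp_var)

print(pop_var, samp_var)
print(pop_sd, samp_sd)
```

Python's `statistics.pvariance` and `statistics.variance` compute the same two quantities.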

New cards

proportion formula

(number of cases in the category) / (total number of cases)

ex: 20 out of 50 students in a sample have blue eyes

p hat = 20/50 = 0.4

New cards

Proportion types: observed sample vs population

sample = p hat

population = p

New cards

probability formula

(number of desired outcomes) / (total number of possible outcomes), when all outcomes are equally likely

ex: chance of rolling a 4 on a die with sides 1, 2, 3, 4, 5, 6 = 1/6
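A small sketch of the counting formula for a fair six-sided die, using exact fractions so the answer reads the same way as on the card. The helper name `probability` is made up for this example, and the formula assumes every outcome is equally likely.

```python
from fractions import Fraction

def probability(desired_outcomes, possible_outcomes):
    """Desired outcomes divided by all possible outcomes (equally likely outcomes assumed)."""
    return Fraction(len(desired_outcomes), len(possible_outcomes))

die = [1, 2, 3, 4, 5, 6]

p_four = probability([side for side in die if side == 4], die)
p_even = probability([side for side in die if side % 2 == 0], die)

print(p_four)   # 1/6
print(p_even)   # 1/2
```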

New cards

Probability is always between 0 and 1

0 = impossible, will not occur

0.5 / 50% = equally likely to occur or not occur

1 = certain, will occur

New cards

sensitivity vs specificity

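sensitivity = true positive rate: P(test positive | has the condition)

specificity = true negative rate: P(test negative | does not have the condition)

specificity = 1 - false positive rate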

New cards

base rate def

proportion of the population that has the condition or characteristic of interest
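A sketch that computes sensitivity, specificity, and the base rate from test results. The counts below are entirely hypothetical (1,000 people, 100 of whom have the condition) and the function names are chosen for illustration.

```python
def sensitivity(true_pos, false_neg):
    """Share of people WITH the condition who test positive (true positive rate)."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Share of people WITHOUT the condition who test negative (true negative rate)."""
    return true_neg / (true_neg + false_pos)

# Hypothetical screening results for 1,000 people
true_pos, false_neg = 90, 10     # the 100 people who have the condition
true_neg, false_pos = 855, 45    # the 900 people who do not

total = true_pos + false_neg + true_neg + false_pos
base_rate = (true_pos + false_neg) / total   # proportion of the population with the condition

sens = sensitivity(true_pos, false_neg)
spec = specificity(true_neg, false_pos)
false_positive_rate = false_pos / (true_neg + false_pos)

print("base rate:", base_rate)
print("sensitivity:", sens)
print("specificity:", spec)
print("1 - false positive rate:", 1 - false_positive_rate)
```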

New cards

independence

two variables A and B are independent if knowing A tells you nothing about B (A is not affected by B, and B is not affected by A)

New cards

null hypothesis (Ho)

no change, difference, or relationship between variables

if the null is true, any observed difference is due to chance

null value = 0 (no difference between groups), meaning the variables are independent

New cards

alternative hypothesis (Ha)

there is a change, a difference, or a relationship between variables

the observed difference is not due to chance alone

New cards

observed difference

p hat (group A) - p hat (group B): the proportion observed in group A minus the proportion observed in group B

New cards

p-value

the probability, calculated under the null model, of getting a result at least as extreme as the observed one if the null hypothesis is true

helps to decide whether to reject the null hypothesis
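One way to picture a p-value is a shuffle simulation: if the null hypothesis is true, group labels do not matter, so the outcomes can be reshuffled between groups many times to see how often chance alone produces a difference as large as the observed one. The sketch below assumes hypothetical yes/no data (1 = success, 0 = failure), a two-sided comparison, and 5,000 shuffles; none of these choices come from the cards.

```python
import random

def observed_difference(group_a, group_b):
    """Proportion of successes in group A minus the proportion in group B."""
    return sum(group_a) / len(group_a) - sum(group_b) / len(group_b)

def simulated_p_value(group_a, group_b, n_shuffles=5000, seed=0):
    """Fraction of shuffles whose difference is at least as extreme as the observed one."""
    rng = random.Random(seed)                  # seeded so the result is repeatable
    observed = observed_difference(group_a, group_b)
    pooled = list(group_a) + list(group_b)     # under the null, group labels are arbitrary
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        diff = observed_difference(pooled[:n_a], pooled[n_a:])
        # two-sided comparison, with a tiny tolerance for floating-point rounding
        if abs(diff) >= abs(observed) - 1e-12:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_shuffles

# Hypothetical results: 18 of 20 successes with treatment, 8 of 20 with control
treatment = [1] * 18 + [0] * 2
control = [1] * 8 + [0] * 12

difference = observed_difference(treatment, control)
p_value = simulated_p_value(treatment, control)
print("observed difference:", difference)
print("simulated p-value:", p_value)
```

A small simulated p-value means a difference this large rarely appears by chance alone, which counts as evidence against the null hypothesis.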

New cards

P-Value Chart

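greater than 0.10 = little evidence against the null hypothesis

between 0.05 and 0.10 = some evidence against the null hypothesis

between 0.01 and 0.05 = strong evidence against the null hypothesis

between 0.001 and 0.01 = very strong evidence against the null hypothesis

less than 0.001 = extremely strong evidence against the null hypothesis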