Stats Exam 1

flashcard set

Description and Tags

vocab and formulas + examples

60 Terms

1

Treatment group

the group/subject that received the treatment

ex: sleep-drug study; this group receives the real drug

2

Control group

the group/subject that did not receive the treatment

ex: sleep-drug study; this group receives a placebo (sugar pill) instead of the real drug

3

Randomization

randomly assigning participants to treatment or control groups

4

Why is randomization important?

Helps reduce bias: other factors (known and unknown) tend to balance out between the treatment and control groups
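
A minimal Python sketch of random assignment. The 10 participant IDs are hypothetical, and the fixed seed is there only so the run is repeatable:

```python
import random

# Hypothetical participant IDs (not from the card set)
participants = list(range(1, 11))

rng = random.Random(42)      # fixed seed only so the example is repeatable
shuffled = participants[:]   # copy so the original list is untouched
rng.shuffle(shuffled)

# First half -> treatment group, second half -> control group
half = len(shuffled) // 2
treatment = shuffled[:half]
control = shuffled[half:]

print("treatment:", sorted(treatment))
print("control:  ", sorted(control))
```

Because chance alone decides who lands in each group, neither the researcher nor the participant can steer the assignment.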

5

confounding factors

a variable that influences both the independent variable (what you’re testing) and the dependent variable (the outcome), making it hard to tell if the observed effect is real

ex: people who carry lighters have higher rates of lung cancer; CF = smoking (smokers carry lighters, and smoking causes lung cancer)

basically, an outside factor that could explain the results of the study

6

placebo effect

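an improvement participants experience because they believe they are being treated, even though they received a placebo (a fake treatment with no actual effects)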

7

double blind

neither the participants nor the researchers know who is in the treatment group and who is in the control group

no one knows who got the actual treatment vs the placebo until the end of the study

8

explanatory variable

the variable that is changed/controlled in a study and is thought to affect the response; in experiments it is manipulated by researchers

9

response variable

the outcome variable that is measured/observed

10

experimental study

researchers randomly assign participants to groups (treatment vs control) and control the explanatory variable, so its effect on the response variable can be measured

ex/key words: randomly, measure, experiments, “effect of..”, “applied/given”

11

observational study

researchers observe and collect data without assigning any treatment

ex/key words: study, survey, observe, “association”

12

association

Applies to: observational studies and experiments

Def: two variables are related (tend to occur together), but one doesn’t necessarily affect the other

ex: A and B happen together, but A doesn’t necessarily make B happen; the glass is broken and the water is spilled

13

causation

Applies to: randomized, controlled experiments; CANNOT be concluded from an OBSERVATIONAL study

Def: one variable directly affects another variable

ex: A actually makes B happen (cause and effect); push the glass and it’ll break

14

confounding variable examples

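ex: sunscreen use appears to increase the risk of skin cancer; A and B are associated, but A does not necessarily cause B

CF: genetic risk of skin cancer (it can influence both sunscreen use and cancer risk)

Variable A: using sunscreen

Variable B: risk of skin cancer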

15

population/population of interest

the entire group the study wants to draw conclusions about

study: average number of hours high school students sleep

ex: all high school students

16

sample

the part of the population that is actually observed/measured

study: average number of hours high school students sleep (150 students)

ex: the 150 students in this study

17

representative

a sample whose characteristics match those of the population; choosing the sample (participants in the study) at random makes it more likely to be representative

18

Identify the population, sample, explanatory and response variables, and types of variables (numerical, ordinal, categorical) in the study:

Average number of hours 150 high school students in the U.S. sleep per night

Population: All high school students in the U.S.

Sample: 150 high school students

Explanatory variable: none; the study is observational and measures a single variable.

Response variable: Number of hours of sleep per night

Types of variables:

Number of hours of sleep: Numerical (quantitative, continuous)

Grade level (if collected): Ordinal (9th, 10th, 11th, 12th)

Gender (if collected): Categorical (male, female, other)

19

simple random sampling 

every member of the population has an equal chance of being selected
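
A small Python sketch of drawing a simple random sample. The population of 1,000 student ID numbers is hypothetical, and the seed is fixed only for repeatability:

```python
import random

# Hypothetical population: 1,000 student ID numbers
population = list(range(1, 1001))

rng = random.Random(0)                   # fixed seed only so the run is repeatable
sample = rng.sample(population, k=150)   # 150 IDs drawn without replacement

print(len(sample), "students selected")
```

Every ID has the same chance of ending up in the sample, which is what separates this from convenience sampling.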

20

convenience sampling

individuals are selected because they are easily accessible/convenient

21

selection bias

when the sample (the individuals collected) is not representative of the population (the group being studied)

22

In a data set, the row is ____, the column is ____?

row = observational unit/case

column = variable
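
A tiny Python sketch of the row/column idea. The three students and their values are hypothetical:

```python
# Hypothetical data set: each dict is one row (one observational unit = one student)
data = [
    {"student": "A", "hours_sleep": 7.5, "grade": "9th"},
    {"student": "B", "hours_sleep": 8.0, "grade": "11th"},
    {"student": "C", "hours_sleep": 6.0, "grade": "12th"},
]

n_cases = len(data)                                   # number of rows (cases)
variables = list(data[0].keys())                      # column names (variables)
hours_column = [row["hours_sleep"] for row in data]   # one variable across all cases

print(n_cases, variables, hours_column)
```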

23

Numerical variable — include what two types? (Quantity)

discrete and continuous 

Definition: measure/record numerical data

ex: Number of hours students sleep per night → 7.5, 8 hrs

Age of people → 18, 25, 40 years

Height in cm → 160, 175, 182 cm

Test scores → 85, 92, 78

24

Categorical variable — consist of what two types? (Quality)

ordinal and nominal

Definition: represents categories/groups, NOT numerical quantities

ex: Gender → male, female, non-binary

Eye color → blue, brown, green

Type of pet → dog, cat, bird

Yes/No responses → yes, no

25

ordinal variable def/examples

meaningful order

ex: Education level → High School < Bachelor’s < Master’s < PhD

Satisfaction rating → Unsatisfied < Neutral < Satisfied < Very Satisfied

26

nominal variable def/ex

no natural order

ex: Eye color → Blue, Brown, Green

Drinks → Pepsi, Sprite

Type of pet → Dog, Cat, Bird

27

discrete def/ex

numerical value that is countable/separate values 

ex: Number of siblings → 0, 1, 2, 3…

Number of cars in a household → 0, 1, 2…

28

continuous def/ex

def: numerical value that can take any value within a range (measured, not counted)

ex: height, weight 

29

numerical vs categorical + subsections

Numerical: can be measured or counted (quantity)

  • discrete: countable numbers (whole numbers); ex: number of siblings

  • continuous: any value within a range, including fractions and decimals; ex: height, weight

Categorical: category or group; no numbers (quality)

  • nominal: no order; ex: eye color, gender, pet

  • ordinal: has order; ex: ratings (satisfied, neutral, unsatisfied), grade level (freshman, sophomore, junior, senior)

30

Dot plot question examples

  1. smallest value in the data set = lowest value on the axis that has a dot; ex: axis runs 1-4, so 1 is the lowest

  2. largest value in the data set = highest value on the axis that has a dot

  3. most frequently observed value = the value with the tallest stack of dots; ex: column 3 has the most votes

  4. least frequently observed value = the value with the shortest stack of dots

31

intervals (x axis)

intervals (bins) = the ranges of values the data are grouped into

ex: 500-1000

1000-1500

32

frequency (y axis)

number of observations (count) in each interval

ex: 500-1000 has 2

1000-1500 has 5
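
A short Python sketch of counting frequencies per interval. The seven data values are hypothetical, picked so the counts match the card (2 and 5), and treating each interval as "includes the lower edge, excludes the upper edge" is an assumption:

```python
# Hypothetical data values, chosen so the counts match the card
values = [620, 880, 1010, 1100, 1250, 1300, 1490]

# Intervals on the x axis; assumed to include the lower edge but not the upper edge
bins = [(500, 1000), (1000, 1500)]

# Frequency (y axis) = how many values fall in each interval
frequency = {
    f"{lo}-{hi}": sum(lo <= v < hi for v in values)
    for lo, hi in bins
}

print(frequency)  # {'500-1000': 2, '1000-1500': 5}
```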

33

symmetric vs right vs left vs bell/no bell

  1. symmetric = left and right halves mirror each other; bell-shaped when the data cluster in the middle

  2. right skewed = tail on right

  3. left skewed = tail on left

  4. symmetric but not bell-shaped = two peaks (bimodal), mirrored when split in half

34

parameters

values calculated from population

ex: average number of hours all students sleep

35

statistics

values calculated from samples

ex: average hours of sleep for 150 students randomly selected from different schools

36

Identify population vs sample vs parameter vs stat example

population: all high school students

sample: 150 students randomly selected

parameter: average hours for all students

statistic: average hours for the 150 students

37

mean formula

x bar = (sum of all values)/(total number of values)

ex: 1, 2, 3

x bar = (1 + 2 + 3)/3 = 2

38

median formula

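middle value of the data after sorting from smallest to largest

odd number of values = single middle value; ex: 1, 2, 3, median = 2

even number of values = (sum of the two middle values)/2; ex: 1, 2, 3, 4, 5, 6, median = (3 + 4)/2 = 3.5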
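
The mean and median examples from these cards can be checked with Python's standard `statistics` module (a quick sketch, not part of the original card set):

```python
from statistics import mean, median

print(mean([1, 2, 3]))               # 2
print(median([1, 2, 3]))             # 2   (odd count: single middle value)
print(median([1, 2, 3, 4, 5, 6]))    # 3.5 (even count: (3 + 4) / 2)
print(median([3, 1, 2]))             # 2   (values are sorted before the middle is taken)
```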

39

Reading the mean vs median on a histogram

symmetric histogram: mean ≈ median

right skewed: mean greater than median

left skewed: mean less than median

  • the mean is pulled toward the tail of the histogram
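
A quick Python check of how skew pulls the mean toward the tail. Both data sets are hypothetical and were made up to have one extreme value each:

```python
from statistics import mean, median

# Hypothetical right-skewed data: one large value forms a tail on the right
right_skewed = [1, 2, 2, 3, 3, 4, 20]
# Hypothetical left-skewed data: one small value forms a tail on the left
left_skewed = [1, 17, 18, 18, 19, 19, 20]

print(mean(right_skewed), median(right_skewed))   # 5 3   -> mean > median
print(mean(left_skewed), median(left_skewed))     # 16 18 -> mean < median
```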

40

range formula

range = max - min value

41

IQR Interquartile Range Formula

IQR = Q3-Q1

42

Percentiles: 25, 50, 75

25 = median of the lower half of the data, Q1

50 = median (middle line on boxplot)

75 = median of the upper half of the data, Q3

43

Finding IQR Examples

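50, 51, 56, 61, 70, 71, 80, 84

Q1 (median of lower half 50, 51, 56, 61): (51 + 56)/2 = 53.5

Q3 (median of upper half 70, 71, 80, 84): (71 + 80)/2 = 75.5

IQR: Q3 - Q1 = 75.5 - 53.5 = 22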
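
A Python sketch of the same calculation, using the "median of each half" rule from the cards. Statistical software may use other quartile conventions, so a library function can give slightly different Q1 and Q3 values:

```python
from statistics import median

data = sorted([50, 51, 56, 61, 70, 71, 80, 84])
n = len(data)

lower = data[: n // 2]           # lower half of the sorted data
upper = data[(n + 1) // 2 :]     # upper half (skips the middle value when n is odd)

q1 = median(lower)               # (51 + 56) / 2
q3 = median(upper)               # (71 + 80) / 2
iqr = q3 - q1
data_range = max(data) - min(data)   # range = max - min

print(q1, q3, iqr, data_range)   # 53.5 75.5 22.0 34
```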

44

Notations for Parameters and Statistic 

Parameter

Mean: μ

variance: σ²

SD: σ

proportion: p

Statistics

Mean: x bar

variance: s²

SD: s

proportion: p hat

45

deviation formula

Sample deviation: x - x bar

Population deviation: x - μ

46

squared deviation formula

(x - x bar)² OR (x - μ)²

47

variance formula

measures the average squared deviation from the mean

population variance (σ²) = [sum of (x - μ)²]/N, where N = population size

sample variance (s²) = [sum of (x - x bar)²]/(n - 1), where n = sample size

48

SD formula: square root of the variance

Population SD: σ = square root of the population variance

sample SD: s = square root of the sample variance
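
A worked Python sketch of deviation, squared deviation, variance, and SD. The eight data values are hypothetical, chosen so the mean comes out to a whole number:

```python
import math

# Hypothetical data values (not from the card set)
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)

mean = sum(data) / n                              # 40 / 8 = 5.0
deviations = [x - mean for x in data]             # x - mean
squared_devs = [d ** 2 for d in deviations]       # (x - mean)^2, sums to 32

pop_var = sum(squared_devs) / n                   # treat data as a whole population: divide by n
samp_var = sum(squared_devs) / (n - 1)            # treat data as a sample: divide by n - 1

pop_sd = math.sqrt(pop_var)                       # SD = square root of variance
samp_sd = math.sqrt(samp_var)

print(pop_var, pop_sd)                            # 4.0 2.0
print(round(samp_var, 3), round(samp_sd, 3))      # 4.571 2.138
```

The only difference between the two versions is the divisor: n for a population, n - 1 for a sample.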

49

proportion formula

(number of cases in the category)/(total number of cases)

ex: 20 out of 50 students have blue eyes

p hat = 20/50 = 0.4

50

Proportion types: observed sample vs population

sample = p hat

population = p

51

probability formula

(number of desired outcomes)/(total number of possible outcomes), when all outcomes are equally likely

ex: chance of rolling a 4 on a six-sided die (outcomes 1, 2, 3, 4, 5, 6) = 1/6
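
The die example as a short Python sketch, assuming a fair six-sided die so that every outcome is equally likely:

```python
from fractions import Fraction

outcomes = [1, 2, 3, 4, 5, 6]                 # all possible outcomes of one roll
desired = [o for o in outcomes if o == 4]     # outcomes that count as "rolling a 4"

probability = Fraction(len(desired), len(outcomes))

print(probability)          # 1/6
print(float(probability))   # about 0.167
```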

52

Probability is always between 0 and 1

0 = impossible (will not occur)

0.5 / 50% = equally likely to occur or not occur

1 = certain (will occur)

53

sensitivity vs specificity 

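sensitivity = true positive rate: proportion of people who have the condition and test positive

specificity = true negative rate: proportion of people who do not have the condition and test negative

specificity = 1 - false positive rate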

54

base rate def

proportion of the population that has the condition or characteristic of interest
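
A Python sketch tying together sensitivity, specificity, and base rate. The screening-test counts for 1,000 people are hypothetical:

```python
# Hypothetical screening-test results for 1,000 people (not from the card set)
true_pos = 90     # have the condition, test positive
false_neg = 10    # have the condition, test negative
false_pos = 45    # do not have the condition, test positive
true_neg = 855    # do not have the condition, test negative

total = true_pos + false_neg + false_pos + true_neg
have_condition = true_pos + false_neg
no_condition = false_pos + true_neg

sensitivity = true_pos / have_condition          # true positive rate
specificity = true_neg / no_condition            # true negative rate
false_pos_rate = false_pos / no_condition        # equals 1 - specificity
base_rate = have_condition / total               # share of the population with the condition

print(sensitivity, specificity, base_rate)       # 0.9 0.95 0.1
```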

55

independent conclusion 

two variables A and B are independent if A is not affected by (not associated with) B; knowing one tells you nothing about the other

56

null hypothesis (Ho)

no change, difference, or relationship between variables

if the null is true = any observed difference is due to chance

null value = zero (a difference of zero means the variables are independent)

57

alternative hypothesis (Ha)

there is a change, a difference, or a relationship between variables

= the observed difference is not due to chance alone

58

observed difference

p hat (group A) - p hat (group B), the difference between the two sample proportions
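
A small Python sketch of an observed difference between two sample proportions. The group counts are hypothetical:

```python
# Hypothetical counts (not from the card set)
successes_a, n_a = 30, 50    # group A: 30 of 50
successes_b, n_b = 20, 50    # group B: 20 of 50

p_hat_a = successes_a / n_a  # 0.6
p_hat_b = successes_b / n_b  # 0.4

observed_difference = p_hat_a - p_hat_b

print(round(observed_difference, 2))  # 0.2
```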

59

p-value

the probability, calculated from a null model (assuming Ho is true), of getting a result at least as extreme as the one observed

  • helps to decide whether to reject the null hypothesis
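
One way to build a null model is simulation: shuffle the group labels many times and see how often chance alone gives a difference at least as large as the observed one. The sketch below assumes hypothetical success/failure data for two groups of 20, a one-sided comparison, and 10,000 shuffles; it illustrates the idea rather than the method required by any particular course:

```python
import random

# Hypothetical outcomes (1 = success, 0 = failure); not data from the card set
group_a = [1] * 14 + [0] * 6     # 14 of 20 successes
group_b = [1] * 7 + [0] * 13     # 7 of 20 successes

def diff_in_proportions(a, b):
    """Sample proportion of a minus sample proportion of b."""
    return sum(a) / len(a) - sum(b) / len(b)

observed = diff_in_proportions(group_a, group_b)   # 0.70 - 0.35

rng = random.Random(1)           # fixed seed only so the run is repeatable
pooled = group_a + group_b
n_sims = 10_000
at_least_as_extreme = 0

for _ in range(n_sims):
    rng.shuffle(pooled)          # null model: group labels make no difference
    simulated = diff_in_proportions(pooled[:20], pooled[20:])
    if simulated >= observed - 1e-12:   # one-sided; tolerance guards float round-off
        at_least_as_extreme += 1

p_value = at_least_as_extreme / n_sims

print("observed difference:", round(observed, 2))
print("simulated p-value:", p_value)
```

A small p-value means a difference this large rarely happens by chance under Ho, which counts as evidence against the null hypothesis.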

60

P-Value Chart

greater than 0.10 = little evidence against Ho

between 0.05 and 0.10 = some evidence against Ho

between 0.01 and 0.05 = strong evidence against Ho

between 0.001 and 0.01 = very strong evidence against Ho

less than 0.001 = extremely strong evidence against Ho
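
The chart can be written as a small Python lookup. The function name is hypothetical, and the handling of values that fall exactly on a cutoff (for example 0.05) is an assumption, since the chart does not say which side they belong to:

```python
def evidence_against_null(p_value):
    """Return the strength of evidence against Ho for a given p-value."""
    if not 0 <= p_value <= 1:
        raise ValueError("a p-value must be between 0 and 1")
    if p_value > 0.10:
        return "little"
    if p_value > 0.05:        # assumption: exactly 0.05 falls in the next row down
        return "some"
    if p_value > 0.01:
        return "strong"
    if p_value > 0.001:
        return "very strong"
    return "extremely strong"

print(evidence_against_null(0.03))   # strong
```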