statistics 121 test 1

103 Terms

1
population
the entire group of individuals that is the target of our interest; generally too big to actually measure or observe
2
sample
subgroup of the population which we can examine or observe, measure and collect data from
3
individual
single entity that is being observed
4
variable
characteristic measured on each individual
5
quantitative variable
variable whose possible values are meaningful numbers
6
categorical variable
variable whose possible responses are non-quantitative categories (words/labels/attributes)
7
measurement
value of a variable for an individual
8
data
measurements for a set of individuals (Goal of Statistics: convert this to useful information)
9
data set
data identified with contextual information (who was observed, what was measured, why the study was done), often given in a table
10
EDA (exploratory data analysis) goals
- organize and summarize data
- discover features, patterns and striking deviations
- interpret patterns in context
- include visual displays and numerical values
11
single variable pattern
distribution of a variable: summary of data one variable at a time (all the possible values and how often they occur)
12
process of statistical problem solving
1. Collect data
2. Summarize data
3. Interpret data
13
parameter
numerical fact about the variable in the population
14
statistic
numerical fact about the variable in the sample
15
convenience sampling
select individuals in the easiest possible way
16
volunteer response sampling
individuals select themselves
17
quota sampling
force the sample to meet specified quotas
18
simple random sample (SRS)
every possible set of a specified size has an equal chance of being selected
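A minimal Python sketch of drawing an SRS, assuming a hypothetical population of 20 numbered individuals. The names `population` and `srs` and the seed value are illustrative only. `random.sample` draws without replacement, so every possible set of the requested size is equally likely.

```python
import random

# Hypothetical population: individuals labeled 1..20 (illustrative only).
population = list(range(1, 21))

# Fixed seed so the draw can be reproduced; any seed would do.
rng = random.Random(121)

# Draw an SRS of size 5 without replacement:
# every possible set of 5 individuals has the same chance of selection.
srs = rng.sample(population, 5)
print(srs)
```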
19
cluster sampling
a random sample of clusters is taken and all individuals in selected clusters are included in sample
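The two steps of cluster sampling can be sketched in Python as follows. The clusters (classrooms) and member labels are made up for illustration, and choosing 2 of 4 clusters is an arbitrary assumption.

```python
import random

# Hypothetical population already grouped into clusters (e.g. classrooms).
clusters = {
    "room_A": ["a1", "a2", "a3"],
    "room_B": ["b1", "b2"],
    "room_C": ["c1", "c2", "c3", "c4"],
    "room_D": ["d1", "d2", "d3"],
}

rng = random.Random(121)  # fixed seed for a repeatable draw

# Step 1: take a random sample of clusters (here 2 of the 4).
chosen = rng.sample(sorted(clusters), 2)

# Step 2: include every individual from each chosen cluster.
cluster_sample = [person for name in chosen for person in clusters[name]]
print(chosen, cluster_sample)
```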
20
stratified random sample
select a random sample (SRS) from each stratum and combine these SRSs together
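A short Python sketch of a stratified random sample, assuming made-up strata (class years) and an assumed sample size of 2 per stratum. Unlike cluster sampling, every stratum contributes some individuals, but not all of them.

```python
import random

# Hypothetical strata: individuals grouped by class year.
strata = {
    "freshman": ["f1", "f2", "f3", "f4", "f5"],
    "sophomore": ["s1", "s2", "s3", "s4"],
    "junior": ["j1", "j2", "j3", "j4", "j5", "j6"],
}

rng = random.Random(121)  # fixed seed for a repeatable draw
per_stratum = 2           # assumed SRS size within each stratum

# Take an SRS inside every stratum, then combine the SRSs.
stratified_sample = []
for name, members in strata.items():
    stratified_sample.extend(rng.sample(members, per_stratum))
print(stratified_sample)
```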
21
multi-stage sample
take a sample at each hierarchical level of the population
22
treatment
the condition applied to a subject in an experiment (one of the subcategories/values of the explanatory variable)
23
lurking variables
variables that affect both the explanatory and response variables but are not measured or included as a planned factor in the study
24
control
an effort to reduce the effects of lurking variables
25
confounding
situation in which effects of lurking variables cannot be distinguished from effects of factors
26
historical comparison experiments
study involving only one treatment, where treated subjects are compared to untreated subjects from some external source
27
unreplicated experiments
only one subject is assigned to each treatment
28
confounded experiments
treatment groups are handled differently in some way OTHER than the treatment
29
undercoverage
some individuals have no possibility of being selected
30
non-response
some selected individuals choose not to be in the sample because they refuse to provide information or cannot be contacted
31
misleading response
people lie or give inaccurate answers (often about sensitive issues)
32
interviewer effect
person asking questions influences responses (for in-person/phone surveys)
33
question order effect
the order in which questions are asked promotes certain responses
34
question wording
the way a question is asked leads, misleads, or confuses respondents
35
open questions
allow for almost unlimited possible responses (short answer), less restrictive but more difficult to analyze
36
closed questions
limit response options (multiple choice); easier to analyze but may be biased by the options provided. Should include an "other/unsure" option
37
observational studies
individuals are not assigned to treatments (they are self-selected); causation cannot be concluded
38
experiment
study where individuals are assigned to treatments; causation can be concluded if the experiment is valid
39
subject
individual to which treatment is applied
40
response variable
characteristic measured on each subject; outcome of interest
41
explanatory variable
characteristic/measurement that is used to predict or explain changes in the response variable; variable we think could help us know about the response (measured earlier or more easily); independent variable
42
factor
planned explanatory variable
43
comparison
two or more groups; controls lurking variables by including comparison treatments
44
randomization
randomly assign subjects to groups; neutralizes effects of lurking variables by assigning subjects to treatments using a random device
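One way to carry out random assignment with a "random device" is a seeded shuffle, sketched below in Python. The subject labels, the two treatment names, and the equal group sizes are all assumptions made for illustration.

```python
import random

# Hypothetical subjects and treatments (illustrative names only).
subjects = ["s1", "s2", "s3", "s4", "s5", "s6", "s7", "s8"]
treatments = ["treatment", "placebo"]

rng = random.Random(121)  # the random device; seed fixed for repeatability

# Shuffle a copy of the subject list so the order is random.
shuffled = subjects[:]
rng.shuffle(shuffled)

# Deal the shuffled subjects out to the treatments in turn,
# giving equal-sized groups (4 subjects per treatment here).
assignment = {t: shuffled[i::len(treatments)] for i, t in enumerate(treatments)}
print(assignment)
```

Because chance alone decides who lands in each group, lurking variables tend to balance out across the groups rather than line up with one treatment.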
45
replication
two or more subjects in each group; assign more than one subject to each treatment to detect important effects
46
double blinding
neither subjects nor the researchers in direct contact with the subjects know which treatment is received
47
placebo effect
favorable response of a human subject to a placebo because of trust in the medical provider or belief that the treatment will work
48
diagnostic bias
diagnosis of subjects is biased by preconceived notions about the effectiveness of the treatment (person administering treatments expects certain responses)
49
lack of realism
realism is compromised by the conditions of the study
50
Hawthorne effect
people in an experiment behave differently than they normally would, so results do not reflect real life
51
non-compliance
subjects fail to submit to the assigned treatment or refuse to follow the protocol of the experiment
52
principles of data ethics
- safety and well-being of the subjects must be protected
- all individuals must give their informed consent before data are collected
- individual data must be kept confidential
53
randomized controlled experiment
randomly assign subjects to treatments, grouped by treatment
54
randomized block design
randomly assign to treatments within blocks, grouped by treatment or by block
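A hedged Python sketch of randomizing within blocks. The two blocks, their members, and the treatment names are hypothetical. The key point is that the shuffle happens separately inside each block, so every block contains every treatment.

```python
import random

# Hypothetical blocks: subjects grouped by a blocking variable.
blocks = {
    "block_1": ["a1", "a2", "a3", "a4"],
    "block_2": ["b1", "b2", "b3", "b4"],
}
treatments = ["treatment_A", "treatment_B"]

rng = random.Random(121)  # fixed seed for a repeatable assignment

design = {}
for block, members in blocks.items():
    # Randomize separately within this block.
    shuffled = members[:]
    rng.shuffle(shuffled)
    # Deal the shuffled block members out to the treatments in turn.
    for i, subject in enumerate(shuffled):
        design[subject] = (block, treatments[i % len(treatments)])

for subject, (block, treatment) in sorted(design.items()):
    print(subject, block, treatment)
```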
55
benefits of randomized block design (RBD)
- removes confounding of lurking variables
- reduces chance variation by removing variation associated with the blocking variable
- yields more precise estimates of chance variation
56
matched pairs
two treatments; matched individuals or two measurements per subject
57
three principles of experiments (applied to matched pairs)
- randomization: randomly assign the two treatments to the two individuals in a pair, or randomize the order of treatment application for each individual
- replication: the number of pairs
- comparison: compare the two treatments
58
analysis of distribution of quantitative data
- always plot data first
- look for an overall pattern and for striking deviations
- look at shape, center, spread of distribution
- add numerical summaries to supplement graph
- if pattern is regular, use mathematical model to describe data
59
symmetric and bell shaped distribution examples
blood pressure, IQ, biological factors
60
symmetric and bell shaped distribution
mean, median, and mode are approximately equal
61
right skewed distribution
concentration of data on left, tail extends to the right; mean > median
62
right skewed distribution examples
salary, home price, children, economic variables
63
left skewed distribution
concentration of data on right and the tail on the left; median > mean
64
left skewed distribution examples
test scores, Olympic high jump
65
bimodal distribution
a distribution with two modes
66
bimodal distribution examples
speed limits, restaurant patrons
67
flat or uniform distribution
frequencies are roughly equal across all values on the graph
68
flat or uniform distribution examples
rolling a die, day of the month born
69
center
typical, middle value; half of data to each side
70
spread
consistency/inconsistency of data; look for maximum and minimum
71
outliers
values that are far outside most of data
- is data point miscoded?
- unusual conditions?
- should data point be excluded?
72
mode
most frequently occurring score, corresponds to a peak
73
median
the middle score in a distribution; half the scores are above it and half are below it
74
mean
center of gravity; the arithmetic average of a distribution, obtained by adding the scores and then dividing by the number of scores
75
mean vs median
- construct graph to evaluate skewness and outliers
- use median if distribution is markedly skewed or outliers are present
- use mean if distribution is roughly symmetric
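A small Python illustration of why the median is preferred for skewed data, using two made-up data sets that differ only in their largest value. The standard-library `statistics` module supplies `mean` and `median`.

```python
from statistics import mean, median

# Hypothetical data: identical except for the largest value.
symmetric = [2, 4, 6, 8, 10]
right_skewed = [2, 4, 6, 8, 100]  # one large value stretches the right tail

# Symmetric data: mean and median agree.
print(mean(symmetric), median(symmetric))        # mean 6, median 6

# Skewed data: the outlier pulls the mean up, the median does not move.
print(mean(right_skewed), median(right_skewed))  # mean 24, median 6
```

In the skewed set the mean exceeds the median, matching the right-skew pattern mean > median.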
76
range
maximum - minimum
77
interquartile range (IQR)
the difference between the third and first quartiles (Q3 - Q1)
78
standard deviation
typical distance of values from the mean (the square root of the average squared deviation)
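A worked Python sketch of the sample standard deviation on made-up data, assuming the usual sample formula with an n - 1 divisor. The hand computation is checked against `statistics.stdev`, which uses the same formula.

```python
import math
from statistics import mean, stdev

# Hypothetical sample of 8 measurements.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Step 1: the sample mean (40 / 8 = 5).
xbar = mean(data)

# Step 2: squared deviations from the mean (they sum to 32).
squared_devs = [(x - xbar) ** 2 for x in data]

# Step 3: divide by n - 1 and take the square root: sqrt(32 / 7).
s = math.sqrt(sum(squared_devs) / (len(data) - 1))

print(round(s, 3), round(stdev(data), 3))
```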
79
first quartile (Q1)
a number for which 25% of the data is less than that number; same as the median of the data which are less than the overall median
80
second quartile (Q2)
median
81
third quartile (Q3)
a number for which 75% of the data is less than that number; same as the median of the part of the data which is greater than the median
82
5 number summary vs 2 number summary
use the 5 number summary for skewed distributions and the 2 number summary (mean and standard deviation) for symmetric distributions
83
5 number summary
minimum, Q1, median, Q3, maximum
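A Python sketch of the five number summary, assuming the "median of each half" rule for quartiles (Q1 is the median of the values below the overall median, Q3 the median of the values above it). The function name `five_number_summary` and the scores are hypothetical, and the function expects at least two values.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using median-of-halves quartiles."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]           # values below the median position
    upper = data[half + n % 2:]   # values above it (skips the middle value if n is odd)
    return (data[0], median(lower), median(data), median(upper), data[-1])

# Hypothetical scores.
scores = [1, 3, 4, 6, 7, 8, 10, 12, 15]
summary = five_number_summary(scores)
minimum, q1, med, q3, maximum = summary

iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
print(summary, iqr, value_range)
```

Quartile conventions vary between textbooks and software, so other tools (for example `statistics.quantiles` with its default method) may report slightly different Q1 and Q3 for the same data.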
84
random phenomenon
individual outcome unpredictable, but outcomes from large number of repetitions follow regular pattern
85
sample space
the set of all possible outcomes
86
event
a collection of possible outcomes
87
probability of an outcome
The proportion of times that an outcome occurs in many, many repetitions of the random phenomenon
88
probability rules
- 0 ≤ P(A) ≤ 1 for any event A
89
theoretical probability
number of favorable outcomes divided by total number of possible outcomes (when all outcomes are equally likely)
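As an illustration in Python, the theoretical probability that two fair dice sum to 7 can be found by listing the sample space and counting the favorable outcomes. The dice example is an assumption chosen for illustration, and it relies on all 36 outcomes being equally likely.

```python
from fractions import Fraction
from itertools import product

# Sample space: all ordered pairs from rolling two fair six-sided dice.
sample_space = list(product(range(1, 7), repeat=2))  # 36 outcomes

# Event: the two dice sum to 7.
event = [outcome for outcome in sample_space if sum(outcome) == 7]

# Theoretical probability = favorable outcomes / possible outcomes.
p_sum_seven = Fraction(len(event), len(sample_space))
print(len(event), len(sample_space), p_sum_seven)  # 6 36 1/6
```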
90
empirical probability
number of times the outcome occurs divided by the total number of repetitions
91
law of large numbers
As the number of repetitions of a probability experiment increases, the proportion with which a certain outcome is observed gets closer to the theoretical probability of the outcome
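The law of large numbers can be seen in a quick simulation. This Python sketch assumes a fair six-sided die simulated with `random.randint`. The helper name `proportion_of_sixes`, the seed, and the roll counts are illustrative choices.

```python
import random

rng = random.Random(121)  # fixed seed so the simulation is repeatable
theoretical = 1 / 6       # theoretical probability of rolling a six

def proportion_of_sixes(n_rolls):
    """Roll a simulated fair die n_rolls times; return the proportion of sixes."""
    rolls = [rng.randint(1, 6) for _ in range(n_rolls)]
    return rolls.count(6) / n_rolls

# The empirical proportion tends to settle near 1/6 as repetitions grow.
for n in (10, 100, 10_000, 100_000):
    p_hat = proportion_of_sixes(n)
    print(n, round(p_hat, 4), round(abs(p_hat - theoretical), 4))
```

Small runs can land far from 1/6. The law only describes what happens in the long run.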
92
probability
the long-run relative frequency with which an event will occur
93
probability distribution
all possible events and their associated probabilities
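A small Python example of a probability distribution, assuming two flips of a fair coin and counting the number of heads. The variable names are illustrative. The probabilities in a distribution must add up to 1.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All equally likely outcomes of two fair coin flips: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))

# Count how many outcomes give each number of heads.
counts = Counter(outcome.count("H") for outcome in outcomes)

# Probability distribution: each possible value paired with its probability.
distribution = {heads: Fraction(c, len(outcomes)) for heads, c in sorted(counts.items())}

for heads, prob in distribution.items():
    print(heads, prob)  # 0 1/4, then 1 1/2, then 2 1/4
```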
94
random variable
a variable whose value is a numerical outcome of a random phenomenon
95
continuous random variable
a variable that can take any value in an interval; its possible values cannot be listed
96
discrete random variable
variable whose possible values are a list of distinct values
97
𝜇
mean of a population
98
x-bar
mean of a sample
99
s
standard deviation of a sample
100
𝜎
standard deviation of a population