variable
quantity that is observed in an experiment
sample space
set of all possible outcomes (can be finite or infinite)
discrete sample space
finite number of elements (or countably infinite)
continuous sample space
infinite number of elements (not countable)
ex: 2 < x < 5
independent event
the occurrence or non-occurrence of one event does not change the probability of the other event's occurrence
ex: drawing cards with replacement
dependent event
the occurrence or non-occurrence of one event does change the probability of the other event's occurrence
ex: drawing cards without replacement
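A minimal sketch of the two card examples, assuming a standard 52-card deck with 4 aces (the deck and the "ace" event are illustrative choices, not part of the cards above). It compares the chance that the second card is an ace, given the first was an ace, under each drawing scheme.

```python
from fractions import Fraction

# Assumed for illustration: a standard deck of 52 cards containing 4 aces.
DECK_SIZE = 52
ACES = 4

# Unconditional probability of an ace on any single draw.
p_ace = Fraction(ACES, DECK_SIZE)

# P(second card is an ace | first card was an ace)
p_with_replacement = Fraction(ACES, DECK_SIZE)              # first card put back
p_without_replacement = Fraction(ACES - 1, DECK_SIZE - 1)   # one ace is gone

print(p_with_replacement == p_ace)     # probability unchanged: independent
print(p_without_replacement == p_ace)  # probability changed: dependent
```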
mutually exclusive
(disjoint)
two events do not intersect / do not occur at the same time
always dependent (when both events have nonzero probability)
collectively exhaustive
includes all possible outcomes
partition
events are said to _______ a sample space S if they are mutually exclusive and collectively exhaustive
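As a rough illustration, the two conditions can be checked with Python sets. The die roll, the events, and the helper name `is_partition` are hypothetical and chosen only for this sketch.

```python
# Hypothetical example: one roll of a six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}
events = [{1, 3, 5}, {2, 4, 6}]  # "odd" and "even"

def is_partition(events, sample_space):
    """True if the events are pairwise disjoint and together cover the sample space."""
    union = set().union(*events)
    # Pairwise disjoint exactly when no outcome is counted twice.
    mutually_exclusive = sum(len(e) for e in events) == len(union)
    collectively_exhaustive = union == sample_space
    return mutually_exclusive and collectively_exhaustive

print(is_partition(events, sample_space))
```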
sensitivity
probability that a test shows positive, given that the patient has the disease
specificity
probability that a test shows negative, given that the patient does not have the disease
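Both quantities can be estimated from test-result counts. The sketch below assumes the usual four counts (true/false positives and negatives); the numbers are made up for illustration.

```python
def sensitivity(true_pos, false_neg):
    """P(test positive | disease): share of diseased patients the test catches."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """P(test negative | no disease): share of healthy patients the test clears."""
    return true_neg / (true_neg + false_pos)

# Hypothetical counts: 100 patients with the disease, 200 without.
print(sensitivity(true_pos=90, false_neg=10))
print(specificity(true_neg=160, false_pos=40))
```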
descriptive statistics
the act of describing, organizing, and summarizing data (using tables, plots, charts, etc.)
statistics
the collection, analysis and interpretation of data
bar graph
graph used for categorical (qualitative) data (bars do not touch, separated by categories)
histogram
graph used for quantitative data (bars touch)
scatter plot
graph used to show patterns in data, may include line of regression
line of regression
line of best fit, used to show a linear relationship in a scatter plot
stem and leaf plot
plot that orders data by place value (leading digits form the stem, the final digit forms the leaf)
frequency polygon
line graph that connects the midpoints of the tops of a histogram's bars to show the shape of the data (great for comparisons)
circle graph
graph that shows categories (qualitative data) as parts of a whole
grouped frequency table
table that groups quantitative data into classes (used for histograms)
inferential statistics
uses sample statistics to draw conclusions (inferences) about population parameters
parameter
a number that describes some characteristic of a population (μ, N, σ)
statistic
a number that describes some characteristic of a sample (x̄, n, S)
simple random sampling
each member of the population has the same chance of being selected for the sample
(every possible sample of size n has the same chance to represent the entire population of size N)
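One way to draw such a sample in Python is `random.sample`, which selects without replacement. The population of numbered members and the fixed seed are assumptions made so the sketch is repeatable.

```python
import random

population = list(range(1, 101))   # hypothetical population, N = 100
rng = random.Random(0)             # fixed seed only so the run is repeatable

# Every subset of size n = 10 is equally likely to be chosen.
sample = rng.sample(population, 10)
print(sample)
```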
random number table
table used to choose a random sample (n) from a population (N)
stratified random sampling
members of the population are divided into two or more homogeneous subsets (strata) that share a similar characteristic; a random sample is then taken from each stratum
proportionate allocation
the number selected from each stratum is proportional to that stratum's share of the population (ratio)
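A minimal sketch of the allocation arithmetic. The strata names, their sizes, and the helper `proportionate_allocation` are hypothetical; note that simple rounding may not always add up to exactly n.

```python
def proportionate_allocation(strata_sizes, n):
    """Split a total sample of size n across strata in proportion to stratum size."""
    total = sum(strata_sizes.values())
    # Rounded shares; with awkward proportions the sum can differ slightly from n.
    return {name: round(n * size / total) for name, size in strata_sizes.items()}

# Hypothetical population of 1000 students split by class year.
strata = {"freshman": 400, "sophomore": 300, "junior": 200, "senior": 100}
print(proportionate_allocation(strata, n=50))
```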
stratified random sampling
ensures each segment of the population is represented
differences within groups = small
differences between groups = bigger
disproportionate allocation
(optimum allocation)
larger samples are taken from the strata with the greatest variability, in order to capture that variability
1 in k systematic sampling
each member of the population is assigned a number
a starting number is randomly selected from among the first k members, then every kth member is selected from the starting number (e.g. every 3rd member when k = 3)
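The procedure above can be sketched with list slicing. The function name `systematic_sample` and the numbered population are assumptions for illustration.

```python
import random

def systematic_sample(population, k, rng=random):
    """Pick a random start among the first k members, then take every kth member."""
    start = rng.randrange(k)      # 0-based index into the first k members
    return population[start::k]

population = list(range(1, 31))   # hypothetical members numbered 1..30
print(systematic_sample(population, k=3))
```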
probability sampling
every member of a population has a known, nonzero chance of being selected for the sample (an equal chance in simple random sampling); best for representing the whole population
cluster sampling
divide the population into mini-populations (clusters), randomly select a few clusters, and include all members of each selected cluster in your sample
differences within clusters are large
differences between clusters are small
can obtain a bigger sample with this method
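A small sketch of the idea, assuming the population is already stored as named clusters. The classroom data and the helper `cluster_sample` are hypothetical.

```python
import random

def cluster_sample(clusters, num_clusters, rng=random):
    """Randomly choose whole clusters and keep every member of each chosen one."""
    chosen = rng.sample(sorted(clusters), num_clusters)
    return [member for name in chosen for member in clusters[name]]

# Hypothetical clusters: students grouped by classroom.
classrooms = {
    "room_a": ["a1", "a2", "a3"],
    "room_b": ["b1", "b2", "b3"],
    "room_c": ["c1", "c2", "c3"],
    "room_d": ["d1", "d2", "d3"],
}
print(cluster_sample(classrooms, num_clusters=2))
```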
non-probability sampling
selection is not random, so some members of the population have an unknown or zero chance of being selected for the sample
(ex: convenience sampling, voluntary response)
prone to bias, so results may be unreliable
qualitative random variable
variable that places an individual into one of a group of categories (cannot be measured numerically)
sex
eye color
hair color
race
etc.
nominal data
qualitative data with names only
sex
states
continents
yes or no
ordinal data
qualitative data that can be ordered/ranked
bed sizes
degrees of burns
stages of disease
days of week
quantitative random variable
variable that takes on a numerical value that is measurable
interval data
quantitative data with no true zero point (the "0" value is arbitrary, so differences are meaningful but ratios are not)
clock times
calendar years
temperature (°C or °F)
ratio data
quantitative data with a true zero point ("0" means none of the quantity, so ratios are meaningful)
time lapses
age
pressure
temperature (Kelvin)
length
mass
independent variable
(x) variable that is manipulated in an experiment
dependent variable
(y) variable that is influenced by independent variable
correlation
describes the strength and direction of the linear relationship between 2 quantitative variables (has no units)
(the correlation coefficient r ranges from -1 to 1; the closer |r| is to 1, the stronger the linear relationship, and the sign gives the direction)
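For reference, a sketch of the Pearson correlation coefficient computed by hand. The function name `pearson_r` and the data are illustrative assumptions.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

hours = [1, 2, 3, 4]                       # hypothetical x values
print(pearson_r(hours, [2, 4, 6, 8]))      # perfect positive linear relationship
print(pearson_r(hours, [8, 6, 4, 2]))      # perfect negative linear relationship
```

Because the deviations are divided by their own spread, the units cancel, which is why the coefficient is unitless.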
dichotomous variable
type of variable that only has two values
sex (male or female)
coin-flip (heads or tails)
exam result (pass or fail)
variance
standard deviation squared
measure of the variability (scatter) of values from the mean of a data set
standard deviation
square root of variance
measure of the variability (scatter) of values from the mean of a data set
low = values clustered close to mean
high = values spread farther from mean
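A quick check of the variance and standard deviation relationship using Python's `statistics` module. The data set is made up; `pvariance`/`pstdev` treat the data as a population, while `variance`/`stdev` treat it as a sample (dividing by n - 1).

```python
from math import sqrt
from statistics import pstdev, pvariance, stdev, variance

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical measurements, mean = 5

# Population versions (divide by N).
print(pvariance(data), pstdev(data))

# Sample versions (divide by n - 1), so they come out slightly larger.
print(variance(data), stdev(data))

# Standard deviation is the square root of the matching variance.
print(abs(sqrt(variance(data)) - stdev(data)) < 1e-12)
```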
sample standard deviation
square root of the sample variance
not resistant to outliers: like the mean, it is thrown off by extreme values
adding/subtracting
______/___________ a constant to/from each xi changes the mean, but not the standard deviation, variance, or range
it only shifts data to the right (+c) or left (-c)
multiplying
___________ each xi by a constant changes the mean, standard deviation, variance, and range
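Both effects can be checked numerically. The data, the shift of 10, and the factor of 3 are arbitrary choices for this sketch.

```python
from statistics import mean, pstdev

def data_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)

data = [1, 2, 3, 4, 5]
shifted = [x + 10 for x in data]   # add a constant to every value
scaled = [x * 3 for x in data]     # multiply every value by a constant

for label, values in (("original", data), ("shifted", shifted), ("scaled", scaled)):
    print(label, mean(values), pstdev(values), data_range(values))
```

In this run the shift moves only the mean, while the factor of 3 multiplies the mean, standard deviation, and range by 3 (and therefore the variance by 9).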
coefficient of variation
a way to interpret the relative magnitude of the standard deviation
useful for comparing two dispersions of two variables
higher percent CV = more variable data
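A sketch using the standard definition CV = (standard deviation / mean) × 100%, here with the sample standard deviation. The two data sets are hypothetical and are in different units, which is the situation where CV is most useful.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return stdev(values) / mean(values) * 100

heights_cm = [160, 165, 170, 175, 180]   # hypothetical data
weights_kg = [50, 60, 70, 80, 90]        # hypothetical data

print(f"heights: {coefficient_of_variation(heights_cm):.1f}%")
print(f"weights: {coefficient_of_variation(weights_kg):.1f}%")
```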
the empirical rule
for mound-shaped distributions only
gives the approximate percentage of measurements that fall within 1, 2 and 3 standard deviations from the mean (applies to populations or samples)
68-95-99.7% rule
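The three percentages can be recovered from a normal distribution, which is the idealized mound shape. This sketch assumes Python 3.8+ for `statistics.NormalDist`; the helper name `share_within` is made up.

```python
from statistics import NormalDist

std_normal = NormalDist()   # mean 0, standard deviation 1

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")
```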
z score
represents the distance between the measurement and the mean, expressed in standard deviations
(tells you how many standard deviations a given value (x) is above or below the mean)
positive = above mean
negative = below mean
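In formula form, z = (x - mean) / standard deviation. A minimal sketch with made-up exam scores (mean 70, standard deviation 10):

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev

# Hypothetical exam with mean 70 and standard deviation 10.
print(z_score(85, mean=70, std_dev=10))   # above the mean, so positive
print(z_score(55, mean=70, std_dev=10))   # below the mean, so negative
```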
Chebyshev's Theorem
for any set of data and any constant k > 1, the proportion of the data that must lie within k standard deviations on either side of the mean is at least 1 - (1/k²)
k = number of standard deviations from the mean
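The bound is easy to tabulate. The function name `chebyshev_lower_bound` is an illustrative choice.

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of any data set within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k**2

for k in (1.5, 2, 3):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.1%}")
```

Unlike the empirical rule, these minimums hold for any distribution shape, which is why they are lower (for example, at least 75% within 2 standard deviations rather than about 95%).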