population
the set of all “subjects” relevant to the scientific hypothesis under examination
variables
characteristics that differ among individuals
parameters
quantities describing a population (denoted by Greek letters)
census
a collection of data in which every member of the population is examined
random sample
each and every member of the population has an equal chance of being selected and each member is selected independently of others
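A minimal sketch of drawing a random sample, assuming the population can be listed as subject IDs. The population, the sample size, and the helper name `draw_random_sample` are made up for illustration. Note that `random.sample` draws without replacement, which is the usual practical approximation of independent selection when the population is much larger than the sample.

```python
import random

def draw_random_sample(population, n, seed=None):
    """Return n distinct members, each member having an equal chance of selection."""
    rng = random.Random(seed)  # seeded only so the example is repeatable
    return rng.sample(population, n)

# Hypothetical population: 100 subject IDs.
population = list(range(1, 101))
sample = draw_random_sample(population, 10, seed=1)
print(sample)
```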
mean
denoted by a bar over the variable symbol (for example, x̄ for the sample mean)
standard deviation
denoted by “s” when calculated from a sample
sample
the subset of cases selected from a statistical population that are actually examined during a particular study
sample statistics
quantities calculated from the collected sample and used to estimate the population parameters (denoted by Roman letters)
how to get a good sample
take a random sample, be unbiased and precise
to get a good sample
carefully define your statistical population and select a sample that is as representative of the population as possible, where each subject is selected randomly and measurements are precise
bad samples
volunteer sample or a sample of convenience
experimental study
assigning treatment randomly, creating groups, imposing change
observational study
relying on comparisons of already existing conditions
2 types of variables
numerical (quantitative) and categorical (qualitative)
2 types of numerical variable
interval (arbitrary zero) and ratio (true zero)
2 types of categorical variables
nominal (no order) and ordinal (ordered)
frequency distribution
describes the number of times each value of a variable occurs
histograms
used for numerical data: the x-axis has a continuous scale, data are “binned” into adjacent intervals, and the bars of neighboring bins touch
histograms y-axis can be
frequency (count of observations in each bin), proportion (of the total observations in each bin), and density (the proportion of the total observations per unit of the bin width)
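The three y-axis choices can be computed from the same bin counts. The sketch below uses made-up data and bin edges, and the function name `histogram` is hypothetical. It assumes bins are closed on the left and open on the right, except the last bin, which also includes its right edge.

```python
def histogram(values, bin_edges):
    """Return (frequency, proportion, density) lists, one entry per bin."""
    n_bins = len(bin_edges) - 1
    counts = [0] * n_bins
    for v in values:
        for i in range(n_bins):
            lo, hi = bin_edges[i], bin_edges[i + 1]
            is_last = i == n_bins - 1
            if lo <= v < hi or (is_last and v == hi):
                counts[i] += 1
                break
    total = sum(counts)
    proportions = [c / total for c in counts]
    # Density divides each proportion by its bin width, so unequal bins stay comparable.
    densities = [p / (bin_edges[i + 1] - bin_edges[i]) for i, p in enumerate(proportions)]
    return counts, proportions, densities

# Made-up data and bin edges (the last bin is deliberately wider).
values = [1.0, 1.5, 2.2, 2.8, 3.1, 3.3, 3.9, 4.4, 5.0, 7.5]
edges = [0, 2, 4, 8]
counts, proportions, densities = histogram(values, edges)
print(counts)  # [2, 5, 3]
```

With density on the y-axis, the bar areas (density times bin width) sum to 1, which is why density is the safer choice when bin widths differ.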
location or central tendency
where the center of a distribution lies; distributions can differ in their central value, commonly measured using the mean
spread or scale
how widely the values are dispersed; distributions can differ in their spread, commonly measured using the standard deviation
shape or skew
distributions with a long tail on one side or the other
mean
arithmetic average
median
middle of the data
mode
most commonly occurring observations
scale
the most basic measure is the range (max - min), which is not very informative because it uses only the two most extreme observations
variance
“expected” squared difference between an observation and the mean
standard deviation is
positive square root of the variance
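The measures of location and spread defined on these cards can be computed with Python's standard `statistics` module. The data are made up. One assumption to note: `statistics.variance` and `statistics.stdev` compute the sample versions, which divide by n - 1 rather than n.

```python
import statistics

data = [2, 4, 4, 5, 7, 8]  # made-up sample

mean = statistics.mean(data)          # arithmetic average
median = statistics.median(data)      # middle of the sorted data
mode = statistics.mode(data)          # most commonly occurring value
data_range = max(data) - min(data)    # most basic measure of spread
variance = statistics.variance(data)  # sample variance (divides by n - 1)
sd = statistics.stdev(data)           # positive square root of the variance

print(mean, median, mode, data_range, variance)  # 5 4.5 4 6 4.8
```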
what is meant by “estimation”
it’s using the sample data to learn about the population
estimation
the process of inferring a population parameter from sample data
uncertainty
a situation in which something is not known; in statistics it is the error of an estimate
Sampling distribution
the distribution of all the values for an estimate that we might have obtained when we sampled a population
a 95% confidence interval is a
range of values, calculated from sample data, constructed so that about 95% of such intervals would contain the true population parameter if the sampling process were repeated many times
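The repeated-sampling interpretation can be checked by simulation. The sketch below assumes a normally distributed population with a known mean (values chosen arbitrarily) and uses the normal critical value of about 1.96, which is an approximation that works reasonably for a sample size of 100. The function names are hypothetical.

```python
import random
import statistics

def ci_95(sample):
    """Approximate 95% confidence interval for the mean (normal critical value)."""
    z = statistics.NormalDist().inv_cdf(0.975)  # about 1.96
    m = statistics.mean(sample)
    se = statistics.stdev(sample) / len(sample) ** 0.5  # standard error of the mean
    return m - z * se, m + z * se

def coverage(true_mean=10.0, true_sd=2.0, n=100, repeats=2000, seed=0):
    """Fraction of repeated samples whose interval contains the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repeats):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        lo, hi = ci_95(sample)
        hits += lo <= true_mean <= hi
    return hits / repeats

cov = coverage()
print(cov)  # expected to be close to 0.95
```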
uncertainty
decreases, and precision increases, as sample size increases
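A small simulation can illustrate how the sampling distribution of an estimate narrows as sample size grows. This sketch assumes a standard normal population and measures uncertainty as the standard deviation of many sample means. The function name and the sample sizes are chosen only for illustration.

```python
import random
import statistics

def sampling_sd_of_mean(n, repeats=1000, seed=0):
    """Standard deviation of the sample mean across many repeated samples of size n."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.gauss(0, 1) for _ in range(n))
        for _ in range(repeats)
    ]
    return statistics.stdev(means)

sd_small = sampling_sd_of_mean(10)
sd_large = sampling_sd_of_mean(100)
print(sd_small, sd_large)  # the second value should be clearly smaller
```

For a mean, theory predicts the spread shrinks in proportion to 1 / sqrt(n), so a tenfold increase in sample size cuts the uncertainty by roughly a factor of three.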
hypothesis testing
a procedure to determine whether an observed estimate can be explained by chance alone or reflects a real effect
Null hypothesis
a specific statement made about a population for the sake of argument; it forces us to take a skeptical view
null hypothesis is used to
create a null model, compare test statistic calculated from the sample to the model
H0 is rejected if
the test statistic would be surprising (very unlikely) if the null model were true, that is, when the P-value is below α
P means
probability of observing a test statistic as extreme as, or more extreme than, the one observed, assuming H0 is true
significance level α
a probability used as a criterion for rejecting the null hypothesis
P-value > α
fail to reject H0
P-value < α
reject H0
two-tailed tests
a deviation in either direction would reject the null hypothesis
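As a worked illustration of the P-value, the significance level, and a two-tailed test, the sketch below runs an exact binomial test. The data (14 successes in 18 trials) and the null proportion of 0.5 are made up, and the function name is hypothetical. "As extreme or more extreme" is taken to mean a count at least as far from the expected count as the observed one, which is one common convention.

```python
from math import comb

def binomial_p_value(successes, n, p0=0.5):
    """Two-tailed P-value under H0: the true proportion equals p0."""
    expected = n * p0
    observed_dev = abs(successes - expected)
    # Sum the null probabilities of every outcome at least as extreme, in either direction.
    return sum(
        comb(n, k) * p0**k * (1 - p0) ** (n - k)
        for k in range(n + 1)
        if abs(k - expected) >= observed_dev
    )

alpha = 0.05                  # significance level
p = binomial_p_value(14, 18)  # made-up data: 14 successes in 18 trials
decision = "reject H0" if p < alpha else "fail to reject H0"
print(round(p, 4), decision)  # 0.0309 reject H0
```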
type I error (α)
rejecting a true null hypothesis (false positive)
type II error (β)
Failing to reject a false null hypothesis (false negative)
Power
the probability of correctly rejecting a false H0
Power depends on
how different the truth is from the null hypothesis, type I error rate, and sample size
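The three factors can be explored with an exact power calculation for a two-tailed binomial test. This is a sketch under stated assumptions: the null proportion is 0.5, the "true" proportions are invented, and the function names are hypothetical.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def two_tailed_p(k, n, p0):
    """Two-tailed P-value for observing k successes under H0: proportion = p0."""
    dev = abs(k - n * p0)
    return sum(binom_pmf(j, n, p0) for j in range(n + 1) if abs(j - n * p0) >= dev)

def power(n, p_true, p0=0.5, alpha=0.05):
    """Probability of rejecting H0 when the true proportion is p_true."""
    return sum(
        binom_pmf(k, n, p_true)
        for k in range(n + 1)
        if two_tailed_p(k, n, p0) < alpha
    )

print(power(20, 0.7))              # baseline
print(power(50, 0.7))              # larger sample size
print(power(20, 0.8))              # truth further from the null
print(power(20, 0.7, alpha=0.20))  # higher type I error rate
```

Each of the last three calls should give higher power than the baseline, matching the three factors on the card.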
things to consider when designing an experiment
reduce bias and decrease sampling error
reduce bias
have a control group, use randomization, use blinding
decrease sampling error
use replication, ensure balance, use blocking, implement extreme treatments
Control group
units that are similar to the treatment units except that they do not receive the treatment
random assignment
units that are otherwise “identical” are randomly assigned to be controls or treatments
blinding
concealing information about whether a participant is in the control or treatment group from the participants (single blind) and sometimes also from the researchers (double blind)
replication
application of a treatment to multiple, independent experimental subjects or units
balance
an equal number of units in the control and treatment minimizes the sampling error in both
blocking
divide experimental units into groups (blocks) that share similar values of known confounding variables, then assign treatments randomly within each block
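Randomization, balance, and blocking can be combined in one assignment step. The sketch below is a minimal illustration, not a prescribed procedure: the units, the "site" confounding variable, and the function name are all hypothetical. Within each block, half of the units are randomly assigned to treatment and the rest to control.

```python
import random
from collections import defaultdict

def randomized_block_assignment(units, block_of, seed=None):
    """Randomly split each block's units evenly between treatment and control."""
    rng = random.Random(seed)
    blocks = defaultdict(list)
    for u in units:
        blocks[block_of(u)].append(u)  # group units by the known confounder
    assignment = {}
    for members in blocks.values():
        shuffled = members[:]
        rng.shuffle(shuffled)          # randomization within the block
        half = len(shuffled) // 2      # balance: equal group sizes when possible
        for u in shuffled[:half]:
            assignment[u] = "treatment"
        for u in shuffled[half:]:
            assignment[u] = "control"
    return assignment

# Hypothetical units: (id, site) pairs, where site is a known confounding variable.
units = [(i, "north" if i < 6 else "south") for i in range(12)]
assignment = randomized_block_assignment(units, block_of=lambda u: u[1], seed=42)
for unit in units:
    print(unit, assignment[unit])
```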
extreme treatments
a treatment you may add to an experiment to see whether applying more (or less) of a treatment elicits more (or less) of an effect