population
entire group we are interested in studying
sample
subset of the population that we actually study
samples should
represent the population
descriptive procedures
used to summarize, organize, and simplify sample data and result in statistics (designated by English letters)
inferential procedures
allow us to study samples and make generalizations about the population to estimate the population parameters (designated by Greek letters)
sampling
simple random (assign numbers to ALL members and pick randomly), quota/stratified random (first break the population into groups based on some characteristic, then choose from each group), convenience (using an easily accessible group)
sampling error
natural discrepancy between a sample statistic and the corresponding population parameter
bias in samples
selection bias (non-representative group), non-response bias (some groups are less likely to respond)
empiricism
scientists make predictions based off observations
theory
general statement about the causal relationship between variables
hypothesis
prediction about specific events derived from theories
scientific method steps
observe, predict, test, analyze, conclude, publish, restart
parsimony
given two competing theories, the simplest explanation is preferred
anecdotal evidence
based on personal experiences, often biased, least certain
correlational statistics
measure scores on two variables and determines potential relationship (should not imply causation)
experimental
researcher manipulates one variable and measures effect on another
independent variable
manipulated
dependent variable
measured
quasi-experimental
compare groups based on naturally occurring variables, no manipulation
scales of measurement: nominal
categorical, cannot be ranked (favorite color)
scales of measurement: ordinal
indicates rank or order but not magnitude of difference (places in a race)
scales of measurement: interval
equal intervals separate adjacent scores, no true zero (Celsius or Fahrenheit temperature)
scales of measurement: ratio
true zero exists, with equal intervals separating adjacent scores (height, weight, etc)
bar chart
for nominal or ordinal data, bars have spaces
histogram
for interval or ratio data, no spaces
charts should
make information easier to understand and compare
normal distribution
bell-shaped, symmetrical
bimodal distribution
two peaks
skewed distribution
scores pile up on one end with a tail on the other (positive skew = tail toward higher values, negative skew = tail toward lower values)
uniform / rectangular distribution
scores all have the same frequency
descriptive statistics
goal: summarize data in a clear and understandable way
types: mean, median, and mode
mode
score w/ highest frequency
only possible measure for nominal data
median
middle value, 50th percentile score
good for ordinal data, highly skewed data, and open ended distributions
mean
average
μ = population mean, x̄ = sample mean
preferred measure of central tendency (when the distribution is not highly skewed)
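A minimal Python sketch of the three measures; the scores are invented:

```python
import statistics

scores = [2, 3, 3, 4, 5, 5, 5, 7]

print(statistics.mode(scores))    # 5    -> score with the highest frequency
print(statistics.median(scores))  # 4.5  -> middle value (50th percentile)
print(statistics.mean(scores))    # 4.25 -> arithmetic average (x̄)
```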
variability
do the scores cluster about the center point or do they spread out?
measures of variability
range, interquartile range, standard deviation and variance
range
difference between the highest and lowest scores in a distribution; heavily influenced by outliers
interquartile range
range between the first and third quartiles: IQR = Q3 − Q1
75th percentile score − 25th percentile score = IQR
reduces the influence of outliers
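A quick NumPy sketch of the IQR; the data, including the outlier 100, are made up:

```python
import numpy as np

scores = np.array([1, 3, 5, 7, 9, 11, 13, 100])  # 100 is an outlier

q1, q3 = np.percentile(scores, [25, 75])  # 25th and 75th percentile scores
print(q3 - q1)  # IQR = 7.0; the outlier barely moves it, unlike the range (99)
```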
mean absolute deviation (MAD)
average absolute deviation of a data set
variance
average squared deviation (s²)
population variance: σ² = Σ(x − μ)² / N
sample variance: s² = Σ(x − x̄)² / (N − 1)
standard deviation
average deviation of scores from the mean, symbolized s
s = √[Σ(x − x̄)² / (N − 1)]
standard deviation is
still impacted by outliers
variance and standard deviation are
always ≥ 0 (zero only when every score is identical)
only used for interval and ratio data
can have same mean w/ different variance
can have different mean w/ same variance
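A sketch of the N vs. N − 1 distinction using Python's statistics module; the scores are invented:

```python
import statistics

scores = [4, 8, 6, 5, 3, 7]

print(statistics.pvariance(scores))  # population variance: σ² = Σ(x − μ)² / N
print(statistics.variance(scores))   # sample variance: s² = Σ(x − x̄)² / (N − 1)
print(statistics.stdev(scores))      # sample SD: square root of the sample variance
```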
for normal distribution
68% of scores fall in the region ± 1 SD from the mean
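A one-line check of the 68% figure with SciPy's normal CDF:

```python
from scipy.stats import norm

# area between z = -1 and z = +1 under the standard normal curve
print(norm.cdf(1) - norm.cdf(-1))  # about 0.6827
```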
deviation score
X - Mean
z-score
number of standard deviations above or below the mean
z = (X - μ) / σ
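A minimal sketch of the formula, assuming an IQ-style scale with μ = 100 and σ = 15:

```python
mu, sigma = 100, 15
x = 130

z = (x - mu) / sigma
print(z)  # 2.0 -> the score sits two standard deviations above the mean
```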
standard normal distribution
has a mean of 0 and a standard deviation of 1
the larger the absolute value of the z-score
the less frequently it occurs
the area under the curve
corresponds to z-scores and is proportional to the frequency of scores
finding a score from a percentile rank
convert to z-value and convert z to raw score
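A sketch of the two-step conversion with SciPy; the scale (μ = 100, σ = 15) and the 90th percentile are assumed for illustration:

```python
from scipy.stats import norm

mu, sigma = 100, 15

z = norm.ppf(0.90)   # z-value with 90% of the area below it (about 1.28)
x = mu + z * sigma   # convert z back to a raw score
print(x)             # roughly 119.2
```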
three uses of z-scores (units = SDs)
examine the relative status of an individual score in a normal distribution
compare scores coming from different variables
examine the relative status of an entire sample in a normal population
analytic view of probability
number of outcomes favorable to the event divided by the total number of equally likely outcomes
p = favorable outcomes / total outcomes
frequentistic view of probability
probability in terms of past performance
subjective view of probability
based on personal belief in the likelihood of an outcome
probability notation
a proportion, fraction, or percentage
multiplicative law
used for probability of joint occurrence of two or more events (&)
additive law
used for the probability of occurrence of one or more mutually exclusive events (or)
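A toy sketch of both laws with a fair six-sided die; the events are invented for illustration:

```python
p_even = 3 / 6  # analytic view: favorable outcomes / total outcomes

# multiplicative law ("and"): two independent rolls both come up even
p_both_even = p_even * p_even   # 0.25

# additive law ("or"): one roll shows a 1 or a 2 (mutually exclusive outcomes)
p_one_or_two = 1 / 6 + 1 / 6    # about 0.333

print(p_both_even, p_one_or_two)
```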
sampling distribution of the means
the frequency distribution of all possible sample means drawn from a population
sampling distribution of the means attributes
sample means clump together around the true population mean
a different sampling distribution is obtained with each sample size (N)
central limit theorem
as N increases, the sampling distribution approaches a normal distribution, even if the population distribution is not normal
the mean of the sampling distribution always equals the mean of the population distribution (μ_x̄ = μ)
the standard deviation of the sampling distribution is determined by the standard deviation of the population and the sample size (σ_x̄ = σ / √N)
standard error of the mean
the standard deviation of the sampling distribution of the means
becomes smaller as N increases
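A small simulation with assumed parameters (μ = 50, σ = 10) showing the standard error shrink as N grows; the empirical SD of the sample means should track σ / √N:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 50, 10

for n in (4, 25, 100):
    # 10,000 samples of size n; each row's mean is one point in the sampling distribution
    sample_means = rng.normal(mu, sigma, size=(10_000, n)).mean(axis=1)
    print(n, sample_means.std(), sigma / np.sqrt(n))  # empirical SE vs. theoretical SE
```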
how do you assess relative standing of a single sample mean to all samples from population?
transform sample mean into z-score
observed effects can be due to
systematic effects or chance or some combination
errors in decision making
type 1: reject H0 when it is actually true (probability = α, typically 0.05; probability of avoiding = 1 − α)
type 2: retain H0 when it is false (probability = β; power = 1 − β, the probability of rejecting H0 when it is false)
things that alter power
larger N increases power, greater variability decreases power, and effect strength matters (stronger effects increase power, weaker effects decrease it)
when to use z-test
comparing a sample to a population, population SD is known
when to use 1 sample t-test
comparing a sample to a population, population SD is unknown
when to use independent samples t-test
comparing two groups that are measured independently
when to use related samples t-test
comparing two groups that are measured in some related way (repeated measures, matched samples, or natural pairs)
when to use one-way anova
comparing multiple groups with 1 independent variable with >2 levels
when to use two-way anova
comparing multiple groups with 2 independent variables
between subjects anova
independent samples
within subjects anova
related samples
when to use Pearson correlation coefficient [r]
when identifying a relationship between variables
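A sketch mapping these cards to their scipy.stats equivalents; all data below are invented:

```python
from scipy import stats

a = [5.1, 4.8, 5.5, 5.0, 4.9]
b = [5.9, 6.1, 5.7, 6.3, 6.0]
c = [4.2, 4.5, 4.1, 4.4, 4.3]

print(stats.ttest_1samp(a, popmean=5.0))  # 1 sample t-test vs. a known population mean
print(stats.ttest_ind(a, b))              # independent samples t-test
print(stats.ttest_rel(a, b))              # related samples t-test (paired scores)
print(stats.f_oneway(a, b, c))            # one-way ANOVA across three groups
print(stats.pearsonr(a, b))               # Pearson correlation coefficient r
```

(scipy.stats has no one-sample z-test; with a known σ you can compute z = (x̄ − μ) / (σ / √N) directly.)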
two ways to estimate population mean
point estimation (exact value)
interval estimation (using confidence interval)
confidence interval
x̄ ± margin of error
CI.95 = x̄ ± (s_x̄)(t.05), where s_x̄ is the estimated standard error of the mean
we can say with a probability of .95 that the interval between the lower and upper bounds includes the true population mean μ
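A sketch of the 95% t-based interval in Python; the sample is made up:

```python
import numpy as np
from scipy import stats

sample = np.array([12.1, 11.8, 12.5, 12.0, 11.6, 12.3])

xbar = sample.mean()
se = stats.sem(sample)                           # s / √N, the estimated standard error
t_crit = stats.t.ppf(0.975, df=len(sample) - 1)  # two-tailed critical t at α = .05

print(xbar - t_crit * se, xbar + t_crit * se)    # interval likely to contain μ
```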
effect size
how big is the significant difference?
Cohen's d (in units of standard deviations) = (mean1 − mean2) / SD
small= 0.2, medium = 0.5, large = 0.8
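A minimal Cohen's d sketch using an equal-n pooled SD; both groups are invented:

```python
import numpy as np

g1 = np.array([5.1, 4.8, 5.5, 5.0, 4.9])
g2 = np.array([5.9, 6.1, 5.7, 6.3, 6.0])

pooled_sd = np.sqrt((g1.var(ddof=1) + g2.var(ddof=1)) / 2)  # equal-n pooled SD
d = (g2.mean() - g1.mean()) / pooled_sd

print(d)  # compare against the 0.2 / 0.5 / 0.8 benchmarks
```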
statistical hypothesis
z-test: H0: μ = population mean, H1: μ ≠ population mean
1 sample t-test: H0: μ = population mean, H1: μ ≠ population mean
related and independent samples t-tests: H0: μD = 0, H1: μD ≠ 0
one-way ANOVA: H0: μ1 = μ2 = μ3 = ... = μk, H1: at least one mean differs
two-way ANOVA: H0: μ1 = μ2 = ... = μk for each main effect, H1: at least one mean differs; plus H0: no interaction, H1: interaction
Pearson's correlation coefficient: H0: ρ = 0, H1: ρ ≠ 0
all t-tests look at
ratio of systematic variance to chance variance
reporting results (the number in parentheses is the degrees of freedom)
z-test: z= -2.38, p<0.05
1 sample t-test: t(24) = 2.75, p< 0.05
related and independent samples t-tests: t(11) = 4.95, p<0.05
one-way and two-way ANOVA: F(2, 9) = 6.45, p = 0.018, η² = 0.59
P’s CC: r(4) = 0.902, p=0.014
advantages with related samples designs
avoids participant-to-participant variability, analyzes only difference scores, requires fewer participants
disadvantages with related samples designs
order effects (one condition may produce different results from being first or second: practice, fatigue, sensitization)
carry-over effects (effect of earlier treatment linger)
homogeneity of variance
the variance of each population being represented is similar (for independent sample t-tests)
for independent samples t-tests
having very different ns (number of people in each group) results in lower power
factors
independent variables in ANOVA
residuals in JASP
error values
k
number of conditions (groups)
why use ANOVA instead of lots of t-tests?
running more t-tests increases the odds of a Type 1 error; a single ANOVA avoids this inflation
familywise error rate
with multiple t-tests, it is much larger than alpha. ANOVA keeps it equal to alpha
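A quick sketch of the inflation: with m independent tests each at α, the familywise rate is 1 − (1 − α)^m:

```python
alpha = 0.05

for m in (1, 3, 10):                # m = number of separate t-tests
    print(m, 1 - (1 - alpha) ** m)  # 0.05, ~0.14, ~0.40
```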
post-hoc tests
reveal which values are significantly different
ANOVA assumptions
must use interval or ratio data
scores in each population are normally distributed
all groups have similar variance (homogeneity of variance)
partitioning variance
ANOVA looks at the ratio of between-group variance to within-group variance (MSgroup vs. MSerror)
Mean Squared Deviation
MS, similar to variance
MSerror
within group variance
MSgroup
variance between groups
SSgroup + SSerror = SStotal (total variance)
variability among all the scores in a data set
when null is true
Fobtained ≈ 1
when null is false
Fobt>1
Fobt=
(treatment effect + chance)/ chance
ANOVA summary table
sum of squares: SS = Σ(x − x̄)²
dfgroup = k − 1
dferror = k(n − 1)
dftotal = N − 1
MSgroup = SSgroup / dfgroup
MSerror = SSerror / dferror
Fobt = MSgroup / MSerror
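A hand-computed one-way ANOVA that follows these exact formulas; the three equal-n groups are invented:

```python
import numpy as np

groups = [np.array([4., 5., 6., 5.]),
          np.array([7., 8., 6., 7.]),
          np.array([9., 10., 9., 8.])]

all_scores = np.concatenate(groups)
k, n = len(groups), groups[0].size     # k conditions, n scores per condition
grand_mean = all_scores.mean()

ss_group = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
ss_error = sum(((g - g.mean()) ** 2).sum() for g in groups)

ms_group = ss_group / (k - 1)          # dfgroup = k − 1
ms_error = ss_error / (k * (n - 1))    # dferror = k(n − 1)

print(ms_group / ms_error)             # Fobt = MSgroup / MSerror -> 24.0
```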