self-plagiarism
to take previously published work and publish it in a "new" paper
outright fraud
making up data - ex: Stapel
photoshopping Western blots
replication crisis
peer reviewers just assume data is real and stats were run ethically
a large percentage of research is not replicable
reasons to publish
pressure to publish for advancement
grant money competition
tenure is based on quantity
new findings > strong findings
peer review isn't suited to catching questionable research practices
false positives
adding participants and re-running the analysis until we get a sig. result
stopping data collection as soon as the result is significant
deciding to exclude data after analysis
what can be done?
make data available to public
change incentives so it's quality>quantity
make materials known so it's easier to replicate
preregistration of hypothesis and method section
sampling error
difference between the mean of the population and the mean of the sample
SD of the sampling distribution is SE
why we need confidence intervals
sampling distribution (distribution of means)
represents the means of all possible samples of the same size drawn from a population
normal curve
SE is affected by variance in population and sample size
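The last point (SE depends on population variance and sample size) can be checked with a short simulation; the population parameters and sample sizes below are made up for illustration:

```python
import random
import statistics

# Simulate the sampling distribution of the mean and check that its SD
# (the standard error) is close to sigma / sqrt(N) and shrinks as N grows.
random.seed(1)
population = [random.gauss(100, 15) for _ in range(50_000)]
sigma = statistics.pstdev(population)

def empirical_se(n, reps=1000):
    """SD of the means of `reps` samples of size n."""
    means = [statistics.fmean(random.sample(population, n)) for _ in range(reps)]
    return statistics.stdev(means)

print(round(empirical_se(25), 2), round(sigma / 25 ** 0.5, 2))
print(round(empirical_se(100), 2), round(sigma / 100 ** 0.5, 2))
```

Each printed pair should roughly match, and the N = 100 pair should be about half the N = 25 pair.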
confidence interval
a range of values plus a confidence level: the percentage of such intervals expected to contain the pop mean
less variability or lower confidence -> narrower CI
the higher the confidence -> larger critical value and wider CI (SE itself depends only on variability and N)
common values are 95% and 99%
alpha
the probability that the confidence interval does NOT contain the pop. mean
think of it as the tails of the normal curve
how to do CI
level of confidence expressed as 1 - alpha
ex: if CI is 95%, then alpha = 0.05
divvy up the alpha into two, half for above and half for below
find z score for that (1.96 corresponds to 0.025 and 0.975)
equation of CI
CI = xbar +/- z(alpha/2) * (SD / sqrt(N))
x bar is the point estimate (sample mean)
z(alpha/2) is the critical value (1.96 for 95%, 2.58 for 99%)
SD/sqrt(N) is the SE
everything after the +/- is the margin of error
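A minimal sketch of the formula in Python; the sample numbers are hypothetical:

```python
import math

def z_confidence_interval(xbar, sd, n, z=1.96):
    """CI = xbar +/- z(alpha/2) * (SD / sqrt(N)); default z is the 95% critical value."""
    se = sd / math.sqrt(n)      # standard error
    moe = z * se                # margin of error
    return xbar - moe, xbar + moe

# hypothetical sample: mean 50, SD 10, N = 100
lo, hi = z_confidence_interval(50, 10, 100)   # lo, hi ~ (48.04, 51.96)
```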
degrees of freedom
number of scores that can vary in the calculation of a statistic
ex: the mean constrains the scores: if the mean of 4 scores is 8, the sum must be 32, so once 3 scores are known (9, 10, 7) the last one (6) is not free to vary
t distribution
as N increases, t becomes more like z (nearly identical by around N = 1000)
z distribution only works when we know sigma (pop SD)
since more scores are in the tails, the critical value is larger, causing us to widen the CI
still use N for SE (not N-1), use N-1 for sample SD
ex: for alpha = 0.05, t = 2.064 when df = 24 but t = 2.132 when df = 15
when to use t-distribution
pop SD is unknown
small N
look up the critical value t(alpha/2) for the given degrees of freedom
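The same CI sketch with a t critical value taken from a table; the df = 24 value comes from the example above, and the sample numbers are hypothetical:

```python
import math

def t_confidence_interval(xbar, s, n, t_crit):
    """CI = xbar +/- t(alpha/2, df) * (s / sqrt(N)).
    s is the sample SD (computed with N - 1); SE still divides by sqrt(N);
    t_crit comes from a t table for df = N - 1."""
    se = s / math.sqrt(n)
    return xbar - t_crit * se, xbar + t_crit * se

# hypothetical sample: mean 50, sample SD 10, N = 25 -> df = 24, t_crit = 2.064
lo, hi = t_confidence_interval(50, 10, 25, t_crit=2.064)
# lo, hi ~ (45.87, 54.13), wider than the z-based interval would be
```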
construct validity
the extent to which a measuring instrument accurately measures what it is supposed to measure
includes convergent validity and discriminant validity
convergent validity
determine extent to which our measure correlates with other measures of the same construct
how well measures of observables "go together"
"does the test correlate with other tests that measure the same or related constructs?"
discriminant validity
determine the extent to which our measure does NOT correlate with measures of different or unrelated constructs
"does the test NOT or negatively correlate with other tests that measure the different/opposite constructs?"
concurrent validity
test correlates with criterion measured at the same time
criterion validity
ability of measuring instrument to estimate or PREDICT some criterion behavior that is external to the measuring instrument itself
includes concurrent validity, predictive validity, and postdictive validity
predictive validity
test taken now predicts future behavior ex: does the driving test predict driving ability or the marshmallow test
postdictive validity
test taken now captures behavior after it occurred ex: give older drivers the driving test and look BACK at driving record
known groups paradigm
a method for establishing criterion validity, in which a researcher tests two or more groups, who are known to differ on the variable of interest, to ensure that they score differently on a measure of that variable
content validity
extent to which a measuring instrument covers a representative sample of the behaviors to be measured
includes face validity and assessment by subject matter experts (SME)
face validity
does it "feel" like we answered the question
quite subjective
assessment by SMEs in that content area
does the test contain the items from the desired "content domain"
esp important when something has low face validity
external validity
extent to which the results of an experiment can be generalized and extend beyond the lab
mundane realism and experimental realism
threats to external validity
the college sophomore problem (participants are not generalizable to the population)
WEIRD populations (white, educated, industrialized, rich, and democratic)
mundane realism
the degree of similarity between the situation created in the lab and the one that people may encounter in the "real world"
the guy who made his lab look like a bar had a lot of mundane realism
experimental realism
the degree of similarity between the psychological experience in the lab and that of the real world
more important than mundane realism
internal validity
extent to which the results of a study can be attributed to the variables that the researcher purports to be the independent variables, rather than to confounding variables
"given the relationship between the IV and DV, is it likely that the former caused the latter?"
ex: coffee drinkers may smoke more cigarettes than non-coffee drinkers, so smoking is a confounding variable in the study of the association between coffee drinking and heart disease
more like a third-variable problem than a mediator (like shark attacks and ice cream sales both being driven by summer weather)
MRS SMITH is a mnemonic for the threats to internal validity
confounding variables
an extraneous variable that systematically changes along with the variable hypothesized to be the causal variable
clever Hans the horse and the trainer that said he could do math
threats to internal validity
Mortality and attrition
Regression to the mean
Selection of subjects
Selection by maturation interaction
Maturation
Instrumentation
Testing
History
mortality (attrition)
loss of participants may bias results when the participants who remain in the study are no longer representative
maturation
changes within participants (not induced by the research) are responsible for results (usually physiological)
aging processes or physiological states like hunger or fatigue
history
some outside event that is not part of the research is responsible for the results
could be major events in society or minor events occurring within the experimental situation that account for the results more than the treatment of interest
selection of subjects
any bias in selecting and assigning participants to groups that results in systematic differences between the participants in each group
differences exist BEFORE one group is exposed to the experimental treatment
instrumentation
changes in the measurement procedures may result in differences between the comparison groups that are confused with the treatment effects
observers may get lazier over time
switch to a "better way of collecting data"
testing
when participants are repeatedly tested, changes in test scores may be more due to practice or knowledge about the test procedure gained from earlier experiences rather than any treatment effects
similar to maturation except the change is caused by the testing procedure itself
selection by maturation interaction
the treatment and no-treatment groups, although similar at one point, would have grown apart (developed differently) even if no treatment had been administered
pretest scores may be the same, but the groups are not well matched on other relevant variables, which causes them to grow apart over time
regression to the mean
a threat to internal validity in which extreme scores, upon retesting, tend to be less extreme, moving toward the mean
Covariance
a measure of the extent to which two variables vary together; measures the pattern of deviation scores for two variables (x,y)
Equation of covariance
Cov(x,y) = sum of (x - xbar)(y - ybar) / (N - 1)
Limitations of covariance
hard to interpret :(
only tells us direction (+, -, or 0), not magnitude
as the SDs of x and y get larger, so do the cross-products, and thus the covariance
no standardization
Pearson correlation coefficient
standardized covariation
tells us direction and magnitude of association
Equation of Pearson
r = sum of (Zx * Zy) / (N - 1)
When do you use Pearson?
ratio
interval
dichotomous (two-level nominal; this special case is the point-biserial correlation)
sampling distributions of x and y are approximately normal (N>30?)
relationship between x and y is linear
Correlation matrix
a table/matrix showing the pairwise correlations between ALL variables (for simple linear models only)
Interpreting r - what's considered large, moderate, and small r?
0.10 - small
0.30 - moderate
0.50 - large
method error
ME = random error + systematic error
observed score = true score + ME
random error
an error that occurs when the selected sample is an imperfect representation of the overall population; could be due to how your day is going, or anything unsystematic
can cancel each other out
systematic error
error that shifts all measurements in a standardized way; decreases accuracy and can result in bias ex: an fMRI machine increasing anxiety at first
more problematic but easier to fix
reliability
how consistent are your measurements?
answers the question "given that nothing else changes, will you get the same results if you repeat the measurement?"
reliability = true score / (true score + ME)
5 ways to test reliability
test-retest reliability
alternate forms reliability
split-half reliability
inter-rater reliability
internal consistency
test-retest rel
test at time 1 then at time 2
look at correlation coefficients, want r > 0.80
pros: easy to do
cons: participants may remember the results from time 1 (practice effects), chance of mortality
alternate-forms rel
assesses degree of relationship between score on 2 equivalent tests -> get correlation coefficient
pros: can't "get good" at taking test, no practice effect
cons: may not get at the exact same concepts, does a lower r mean less rel or are there differences in testing?, need to measure twice
split-half rel
splitting the items on a test
do the scores on one half correlate with scores on the other half?
pros: within a single instrument, no mortality
cons: how to divide? are the questions equal?
inter-rater rel
reliability coefficient that assesses the agreement made by 2 or more raters or judges
pros: no practice effects
cons: cannot use Pearson (it standardizes scores, so it doesn't detect systematic bias); judges may have their own biases; use ICC instead
intraclass correlation (ICC)
ICC = level of agreement / (level of agreement + systematic error)
not standardized (unlike Pearson), so it stays sensitive to the magnitude of ratings
internal consistency (most used)
each item in a measurement is considered a unique test
responses to various items are compared
"do people who score high on one item score high on all items measuring the same construct?"
uses cronbach's alpha
cronbach's alpha
a correlation-based statistic that measures a scale's internal reliability
the larger the alpha, the more reliable
0.90+ is high
0.80-0.89 is good
0.70-0.79 is acceptable
0.65-0.69 is marginal; if alpha is low, consider cutting some items to increase reliability
scatterplots
best way to visualize associations between two quantitative variables (each point represents individual score)
what can be observed in a scatterplot?
direction of association (+ or - slope)
magnitude of association (closely packed together or not?)
linearity (linear or curvilinear?)
outliers
Problems to look out for in a scatterplot
range restriction
heterogenous sub-samples
interpretation issues
variables that change over time
Heterogenous sub-samples (problem to look out for in a scatterplot)
situations where groups within a sample change the association; aka "moderators"
ex: dosage of drug vs. recovery (stage of disease may be a moderator)
Interpretation issues (problem to look out for in a scatterplot)
correlations usually don't explain causation
ex: number of fire hydrants correlates with number of dogs (third variable problem & reverse causality problem)
Variables that change over time (problem to look out for in a scatterplot)
strong correlation but not causal! any two variables that trend over time (year to year) will correlate
ex: a preservative in vaccines was reported to track rising autism diagnoses in children (both simply changed over time)
Range restriction (problem to look out for in a scatterplot)
cases where the range over which x and y varies is artificially limited
ex: when studying income, people on the low and high ends may not volunteer so we end up with the people in the middle
Pearson correlation coefficient - APA results section
direction of association (negative or positive correlation; what does that mean?)
magnitude of association (small, moderate, or large)
p-value
When do you use Spearman correlation coefficient?
small sample size (<30) such that normality cannot be assumed
monotonic curvilinear
ordinal (rank data)
Equation of Spearman correlation coefficient
r_s = 1 - (6 * sum of D-squared / (n(n-squared - 1))), where D is the difference in ranks for each pair
Monotonic curvilinear
changes in one direction only (always increasing or always decreasing), even if not at a constant rate
When would you have ordinal data?
data are heavily skewed
data not measured on interval scale, but ordered
rank is of primary interest
Simple regression
modeling the influence of one variable on an outcome
Multiple regression
modeling the influence of multiple variables on an outcome
forms of multiple regression to test specific hypotheses: mediation and moderation
How do you know if the form of the relationship is linear (positive, negative), or curvilinear?
theory, past research
scatterplot using pilot data
Parameter estimation
process of determining the slope and y-intercept
yhat
predicted value of y for the equation
y - yhat
error in prediction for a single case (the residual)
Sum of squared error (SSE)
SSE = sum of (y - yhat)^2
Error variance
error variance = sum of (y - yhat)^2 / N (the average squared error)
Ways to estimate parameters of linear equations
Brute Force (play around with slopes)
Analytic methods
3 objectives of regression
to describe the linear relationship between two continuous variables, x and y
to predict new values of y from new values of x
to evaluate how good our predictions are (-> use error variance!)
pitfalls for using regression in prediction
relationship between x and y may not be linear at all
outliers
heteroskedasticity
multicollinearity
moderation/subgroups
extrapolation
heteroskedasticity
the spread (SD) of y around the regression line is not constant across values of the independent variable
multicollinearity
high intercorrelations among two or more independent variables in a multiple regression model
Calculating regression line
calculate mean and SD for x and y
solve for r (Pearson Correlation Coefficient)
solve for b (slope)
solve for a (y-intercept)
find the regression equation
Standardized beta coefficient
B1 = b1 * Sx/Sy
in simple regression, B1 = r
Analytic least-squared regression
b = r * (Sy/Sx)
how to find the slope (line of best fit) when the relationship is imperfect
Proportion of variance in y that is unexplained by the model (badness of model)
residual error relative to the total variance of y: sum of (y - yhat)^2 / sum of (y - ybar)^2
Proportion of variance in y that is explained by the model (goodness of model)
1 - (sum of (y - yhat)^2 / sum of (y - ybar)^2) = R²
Breaking up the variation in y
sum of (y - ybar)^2 = sum of (yhat - ybar)^2 + sum of (y - yhat)^2
R-squared
represents the proportion of the variance in y that is accounted for by the model
used to compare/evaluate MODELS
R^2 = 0 means model is not explaining any of the variability in the outcome
R^2 = 1 means model is perfect and accounts for all variability
r and R^2: is there a relationship?
in simple linear model, R^2 = r^2
In multiple regression, use ___ to evaluate the overall model; use ___ to evaluate individual predictors
R^2; beta
Why is R^2 useful?
R^2 is a standard metric for interpreting model fit
used to compare/evaluate models
Regression toward mediocrity
when you have extreme cases, subsequent generations tend to be less extreme
At what sample size do regression coefficients stabilize?
around 250 participants
Mediation
a variable that explains the relationship between predictor variable and outcome
Moderation
a variable that changes the direction or strength of the relation between a predictor and an outcome