Psych Stats I [exam 2]


106 Terms

1
New cards

self-plagiarism

to take previously published work and publish it in a "new" paper

2
New cards

outright fraud

making up data - ex: Stapel

  • photoshopping Western blots

3
New cards

replication crisis

  • peer reviewers just assume data is real and stats were run ethically

  • a large percentage of published research does not replicate

4
New cards

reasons to publish

  • pressure to publish for advancement

  • grant money competition

  • tenure is based on quantity

  • new findings > strong findings

  • peer review isn't suited to catching questionable research practices

5
New cards

false positives

  • running studies with increasing N until we get a sig. result

  • stopping data collection as soon as we get a significant result

  • deciding to exclude data after analysis
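
A rough simulation can show why the first two practices above inflate false positives. This is only a sketch under stated assumptions: the true effect is zero, sigma is known to be 1, and a z-test is run either once at N = 100 or after every 10 participants. The function names and all settings are made up for illustration.

```python
import random
from statistics import NormalDist

def z_test_p(sample, sigma=1.0):
    """Two-tailed p-value for H0: population mean = 0, with known sigma."""
    sample_mean = sum(sample) / len(sample)
    z = sample_mean / (sigma / len(sample) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))

def false_positive_rate(peek, n_max=100, step=10, sims=2000, alpha=0.05, seed=1):
    """Share of simulated null-effect studies that end up 'significant'."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(sims):
        data = [rng.gauss(0, 1) for _ in range(n_max)]  # true effect is zero
        if peek:
            # test after every `step` participants, stop at the first p < alpha
            looks = range(step, n_max + 1, step)
        else:
            looks = [n_max]  # one planned test at the final N
        if any(z_test_p(data[:n]) < alpha for n in looks):
            hits += 1
    return hits / sims

fixed_rate = false_positive_rate(peek=False)
peeking_rate = false_positive_rate(peek=True)
print(fixed_rate, peeking_rate)
```

With a single planned test, the rate should land near alpha. With repeated peeking it should come out noticeably higher, even though nothing real is there.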

6
New cards

what can be done?

  • make data available to public

  • change incentives so it's quality>quantity

  • make materials known so it's easier to replicate

  • preregistration of hypothesis and method section

7
New cards

sampling error

difference between the mean of the population and the mean of the sample

  • SD of the sampling distribution is SE

  • why we need confidence intervals

8
New cards

sampling distribution (distribution of means)

  • the distribution of the means of all possible samples of the same size drawn from a population

  • normal curve

  • SE is affected by variance in population and sample size

9
New cards

confidence interval

  • a range of values along with the percentage confidence that the pop mean lies within that range

  • less variability or lower confidence -> narrower CI

  • higher confidence -> wider CI and larger margin of error (SE itself does not depend on the confidence level)

  • common values are 95% and 99%

10
New cards

alpha

the probability that the confidence interval does NOT contain the pop. mean

  • think of it as the tails of the normal curve

11
New cards

how to do CI

  • level of confidence expressed as 1 - alpha

  • ex: if CI is 95%, then alpha = 0.05

  • divvy up the alpha into two, half for above and half for below

  • find the z score that cuts off each tail (z = -1.96 and +1.96 correspond to cumulative probabilities 0.025 and 0.975)

12
New cards

equation of CI

  • x bar is the point estimate (sample mean)

  • z alpha/2 is the critical value (1.96 for 95% or 2.58 for 99%)

  • SD/sqrtN is the SE

  • what comes after the +/- is the margin of error
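
The pieces listed above fit together as x bar +/- z alpha/2 * (SD / sqrtN). Here is a minimal sketch of that calculation, assuming a made-up set of scores and the sample SD standing in for the population SD (strictly, z calls for a known sigma). The function name is hypothetical.

```python
from statistics import NormalDist, mean, stdev

def z_confidence_interval(scores, confidence=0.95):
    """Return (lower, upper) for x_bar +/- z_(alpha/2) * SD / sqrt(N)."""
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for 95%, 2.58 for 99%
    x_bar = mean(scores)                          # point estimate
    se = stdev(scores) / len(scores) ** 0.5       # SD / sqrt(N)
    margin = z_crit * se                          # margin of error
    return x_bar - margin, x_bar + margin

scores = [9, 10, 7, 6, 8, 11, 5, 8]  # hypothetical data, mean = 8
low, high = z_confidence_interval(scores)
print(low, high)
```

Asking for 99% instead of 95% changes only the critical value, so the interval gets wider while the SE stays the same.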

13
New cards

degrees of freedom

  • number of scores that can vary in the calculation of a statistic

  • ex: the mean is based on the scores in the sample, so once the mean is known, one of the scores is not free to vary. For 8 = (9+10+7+6)/4, any one of these numbers could be missing and we could still figure it out from the mean and the other three
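
The example above can be checked in a couple of lines. This is just a sketch, and the helper name is made up.

```python
def missing_score(known_scores, sample_mean, n):
    """Recover the one score that is not free to vary once the mean is fixed."""
    return sample_mean * n - sum(known_scores)

print(missing_score([9, 10, 7], sample_mean=8, n=4))  # 6
```

Three of the four scores can be anything, but the last one is forced by the mean, which is why df = N - 1 here.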

14
New cards

t distribution

  • as N increases, t becomes more like z (nearly identical once N is close to 1000)

  • z distribution only works when we know sigma (pop SD)

  • since more of the distribution is in the tails, the critical value is larger than z, causing us to need to widen the CI

  • still use N for SE (not N-1), use N-1 for sample SD

  • ex: for alpha = 0.05 (two-tailed), t = 2.064 when df = 24 but t = 2.131 when df = 15

15
New cards

when to use t-distribution

  • pop SD is unknown

  • small N

  • critical t = the t value at alpha/2 for the degrees of freedom (df = N - 1)
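
A small sketch of a t-based CI, assuming a hypothetical sample with N = 25 (df = 24), mean 100, and sample SD 15. The critical value 2.064 is the table value quoted in these notes for df = 24 at alpha = 0.05; it is passed in by hand rather than computed.

```python
def t_confidence_interval(x_bar, sample_sd, n, t_crit):
    """x_bar +/- t_crit * SD / sqrt(N); t_crit comes from a t table for df = N - 1."""
    se = sample_sd / n ** 0.5  # still N (not N - 1) in the SE
    margin = t_crit * se
    return x_bar - margin, x_bar + margin

# hypothetical sample: N = 25, mean = 100, sample SD = 15
t_low, t_high = t_confidence_interval(100, 15, 25, t_crit=2.064)
z_low, z_high = t_confidence_interval(100, 15, 25, t_crit=1.96)  # z for comparison
print(t_low, t_high)
print(z_low, z_high)
```

The t interval comes out a little wider than the z interval because the critical value is larger, not because the SE changed.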

16
New cards

construct validity

the extent to which a measuring instrument accurately measures what it is supposed to measure

  • includes convergent validity and discriminant validity

17
New cards

convergent validity

determine extent to which our measure correlates with other measures of the same construct

  • how well measures of observables "go together"

  • "does the test correlate with other tests that measure the same or related constructs?"

18
New cards

discriminant validity

determine the extent to which our measure does NOT correlate with measures of different constructs

  • "does the test NOT correlate, or correlate negatively, with tests that measure different/opposite constructs?"

19
New cards

concurrent validity

test correlates with criterion measured at the same time

20
New cards

criterion validity

ability of measuring instrument to estimate or PREDICT some criterion behavior that is external to the measuring instrument itself

  • includes concurrent validity, predictive validity, and postdictive validity

21
New cards

predictive validity

test taken now predicts future behavior

  • ex: does the driving test predict later driving ability? (also the marshmallow test)

22
New cards

postdictive validity

test taken now captures behavior that has already occurred

  • ex: give older drivers the driving test and look BACK at their driving record

23
New cards

known groups paradigm

a method for establishing criterion validity, in which a researcher tests two or more groups, who are known to differ on the variable of interest, to ensure that they score differently on a measure of that variable

24
New cards

content validity

extent to which a measuring instrument covers a representative sample of the behaviors to be measured

  • includes face validity and assessment by subject matter experts (SME)

25
New cards

face validity

does it "feel" like we answered the question

  • quite subjective

26
New cards

assessment by SMEs in that content area

does the test contain the items from the desired "content domain"

  • esp important when something has low face validity

27
New cards

external validity

extent to which the results of an experiment can be generalized and extend beyond the lab

  • mundane realism and experimental realism

28
New cards

threats to external validity

  • the college sophomore problem (participants are not generalizable to the population)

  • WEIRD populations (Western, educated, industrialized, rich, and democratic)

29
New cards

mundane realism

the degree of similarity between the situation created in the lab and the one that people may encounter in the "real world"

  • the guy who made his lab look like a bar had a lot of mundane realism

30
New cards

experimental realism

the degree of similarity between the psychological experience in the lab and that of the real world

  • more important than mundane realism

31
New cards

internal validity

extent to which the results of a study can be attributed to the variables that the research purports to be the independent variables, rather than some other confounding variables

  • "given the relationship between the IV and DV, is it likely that the former caused the latter?"

  • ex: coffee drinkers may smoke more cigarettes than non-coffee drinkers, so smoking is a confounding variable in the study of the association between coffee drinking and heart disease

  • a confound is a third variable, not a mediator: with shark attacks and ice cream sales, hot weather drives both

  • MRS SMITH pose threats to internal validity

32
New cards

confounding variables

an extraneous variable that systematically changes along with the variable hypothesized to be a causal variable

  • clever Hans the horse and the trainer that said he could do math

33
New cards

threats to internal validity

  • Mortality and attrition

  • Regression to the mean

  • Selection of subjects

  • Selection by maturation interaction

  • Maturation

  • Instrumentation

  • Testing

  • History

34
New cards

mortality (attrition)

loss of participants may bias results when the participants who remain in the study are no longer representative

35
New cards

maturation

changes within participants (not induced by the research) are responsible for results (usually physiological)

  • aging processes or physiological states like hunger or fatigue

36
New cards

history

some outside event that is not part of the research is responsible for the results

  • could be major events in society or minor events occurring within the experimental situation that account for the results more than the treatment of interest

37
New cards

selection of subjects

any bias in selecting and assigning participants to groups that results in systematic differences between the participants in each group

  • differences exist BEFORE one group is exposed to the experimental treatment

38
New cards

instrumentation

changes in the measurement procedures may result in differences between the comparison groups that are confused with the treatment effects

  • may get lazier over time

  • switch to a "better way of collecting data"

39
New cards

testing

when participants are repeatedly tested, changes in test scores may be more due to practice or knowledge about the test procedure gained from earlier experiences rather than any treatment effects

  • similar to maturation except the change is caused by the testing procedure itself

40
New cards

selection by maturation interaction

the treatment and no-treatment groups, although similar at one point, would have grown apart (developed differently) even if no treatment had been administered

  • pretest scores may be the same but the groups are not matched as well on other relevant variables that cause them to become different after a period of time

41
New cards

regression to the mean

a threat to internal validity in which extreme scores, upon retesting, tend to be less extreme, moving toward the mean

42
New cards

Covariance

a measure of the extent to which two variables vary together; measures the pattern of deviation scores for two variables (x,y)

43
New cards

Equation of covariance

Cov(x,y) = sum of [(x - xbar)(y - ybar)] / (N - 1)
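
A minimal sketch of this formula in code, using made-up data (hours studied and quiz score). The second call rescales x to show that covariance depends on the units of measurement.

```python
from statistics import mean

def covariance(x, y):
    """Sum of cross-products of deviation scores, divided by N - 1."""
    x_bar, y_bar = mean(x), mean(y)
    cross_products = [(xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)]
    return sum(cross_products) / (len(x) - 1)

hours = [1, 2, 3, 4, 5]  # hypothetical data
score = [2, 4, 5, 4, 5]
print(covariance(hours, score))                    # 1.5
print(covariance([h * 10 for h in hours], score))  # 15.0
```

The relationship is the same in both calls, but the number is ten times larger in the second one, which is why covariance on its own is hard to interpret.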

44
New cards

Limitations of covariance

hard to interpret :(

  • only tells us direction (+, -, or 0), not magnitude

  • as the SDs of x and y get larger, so do the cross-products, and thus the covariance

  • no standardization

45
New cards

Pearson correlation coefficient

  • standardized covariance

  • tells us direction and magnitude of association

46
New cards

Equation of Pearson

r = sum of (Zx * Zy) / (N - 1)
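
A sketch of the z-score version of Pearson's r, again with made-up data. Rescaling x leaves r unchanged because the z-scores already remove the units.

```python
from statistics import mean, stdev

def pearson_r(x, y):
    """Sum of z-score cross-products divided by N - 1."""
    x_bar, y_bar = mean(x), mean(y)
    s_x, s_y = stdev(x), stdev(y)  # sample SDs (N - 1 in the denominator)
    z_x = [(xi - x_bar) / s_x for xi in x]
    z_y = [(yi - y_bar) / s_y for yi in y]
    return sum(a * b for a, b in zip(z_x, z_y)) / (len(x) - 1)

hours = [1, 2, 3, 4, 5]  # hypothetical data
score = [2, 4, 5, 4, 5]
r = pearson_r(hours, score)
r_rescaled = pearson_r([h * 10 for h in hours], score)
print(r, r_rescaled)
```

Both values should be about 0.77, a large positive correlation by the usual benchmarks.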

47
New cards

When do you use Pearson?

  • ratio

  • interval

  • nominal, dichotomous

  • sampling distributions of x and y are approximately normal (N>30?)

  • relationship between x and y is linear

48
New cards

Correlation matrix

a table/matrix showing the pairwise correlations between ALL variables (for simple linear models only)

49
New cards

Interpreting r - what's considered large, moderate, and small r?

  • 0.10 - small

  • 0.30 - moderate

  • 0.50 - large

50
New cards

method error

  • ME = random error + systematic error

  • observed score = true score + ME

51
New cards

random error

unpredictable error that varies from one measurement to the next, e.g., when the selected sample is an imperfect representation of the overall population; could be due to how your day is going or anything

  • can cancel each other out

52
New cards

systematic error

Error that shifts all measurements in the same direction. Decreases accuracy. Can result in bias

  • ex: fMRI machine increasing anxiety at first

  • more problematic but easier to fix

53
New cards

reliability

how consistent are your measurements?

  • answers the question "given that nothing else changes, will you get the same results if you repeat the measurement?"

  • reliability = true score / (true score + ME)

54
New cards

5 ways to test reliability

  1. test-retest reliability

  2. alternate forms reliability

  3. split-half reliability

  4. inter-rater reliability

  5. internal consistency

55
New cards

test-retest rel

  • test at time 1 then at time 2

  • look at correlation coefficients, want r > 0.80

  • pros: easy to do

  • cons: participants may remember their answers from time 1 (practice effects), chance of mortality (attrition)

56
New cards

alternate-forms rel

  • assesses degree of relationship between score on 2 equivalent tests -> get correlation coefficient

  • pros: can't "get good" at taking test, no practice effect

  • cons: may not get at the exact same concepts, does a lower r mean less rel or are there differences in testing?, need to measure twice

57
New cards

split-half rel

splitting the items on a test

  • do the scores on one half correlate with scores on the other half?

  • pros: within a single instrument, no mortality

  • cons: how to divide? are the questions equal?

58
New cards

inter-rater rel

  • reliability coefficient that assesses the agreement made by 2 or more raters or judges

  • pros: no practice effects

  • cons: judges may have bias; cannot use Pearson (it standardizes scores, so it does not detect systematic bias between raters), so use ICC instead

59
New cards

intraclass correlation (ICC)

ICC = level of agreement / (level of agreement + sys error)

  • needs to be sensitive to order of magnitude, not standardized

60
New cards

internal consistency (most used)

  • each item in a measurement is considered a unique test

  • responses to various items are compared

  • "do people who score high on one item score high on all items measuring the same construct?"

  • uses cronbach's alpha

61
New cards

cronbach's alpha

a correlation-based statistic that measures a scale's internal reliability

  • the larger the alpha, the more reliable

  • 0.9 and above is high

  • 0.8-0.89 is good

  • 0.7-0.79 is acceptable

  • 0.65-0.69 is marginal

  • if alpha is low, maybe cut out some items to increase reliability
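
These notes do not give the formula for Cronbach's alpha. The sketch below assumes the standard textbook form, alpha = k/(k - 1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The data and function name are made up.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one list of scores per item, each with one entry per respondent."""
    k = len(items)
    totals = [sum(responses) for responses in zip(*items)]  # total score per respondent
    sum_item_var = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - sum_item_var / variance(totals))

# hypothetical 3-item scale answered by 5 respondents
item1 = [1, 2, 3, 4, 5]
item2 = [2, 2, 3, 5, 5]
item3 = [1, 3, 3, 4, 4]
alpha = cronbach_alpha([item1, item2, item3])
print(alpha)
```

For these made-up responses alpha comes out at roughly 0.95, which would count as high on the scale above, because people who score high on one item tend to score high on the others.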

62
New cards

scatterplots

best way to visualize associations between two quantitative variables (each point represents individual score)

63
New cards

what can be observed in a scatterplot?

  • direction of association (+ or - slope)

  • magnitude of association (closely packed together or not?)

  • linearity (linear or curvilinear?)

  • outliers

64
New cards

Problems to look out for in a scatterplot

  • range restriction

  • heterogeneous sub-samples

  • interpretation issues

  • variables that change over time

65
New cards

Heterogeneous sub-samples (problem to look out for in a scatterplot)

situations where subgroups within a sample show different associations; the grouping variable is a "moderator"

  • ex: dosage of drug vs. recovery (stage of disease may be a moderator)

66
New cards

Interpretation issues (problem to look out for in a scatterplot)

correlation alone does not establish causation

  • ex: number of fire hydrants correlates with number of dogs (third variable problem & reverse causality problem)

67
New cards

Variables that change over time (problem to look out for in a scatterplot)

strong correlation but not causal! change over time (year, time) will correlate.

  • ex: the reported link between a preservative in vaccines and autism in children (two trends over the same years, not a causal effect)

68
New cards

Range restriction (problem to look out for in a scatterplot)

cases where the range over which x and y vary is artificially limited

  • ex: when studying income, people on the low and high ends may not volunteer so we end up with the people in the middle

69
New cards

Pearson correlation coefficient - APA results section

  • direction of association (negative or positive correlation; what does that mean?)

  • magnitude of association (small, moderate, or large)

  • p-value

70
New cards

When do you use Spearman correlation coefficient?

  • sample size so small (<30) that normality cannot be assumed

  • monotonic curvilinear

  • ordinal (rank data)

71
New cards

Equation of Spearman correlation coefficient

r_s = 1 - (6 * sum of D^2) / (n(n^2 - 1)), where D is the difference between the two ranks for each case
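
A sketch of the Spearman calculation. The result is the coefficient itself (often written r_s), D is the difference between a case's rank on x and its rank on y, and the simple ranking helper assumes there are no tied scores. The data and function names are made up.

```python
def ranks(values):
    """Rank from 1 (smallest) to n; this simple version assumes no ties."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def spearman_rs(x, y):
    """1 - 6 * sum(D^2) / (n * (n^2 - 1)), with D = rank difference per case."""
    n = len(x)
    d_squared = [(rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y))]
    return 1 - (6 * sum(d_squared)) / (n * (n ** 2 - 1))

x = [1, 2, 3, 4, 5]
curved = [1, 4, 9, 16, 25]  # monotonic but not linear
shuffled = [2, 1, 4, 3, 5]  # two neighbouring pairs swapped
print(spearman_rs(x, curved))
print(spearman_rs(x, shuffled))
```

The curved data give a perfect coefficient of 1 because the ranks agree exactly, which is the reason Spearman suits monotonic curvilinear relationships. The shuffled data come out at about 0.8.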

72
New cards

Monotonic curvilinear

the relationship changes in only one direction (always increasing or always decreasing), but not at a constant rate

73
New cards

When would you have ordinal data?

  • data are heavily skewed

  • data not measured on interval scale, but ordered

  • rank is of primary interest

74
New cards

Simple regression

modeling the influence of one variable on an outcome

75
New cards

Multiple regression

modeling the influence of multiple variables on an outcome

  • forms of multiple regression to test specific hypotheses: mediation and moderation

76
New cards

How do you know if the form of the relationship is linear (positive, negative), or curvilinear?

  • theory, past research

  • scatterplot using pilot data

77
New cards

Parameter estimation

process of determining the slope and y-intercept

78
New cards

yhat

predicted value of y for the equation

79
New cards

y - yhat

error in prediction (the residual) for one observation; squaring and summing these gives the total amount of error in prediction

80
New cards

Sum of squared error (SSE)

SSE = sum of (y - yhat)^2

81
New cards

Error variance

average of squared errors: sum of (y - yhat)^2 / N
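
A sketch of SSE and error variance for two candidate lines fit to the same made-up data. Comparing candidate lines by their error is the "brute force" idea; both lines and all names here are illustrative only.

```python
def sse(y, y_hat):
    """Sum of squared errors: sum of (y - yhat)^2."""
    return sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))

def error_variance(y, y_hat):
    """Average squared error: SSE / N."""
    return sse(y, y_hat) / len(y)

x = [1, 2, 3, 4, 5]  # hypothetical data
y = [2, 4, 5, 4, 5]

line_a = [2.2 + 0.6 * xi for xi in x]  # candidate: yhat = 2.2 + 0.6x
line_b = [1.0 + 1.0 * xi for xi in x]  # candidate: yhat = 1 + 1x

print(sse(y, line_a), error_variance(y, line_a))
print(sse(y, line_b), error_variance(y, line_b))
```

Line A has the smaller SSE (about 2.4 versus 4), so it is the better of these two guesses.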

82
New cards

Ways to estimate parameters of linear equations

  • Brute Force (play around with slopes)

  • Analytic methods

83
New cards

3 objectives of regression

  1. to describe the linear relationship between two continuous variables, x and y

  2. to predict new values of y from new values of x

  3. to evaluate how good our predictions are (-> use error variance!)

84
New cards

pitfalls for using regression in prediction

  1. relationship between x and y may not be linear at all

  2. outliers

  3. heteroskedasticity

  4. multicollinearity

  5. moderation/subgroups

  6. extrapolation

85
New cards

heteroskedasticity

standard deviations of a predicted variable, monitored over different values of an independent variable or as related to prior time periods, are non-constant

86
New cards

multicollinearity

Multicollinearity is the occurrence of high intercorrelations among two or more independent variables in a multiple regression model

87
New cards

Calculating regression line

  1. calculate mean and SD for x and y

  2. solve for r (Pearson Correlation Coefficient)

  3. solve for b (slope)

  4. solve for a (y-intercept)

  5. find the regression equation
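
The five steps above can be sketched as follows, using made-up data. The slope uses b = r * Sy/Sx; the intercept uses a = ybar - b * xbar, which is the standard formula but is an addition here, since these notes do not spell it out.

```python
from statistics import mean, stdev

def fit_line(x, y):
    """Return (a, b, r) for the regression equation yhat = a + b * x."""
    # 1. mean and SD for x and y
    x_bar, y_bar = mean(x), mean(y)
    s_x, s_y = stdev(x), stdev(y)
    # 2. Pearson r from z-score cross-products
    n = len(x)
    r = sum(((xi - x_bar) / s_x) * ((yi - y_bar) / s_y)
            for xi, yi in zip(x, y)) / (n - 1)
    # 3. slope
    b = r * s_y / s_x
    # 4. y-intercept (assumed standard formula: a = ybar - b * xbar)
    a = y_bar - b * x_bar
    # 5. the regression equation is yhat = a + b * x
    return a, b, r

x = [1, 2, 3, 4, 5]  # hypothetical data
y = [2, 4, 5, 4, 5]
a, b, r = fit_line(x, y)
print(f"yhat = {a:.2f} + {b:.2f}x, r = {r:.3f}")

# standardized beta: B1 = b1 * Sx/Sy, which equals r in simple regression
beta = b * stdev(x) / stdev(y)
```

For these data the line works out to roughly yhat = 2.2 + 0.6x.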

88
New cards

Standardized beta coefficient

B1 = b1 * Sx/Sy

  • in simple regression, B1 = r

89
New cards

Analytic least-squares regression

b = r * Sy/Sx

  • how to find the slope (line of best fit) when we have an imperfect relationship

90
New cards

Proportion of variance in y that is unexplained by the model (badness of model)

Residual error relative to the total variance of y: sum of (y - yhat)^2 / sum of (y - ybar)^2

91
New cards

Proportion of variance in y that is explained by the model (goodness of model)

1 - (sum of (y - yhat)^2 / sum of (y - ybar)^2) = R²

92
New cards

Breaking up the variation in y

sum of (y - ybar)^2 = sum of (yhat - ybar)^2 + sum of (y - yhat)^2
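
A quick numeric check of this decomposition, as a sketch on made-up data. The slope is computed as sum of cross-products over sum of squared x deviations, which is algebraically the same as r * Sy/Sx. Variable names are illustrative.

```python
from statistics import mean

x = [1, 2, 3, 4, 5]  # hypothetical data
y = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(x), mean(y)
b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
     / sum((xi - x_bar) ** 2 for xi in x))  # least-squares slope
a = y_bar - b * x_bar                       # intercept
y_hat = [a + b * xi for xi in x]

ss_total = sum((yi - y_bar) ** 2 for yi in y)             # total variation in y
ss_model = sum((yh - y_bar) ** 2 for yh in y_hat)         # explained by the model
ss_error = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # unexplained (SSE)

r_squared = 1 - ss_error / ss_total
print(ss_total, ss_model, ss_error, r_squared)
```

Here the total variation (6) splits into about 3.6 explained and 2.4 unexplained, so the model accounts for roughly 60% of the variance in y.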

93
New cards

R-squared

  • represents the proportion of the variance in y that is accounted for by the model

  • used to compare/evaluate MODELS

  • R^2 = 0 means model is not explaining any of the variability in the outcome

  • R^2 = 1 means model is perfect and accounts for all variability

94
New cards

r and R^2: is there a relationship?

in simple linear model, R^2 = r^2

95
New cards

In multiple regression, use ___ to evaluate the overall model; use ___ to evaluate individual predictors

R^2; beta

96
New cards

Why is R^2 useful?

  • R^2 is a standard metric for interpreting model fit

  • used to compare/evaluate models

97
New cards

Regression toward mediocrity

when you have extreme cases, subsequent generations tend to be less extreme

98
New cards

At what sample size do regression coefficients stabilize?

250 samples

99
New cards

Mediation

a variable that explains the relationship between a predictor variable and an outcome

100
New cards

Moderation

a variable that changes the direction or strength of the relation between a predictor and an outcome