reliable really just means..
consistency
3 main reliability measures
test-retest, interrater, and internal
test-retest reliability (4)
reliability measure where the same test is administered on two occasions to determine consistency, "scores at T1 should = T2", scatterplot can display, good = r of .7 or greater
interrater reliability (4)
reliability measure where consistent scores are obtained no matter who measures/observes, scatterplot can display, good = r of .7+ AND percent agreement of 85%+
how can you increase interrater reliability? (3)
through practice, training, and clear instructions
percent agreement vs correlation in interrater reliability
percent agreement- the exact percentage of identical ratings between raters, categorical/qualitative measurements (nominal)
correlation- how correlated the patterns are and how interchangeable different raters are overall, scaled/quantitative measurements (ordinal, interval, continuous)
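The two interrater statistics can be sketched in plain Python (the ratings and variable names here are hypothetical, made up for illustration):

```python
# Two hypothetical raters scoring the same 8 subjects on a 1-5 scale.
rater_a = [5, 4, 4, 2, 3, 5, 1, 2]
rater_b = [5, 4, 3, 2, 3, 5, 1, 3]

# Percent agreement: share of exactly identical ratings (suits nominal data).
agree = sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)

# Pearson r: how well the raters' patterns line up (suits scaled data).
n = len(rater_a)
mean_a, mean_b = sum(rater_a) / n, sum(rater_b) / n
cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(rater_a, rater_b))
var_a = sum((a - mean_a) ** 2 for a in rater_a)
var_b = sum((b - mean_b) ** 2 for b in rater_b)
r = cov / (var_a * var_b) ** 0.5

print(f"percent agreement = {agree:.0%}, r = {r:.2f}")
```

Here the raters' patterns track closely (r above .7) even though they agree exactly only 75% of the time, which is why scaled measures lean on r while categorical measures need percent agreement.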
In interrater reliability, percent agreement is to ___ and ___, as correlation is to ___ and ___
categorical and qualitative, scaled and quantitative
internal reliability/consistency (3)
reliability measure that determines how consistently different items on a test measure the same construct, related to construct validity, good = Cronbach's alpha is .70+
internal reliability/consistency example
Are ALL questions on a self-esteem scale actually related to self-esteem?
What other concept is internal reliability/consistency related to?
Content validity
Cronbach's alpha, 4 dif levels
.9= excellent, .8= good, .7= acceptable, .5 and under= unacceptable
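A minimal sketch of how Cronbach's alpha is computed, using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores); the 3-item, 5-person data below is hypothetical:

```python
# Hypothetical responses: 5 people answering a 3-item scale (1-5 each).
items = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 5],
]

def variance(xs):
    """Sample variance (divides by n - 1)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

k = len(items[0])                                  # number of items
item_vars = [variance([row[i] for row in items]) for i in range(k)]
total_var = variance([sum(row) for row in items])  # variance of total scores

alpha = (k / (k - 1)) * (1 - sum(item_vars) / total_var)
print(f"Cronbach's alpha = {alpha:.2f}")
```

With these made-up numbers alpha lands above .9, i.e. "excellent" on the scale above: the items vary together, so they seem to tap one construct.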
validity
extent to which something actually measures what it's supposed to measure, essentially accuracy
2 subjective ways to assess validity
face validity and content validity
face validity
does it look like what you're trying to measure?
content validity (2Qs)
does the measure contain all the parts your theory says it should contain? does it cover all aspects of the construct?
How does Content Validity differ from internal reliability/consistency?
Content validity is a matter of item RELEVANCE, while Internal reliability is a matter of item CONSISTENCY
3 empirical measures of validity
Criterion, convergent, and divergent
Criterion validity (3)
do ppl's scores correlate w/ other key behaviors/variables we would expect them to correlate w/?, measure should predict behavioral outcome, known groups paradigm
2 types of Criterion-related validity
concurrent and predictive
Concurrent validity (3)
type of criterion-related validity, compares a new test's results with an established "gold-standard" test's results
Predictive Validity
type of Criterion-related validity, measures how well a test predicts a later-measured future outcome (that it should predict well)
Predictive Validity example
SAT scores predicting future college GPA
Criterion validity example
a depression inventory's scores should positively correlate with depression diagnoses
Known Groups Paradigm
method for establishing criterion validity by comparing scores with distinct groups already known to differ on the variable
known-groups paradigm example
new depression inventory is administered to a group of diagnosed ppl and group of non-diagnosed ppl
Convergent Validity
scores on 2 different measures, that measure the same thing, are consistent
Divergent validity (3)
aka discriminant validity, ensures a test intended to measure 1 thing doesn't accidentally measure another unrelated thing, scores shouldn't correlate with an unrelated concept
How can a measure be reliable but not valid?
When it consistently produces the SAME INCORRECT results, bc measuring wrong construct or systematic errors
Reliable but invalid measure example
A scale that consistently reads 150 over and over but the person actually weighs 200
How can a measure be valid but not reliable?
when it accurately hits the target construct ON AVERAGE but scores are inconsistent across trials
valid but unreliable measure example
A scale that reads 200, 195, 205 when the person is really 200
surveys vs polls
surveys typically involve multiple questions, while polls are usually only 1 question (aiming to gain frequency info about support on an issue)
Open-ended questions pros and cons
pros- rich source of info and better for qualitative research
cons- very broad and hard to code+analyze
Forced-choice questions pros and cons
pros- easy to code+analyze
cons- least info (nominal) and no opportunity for elaboration/detail
Likert scale (4)
the type of forced-choice Q we use on our lab survey, can be between 3-10 points, preferably 5-7, use anchors
6 main things to consider when writing well-worded questions
simplicity, leading questions?, double barreled questions?, negations?, floor+ceiling effects?, question order effects?
double barreled questions
Qs that ask about 2+ things but only allow for 1 answer, AVOID
negations (3)
Qs that use negative words/phrasing, reverses a statement's meaning which can be confusing, words like "no, never, doesn't"
floor + ceiling effects
the effects the response range has on accuracy; too few options can skew data, e.g. w/ only 3 options responses may cluster at the max/min even tho ppl feel more in-between
question order effects
the context of the prior questions can influence responses, Ex: domestic violence Q before spanking kids Q results versus opposite order
How can you control for question order effects?
by making different versions of the survey w/dif orders to see if the results differ and proceed accordingly
response sets
tendency for participants to answer Qs in specific consistent patterns, disrupts data reducing survey validity
response set examples (3)
Acquiescence (agreeing with everything), fence-sitting (middle ground), and extremity (always picking extreme answers)
The tendency for response-sets, like acquiescence and fence-sitting, to occur in data does what?
weakens construct validity
Faking good
Giving answers to inflate how "good" they are, aka socially desirable responding
3 main threats to response accuracy
response sets, faking good, and faking bad
3 main problems with behavioral observations
observer bias, participant reaction bias, and experimenter bias
observer bias
tendency of observers to see what they expect to see, confirmation bias influences data recording
Participant reaction bias (3)
aka reactivity, ppl act different when they know they're being observed, 3 main aspects: participant expectancies, participant reactance, and evaluation apprehension
participant expectancies (4)
when participants behave in the way they feel they're expected to, demand characteristics, most common+problematic
demand characteristics
cues in an experiment that tell the participant what behavior is expected, think weapons effect
weapons effect
the tendency for aggression to increase bc of the mere presence of weapons (even pics of them)
participant reactance
participant acts the opposite of how they think the experimenter wants them to react
evaluation apprehension
ppl feel apprehensive about being evaluated/don't want to be judged as bad, think social desirability
experimenter bias
researcher expectations skew the results of the study, often by making biased observations+treating subjects differently
observer bias + experimenter bias (3)
same thing, but experimenter bias is specific to experiments; both involve biased observations, but only experimenters treat subjects differently
4 ways of reducing these biases
double-blind procedure, anonymity, cover story, and unobtrusive measures
double-blind procedure (2)
experimental procedure to reduce bias, where neither the experimenter nor the subject knows which group the subject is in
anonymity
reduces bias by making responses untraceable to spec person or condition
cover story (deception)
A false description of the purpose of a study given to participants, used to maintain psychological realism and protect from reactivity
unobtrusive measures (3)
ways of observing people so they do not know they are being studied, ensuring natural behavior is observed, like naturalistic observation
Is observation ethical?
it depends on spec situation
When do we not need consent to observe ppl?
if observed in public place and anonymity is protected
random sampling vs random assignment
Random sampling- how participants are selected as a sample of their population
Random assignment- participants are randomly assigned to dif groups
random sampling affects ___ validity, while random assignment affects ___ validity
external, internal
3 types of probabilistic samples
simple random, stratified random, and cluster
simple random sampling (def+3)
every member of the population has equal probability of being selected, sampling frame = list of everyone in pop, techniques = systematic + random # table
stratified random sampling
Population divided into subgroups (strata) and random samples taken from each strata
stratified random sampling example
A researcher wants to study GPA of SDSU students, Separates students into majors, Randomly selects from the majors
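The stratified example above can be sketched in plain Python (the roster and the `stratified_sample` helper are made up for illustration):

```python
import random

random.seed(0)  # reproducible draw for this sketch

# Hypothetical roster: (student_id, major) pairs for 100 students.
population = [(i, "psych") for i in range(60)] + \
             [(i, "math") for i in range(60, 100)]

def stratified_sample(pop, key, n_per_stratum):
    """Split the population into strata by `key`, then randomly
    sample n_per_stratum units from each stratum."""
    strata = {}
    for unit in pop:
        strata.setdefault(key(unit), []).append(unit)
    return [u for group in strata.values()
              for u in random.sample(group, n_per_stratum)]

sample = stratified_sample(population, key=lambda s: s[1], n_per_stratum=5)
print(sample)  # 5 psych majors + 5 math majors
```

Sampling within each stratum guarantees every subgroup is represented, which a plain simple random sample does not.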
cluster sampling
when random sampling isn't possible, divide pop into clusters and randomly select clusters
cluster sampling example
A researcher wants to survey math performance of students, She divides the entire population into clusters by school district, then selects entire school districts randomly for her research.
systematic sampling technique
type of simple random, every nth individual on pop list is selected
systematic sampling technique example
every 5th person who walks into the grocery store
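The every-nth rule is a one-liner in Python (hypothetical frame of 50 shoppers):

```python
# Systematic sampling: select every nth unit from an ordered population list.
frame = list(range(1, 51))   # hypothetical: shoppers 1-50 in order of arrival
n = 5
sample = frame[n - 1::n]     # every 5th shopper: 5, 10, 15, ..., 50
print(sample)
```

In practice the starting point is usually picked at random between 1 and n, so that every member of the frame keeps an equal chance of selection.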
multistage sampling technique
good type of cluster sampling, select sub-clusters within clusters
multistage sampling technique example
randomly select five hospitals from county, then randomly select 50 health care workers from each of the 5 hospitals
oversampling
A form of probability sampling, type of stratified random sampling in which the researcher intentionally overrepresents one or more groups.
Oversampling example
10% of sample is prisoners when they're only 2% of population
When working with a probability sample, you ___ ___ how much ___ ___ is in sample data
can estimate, sampling error
sampling error
samples aren't going to match the pop exactly; this = the margin of error (ME), which is used to create confidence intervals
margin of error
accounts for the percentage difference in accuracy that is due to sampling error, confidence intervals
margin of error equation
ME = 1.96√((s^2/n)((N-n)/N)) (2 is often used as a rough stand-in for 1.96)
Confidence interval
shows 95% probability that the average ___ is x +/- ME (sample average plus/minus margin of error)
[x-ME, x+ME] = confidence interval
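Plugging hypothetical numbers into the ME formula above (a sample of n = 100 from a population of N = 1000, sample mean 72, sample variance 64):

```python
# ME = 1.96 * sqrt((s^2 / n) * ((N - n) / N)), the finite-population form above.
# All numbers below are made up for illustration.
N, n = 1000, 100   # population and sample sizes
x, s2 = 72.0, 64.0  # sample mean and sample variance

ME = 1.96 * ((s2 / n) * ((N - n) / N)) ** 0.5
ci = (x - ME, x + ME)
print(f"ME = {ME:.2f}, 95% CI = [{ci[0]:.2f}, {ci[1]:.2f}]")
```

The (N-n)/N factor shrinks the margin when the sample is a large fraction of the population; with n much smaller than N it is close to 1 and is often dropped.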
When are non-probability samples used?
when it's impossible, impractical, or unnecessary to obtain a probability sample
non-probability sample limitations (3)
researchers have no way of knowing the probability of a spec case being sampled, how representative the sample is, or Margin of error
non-probabilistic sampling example
animal research- animals raised for research so not representative, also most research conducted on college campuses
non-probabilistic sample decreases ___ validity (___)
external, generalizability
non-probability sampling types
convenience, quota, purposive (snowball), and self-selection
convenience sampling (3)
non-prob type, Researchers uses whatever Ss are readily available, ex: class survey
quota sampling (3)
non-prob convenience subtype, Researcher takes steps to ensure that certain kinds of Ss are obtained in particular proportions, ex: 50 men + 50 women
purposive sampling
non-prob type, researcher uses judgement to decide which respondents to include in the sample, aka snowball sampling
purposive sampling example
interviewing only expert wine tasters for product feedback
self-selection sampling
sampling only those who volunteer
Random assignment is only used when?
w/ experimental designs
bivariate correlation
associations that involve exactly two variables
strong correlations allow us to...
predict behavior
What makes a study correlational?
having 2 measured variables, none manipulated
How do we quantitatively describe associations?
correlational strength/coefficient= r
levels of correlational strength quantified
small= 0.0-0.3
medium= 0.31-0.7
large= 0.71-1.0
ALL +/-
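A tiny helper encoding these cutoffs (the boundaries are this deck's convention, not a universal standard):

```python
def strength(r):
    """Classify a correlation coefficient using the cutoffs above."""
    a = abs(r)            # sign (+/-) doesn't affect strength
    if a <= 0.30:
        return "small"
    if a <= 0.70:
        return "medium"
    return "large"

print(strength(0.25), strength(-0.5), strength(0.85))
```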
associations between 2 quantitative variables (describing and graphing)
scatterplot and correlation coef (r)
associations when 1 variable is categorical (describing & graphing)
scatterplot or bar graph, t-test (average group difference)