PSY Tests & Measurements Exam 1

127 Terms

New cards

What is a psychological test

a procedure or instrument that measures a construct or behavior to make inferences about human attributes, traits, or characteristics

New cards

why might items on two IQ tests be quite different?

because the test developers defined IQ differently based on their varying theories of intelligence

New cards

what were the first published tests of mental ability?

the Binet-Simon scale

New cards

What is the Flynn Effect?

the trend that the average IQ score increases with each new generation

New cards

what do self report tests require test takers to do?

to report or describe their feelings, beliefs, opinions and/or mental states.

New cards

Key assumptions of Psychological Tests

An individual’s behavior, and therefore test scores, will typically remain stable over time
Psychological tests measure what they say they measure
Test takers will report accurately about themselves
Test items are understood the same way by all test takers

New cards

what is not an assumption that test users make about psychological tests?

test scores are 99.9% accurate with little or no error

New cards

What is race norming?

ranking a minority test taker higher than a White test taker with the same test score

New cards

What type of tests are the Rorschach Inkblot Test and the Thematic Apperception Test?

Projective tests

New cards

Binet’s psychological tests were designed to evaluate

Children

New cards

What do all psychological tests have in common?

Using evidence to reach conclusions

New cards

Three defining characteristics of good tests

Representatively sample the behaviors thought to measure a construct
Behavior samples are obtained under standardized conditions (test must be administered the same way to all people)
Have rules for scoring to ensure consistency

New cards

Maximal Performance

classification of test by behavior performed
test takers perform a well-defined task (e.g., IQ tests, driving tests) and try to do their best

New cards

Behavior Observation

classification of test by behavior performed
involves observing people’s behavior in a particular context, often without them knowing

New cards

Self report

classification of test by behavior performed
test takers describe their own feelings, beliefs, or opinions

New cards

Standardized tests

classification of test by standardization
administered to a large group (a standardization sample) to create norms for score comparison
- specific directions for administration and scoring

New cards

Nonstandardized tests

classification of tests by standardization
more informal, often for a single administration (i.e., they do not have a standardization sample)

New cards

Objective tests

classification of test by scoring method
have predetermined correct answers and require little subjective judgment to score.
structured formats like multiple choice (MC), true/false (T/F), or rating scales

New cards

Projective tests

classification of test by scoring method
test takers respond to ambiguous stimuli (e.g., Rorschach inkblots, Thematic Apperception Test)
scores involve subjective judgment

New cards

Achievement tests

classification of test by dimension measured
measures previous learning in a specific academic area

New cards

Aptitude tests

classification of test by dimension measured
assess potential for learning or ability to perform in a new situation

New cards

Intelligence tests

classification of test by dimension measured
assess the ability to cope with the environment at a broad level

New cards

Personality tests

classification of test by dimension measured
measures human character or disposition

New cards

Interest inventories

classification of test by dimension measured
assess interests to help with career decisions

New cards

Psychological assessment

broad process of gathering information about an individual using multiple methods, including interviews, observations, and psychological tests
a psychological test is one tool in this process

New cards

Measurement

process of assigning numbers to attributes according to specific rules

broader concept than a test

New cards

Survey

focuses on group outcomes and reports results at the question level (such as percentages)

New cards

Psychological test

focuses on individual outcomes and provides an overall derived score or scaled scores

New cards

What are some key historical developments in the creation of psychological tests?

tests created to screen for emotional instability during war
IQ test for children (Binet-Simon Scale)
Army Alpha (for literate recruits) and Army Beta (for non-literate or non-English-speaking recruits), developed during WWI

New cards

What are some major controversies about psychological tests during their development?

discrimination against racial, economic or cultural groups
nature vs. nurture: IQ tests can differ because developers defined IQ differently based on their theories of intelligence
Within-group norming: race norming
Flynn Effect

New cards

Flynn Effect

observation that average IQ scores have been increasing with each new generation

due to changes in how new generations think (“mental artillery”)

New cards

Race Norming

within group norming
practice of administering the same test to every test taker but scoring test differently according to race of the test taker
Outlawed by Civil Rights Act of 1991

New cards

Nominal measurement

numbers are used as labels for categories of data; just naming
statistical analysis to use: Frequency, Mode, Chi-square
e.g., 1 = Democrat, 2 = Republican

New cards

Ordinal

numbers are used to rank order data, but the intervals between ranks are not necessarily equal
statistical analysis to use: median, percentile, rank-order correlation
e.g., class rank, Likert scales, grade equivalents

New cards

Likert scales are seen and treated as what level of measurement, and why?

They can be seen as ordinal or interval, but they are usually treated as interval scales on the assumption that each point on the rating scale represents an equal distance or amount of the construct being measured

New cards

Interval measurement

numbers are rank ordered with equal distances between them, but there is no absolute zero
statistical analysis to use: mean, standard deviation, correlation, t-test, ANOVA

New cards

Ratio measurement

numbers are rank ordered with equal distances between them, and there is a true, meaningful zero point
statistical analysis to use: all parametric analyses

New cards

Frequency distributions

orderly arrangement of scores showing the number or percentage of observations within a range/category

displayed as histogram sometimes

New cards

Normal (Bell) Curve

symmetrical bell shaped theoretical distribution where most scores cluster near the middle (mean)

shape is determined by the mean and SD

New cards

With a smaller standard deviation what would the normal curve then look like?

narrow and tall

New cards

measures of central tendency

describes middle of a distribution
mean, median, mode

New cards

mean

μ (population mean) or x̄ (sample mean)
average; best for symmetrical distributions, but is affected by outliers (unusually high or low scores)

New cards

median

middle score when all scores are ordered
not impacted by outliers and better for skewed distributions

New cards

mode

most frequently occurring score in a distribution
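
As a rough illustration of the three measures of central tendency, the sketch below uses Python's standard `statistics` module on a made-up list of scores (the `scores` values are hypothetical and chosen only to include one outlier).

```python
from statistics import mean, median, mode

# Hypothetical test scores; 100 is an unusually high score (an outlier).
scores = [70, 72, 75, 75, 78, 80, 100]

score_mean = mean(scores)      # average; pulled upward by the outlier
score_median = median(scores)  # middle score once the scores are ordered
score_mode = mode(scores)      # most frequently occurring score

print(f"mean={score_mean:.2f}, median={score_median}, mode={score_mode}")
```

For these scores the median and mode are both 75, while the mean is about 78.57 because of the single score of 100.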

New cards

measures of variability

describes how spread out the scores are
range, variance, standard deviation

New cards

range

highest score in a distribution minus the lowest score

New cards

variance

σ²
indicates whether individual scores tend to be similar to or substantially different from the mean

New cards

standard deviation

σ
most commonly used measure of variability
square root of variance
allows us to understand how scores are distributed around the mean in a normal curve
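
A minimal sketch of the three measures of variability, assuming the scores are treated as a whole population (so the population functions `pvariance` and `pstdev` from Python's `statistics` module apply). The `scores` list is hypothetical.

```python
from statistics import pstdev, pvariance

# Hypothetical scores treated as a full population.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

score_range = max(scores) - min(scores)  # highest score minus lowest score
variance = pvariance(scores)             # population variance (sigma squared)
sd = pstdev(scores)                      # population SD, the square root of the variance

print(score_range, variance, sd)
```

For these scores the range is 7, the variance is 4, and the standard deviation is 2.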

New cards

when the tail of a distribution extends to the right side

it is positively skewed

median is smaller than mean

New cards

when the tail of a distribution extends to the left side

it is negatively skewed

median is higher than mean
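
The mean/median relationship in skewed distributions can be checked with a small sketch. Both lists are hypothetical: `right_tailed` has a long tail toward high scores, and `left_tailed` is its mirror image.

```python
from statistics import mean, median

# Hypothetical scores with a long tail toward high values (positive skew).
right_tailed = [1, 2, 2, 3, 3, 3, 4, 15]
# Mirror image: a long tail toward low values (negative skew).
left_tailed = [16 - x for x in right_tailed]

# Positive skew: the extreme high score drags the mean above the median.
print(mean(right_tailed), median(right_tailed))
# Negative skew: the extreme low score drags the mean below the median.
print(mean(left_tailed), median(left_tailed))
```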

New cards

approx. 68% of scores fall within ± __ SD of the mean

± 1 SD

New cards

approx. 95% of scores fall within ± _ SD of the mean

± 2 SD

New cards

approx. 99.7% of scores fall within ± _ SD of the mean

± 3 SD
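
The 68 / 95 / 99.7 percentages can be reproduced from the normal curve itself. This sketch assumes a standard normal distribution (mean 0, SD 1) and uses `NormalDist` from Python's `statistics` module; `share_within` is a helper name introduced here for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution lying within +/- k SD of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within +/- {k} SD: {share_within(k):.1%}")
```

The proportions come out to roughly 0.683, 0.954, and 0.997, which is where the rounded 68%, 95%, and 99.7% figures come from.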

New cards

measure of relationship

describes the relationship between distributions of test scores
must have at least two sets (distributions) of scores to calculate this
correlation coefficient
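
A minimal sketch of how a correlation coefficient could be computed from two sets of scores for the same people. It assumes the Pearson product-moment formula, which the cards do not name explicitly. The function name `pearson_r` and all score lists are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of the same length (at least 2 scores)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations from each mean.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)

test_a = [1, 2, 3, 4, 5]
same_order = [2, 4, 6, 8, 10]     # rises whenever test_a rises
reverse_order = [10, 8, 6, 4, 2]  # falls whenever test_a rises
no_pattern = [2, 1, 0, 1, 2]      # no linear relationship with test_a

print(pearson_r(test_a, same_order))
print(pearson_r(test_a, reverse_order))
print(pearson_r(test_a, no_pattern))
```

The three calls give a perfect positive correlation, a perfect negative correlation, and a correlation of zero, respectively.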

New cards

correlation coefficient

describes the relationship between two distributions of scores
shows whether the same individuals scored similarly on two different tests
requires scores measured on an interval or ratio scale
ranges from -1.0 to +1.0

New cards

positive correlation coefficient

r > 0
- as one score increases, the other tends to increase

New cards

negative correlation coefficient

r < 0
- as one score increases, the other tends to decrease

New cards

zero correlation coefficient

r = 0
no relationship

New cards

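perfect positive correlation

r = +1.0

New cards

strong positive correlation

r is positive and close to +1.0

New cards

weak positive correlation

r is positive and close to 0

New cards

weak negative correlation

r is negative and close to 0

New cards

strong negative correlation

r is negative and close to -1.0

New cards

perfect negative correlation

r = -1.0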

New cards

what is the formula for standard deviation (for a population)?

find the deviation of each score from the mean and square it: (x-µ)²
sum the squared deviations ∑(x-µ)²
divide by N to get variance (σ²)
take square root of variance
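
The four steps above can be written as a single expression, where N is the number of scores in the population and the index i (added here for clarity) runs over the individual scores:

```latex
\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N},
\qquad
\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}
```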

New cards

what is the formula for standard deviation (for a sample)?
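
The standard sample formula follows the same steps as the population version, but uses the sample mean x̄ in place of µ and divides by n − 1 instead of N. Here n is the number of scores in the sample, and the symbol s is the conventional notation for the sample standard deviation (an assumption, since the card does not show its notation):

```latex
s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1},
\qquad
s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}
```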

New cards

reliability

consistency of test scores
an essential standard for determining how trustworthy the data derived from a psychological test are
a reliable test can be trusted to measure each person and construct in approximately the same way every time it is used
- every measurement still contains some error

New cards

what can impact a person’s measured score?

measurement error, such as mistakes by the test taker or test administrator, response bias, changes in environmental conditions, or flaws and inaccuracies in the measuring instrument

New cards

what makes a test reliable?

it measures each person in approximately the same way each time it is used

it produces consistent results when applied multiple times or in different circumstances

New cards

Classical Test Theory

every observed score (X) is composed of a true score (T) and a random error score (E)
X = T + E
random error is assumed to be normally distributed
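
A small simulation can make X = T + E concrete. This is only a sketch under stated assumptions: one hypothetical person with a fixed true score of 100 is "tested" many times, and the random error is drawn from a normal distribution with mean 0 and SD 5 (both numbers are invented for illustration).

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the run is repeatable

true_score = 100      # T: assumed true score for one hypothetical person
n_testings = 10_000   # imagined repeated administrations

# E: random error, assumed normal with mean 0 and SD 5
errors = [random.gauss(0, 5) for _ in range(n_testings)]
# X = T + E: each observed score is the true score plus that occasion's error
observed = [true_score + e for e in errors]

print(f"mean error: {mean(errors):.3f}")
print(f"mean observed score: {mean(observed):.3f}")
```

Because the random errors average out to about zero over many testings, the mean observed score lands very close to the true score.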

New cards

what are the two types of error score (measurement error)?

random error
systematic error

New cards

random error

variability in test scores due to unpredictable and uncontrollable factors, which lowers the reliability of the test
normally distributed and uncorrelated with the true score
- e.g., environmental conditions, temporary distractions, fluctuations in individuals’ performance

New cards

systematic error

when a single source of error consistently raises or lowers the observed score relative to the true score by the same amount
can be difficult to identify and distorts the score
- A bathroom scale that always reads 3 lbs higher
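
The bathroom-scale example can be sketched as a simulation that contrasts the two kinds of error. The true weight of 150 lbs and the random-error SD of 1 lb are assumptions made up for illustration; only the 3 lb bias comes from the card.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the run is repeatable

true_weight = 150.0  # assumed true weight in lbs (hypothetical)
bias = 3.0           # systematic error: the scale always reads 3 lbs high

# Readings affected by random error only (assumed normal, SD 1 lb).
random_only = [true_weight + random.gauss(0, 1) for _ in range(5000)]
# The same readings on the biased scale: every one is shifted by the same amount.
with_bias = [reading + bias for reading in random_only]

print(f"random error only: mean = {mean(random_only):.2f}")
print(f"with systematic error: mean = {mean(with_bias):.2f}")
```

Random error averages out across many readings, so the first mean sits near the true weight. The systematic error does not average out, so the second mean stays about 3 lbs too high.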

New cards

three main categories of methods to estimate reliability/precision of the test

test-retest method
alternate forms method
internal consistency methods

New cards

test-retest method

test developers give the same test to the same group of test takers on two different occasions and correlate the scores from the first and second administrations to examine the stability of test scores over time
limitations: practice effects

New cards

practice effects

test takers benefit from having taken the test the first time: the practice enables them to solve problems more quickly and correctly the second time

New cards

alternate-forms method

test developers create two forms of the test that are as alike as possible and administer both to the same people to measure the equivalence of the forms
scores are compared using correlation
overcomes practice effects but can introduce order effects

New cards

order effects

changes in scores resulting from the order in which the tests were taken
avoid this by having half of the test takers take form A first and the other half take form B first

New cards

internal consistency methods

A single test administration is used to see how related the items (or group of items) on the test are to one another

How a person answered one item on the test would give you information that would help you correctly predict how they answered another item on the test
coefficient alpha

New cards

coefficient alpha

Cronbach’s alpha = internal consistency coefficient
if items truly measure the same construct, they should naturally be correlated with one another
only appropriate for homogeneous tests (measuring one trait or characteristic)
ranges from 0.00 to 1.00 (perfectly reliable)
higher value = greater consistency
median: .85
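
A minimal sketch of how coefficient alpha could be computed. It assumes the commonly used formula alpha = (k / (k − 1)) × (1 − sum of item variances / variance of total scores), which the card itself does not state. The function name `cronbach_alpha` and the item scores are hypothetical.

```python
from statistics import pvariance

def cronbach_alpha(item_scores):
    """item_scores: one list per item, holding that item's score for every test taker."""
    k = len(item_scores)
    if k < 2:
        raise ValueError("coefficient alpha needs at least two items")
    # Each test taker's total score across all items.
    totals = [sum(person) for person in zip(*item_scores)]
    item_variance_sum = sum(pvariance(item) for item in item_scores)
    return (k / (k - 1)) * (1 - item_variance_sum / pvariance(totals))

# Hypothetical ratings from five test takers on three items meant to tap one trait.
items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 4, 5],
    [1, 3, 3, 5, 4],
]
alpha = cronbach_alpha(items)
print(f"alpha = {alpha:.3f}")
```

For these made-up ratings alpha is about 0.945. The items rise and fall together across test takers, which is what a high internal consistency coefficient reflects.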

New cards

how can the test itself influence reliability?

being poorly designed
ambiguous questions
poorly written questions
requiring a higher reading level than the test takers have

New cards

how can the test administration influence reliability?

when the administrator does not follow the directions
misreads the instructions about the length of time allowed
answers participants’ questions incorrectly
allows the test environment to be hot, cold, or noisy
displays a negative or uncomfortable attitude

New cards

how can the test scoring influence reliability?

when scoring is not conducted accurately
e.g., a WAIS Similarities item asking what the words apple and orange have in common

New cards

how can the test takers influence reliability?

contribute to test error
fatigue
illness
exposure to the test questions or research questions before the test
social desirability

New cards

what are the steps of test development?

Define the testing universe, target audience, and test purpose
develop a test plan
compose test items
write administration instructions
conduct a pilot test
conduct item analysis
revise the test
validate the test
develop norms and identify cut scores
compile test manual

New cards

Testing universe

body of knowledge or behaviors that the test represents

developer prepares working operational definition of the construct the test will measure

New cards

target audience

group of individuals who will take the test

New cards

purpose

what the test will measure and how scores will be used

normative
criterion approach

New cards

normative approach

compares test taker’s performance to other test takers
e.g., an academic achievement test where the highest scorer gets a scholarship

New cards

criterion approach

approach that compares a test taker’s performance to a specific set of criteria or a standard
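
The difference between the two approaches can be sketched in a few lines. The helper names, the norm group, and the cut score of 90 are all hypothetical. `percentile_rank` uses one simple definition (percentage of the norm group scoring below the given score); other definitions exist.

```python
def percentile_rank(score, norm_group):
    """Normative view: percentage of the norm group scoring below this score."""
    below = sum(1 for other in norm_group if other < score)
    return 100 * below / len(norm_group)

def meets_criterion(score, cut_score):
    """Criterion view: compare the score to a fixed standard, ignoring other test takers."""
    return score >= cut_score

# Hypothetical norm group and one test taker's score.
norm_group = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]
score = 85

print(percentile_rank(score, norm_group))    # standing relative to other test takers
print(meets_criterion(score, cut_score=90))  # standing relative to the standard
```

The same score of 85 is above 60% of this norm group, yet it does not meet a cut score of 90, so the two approaches can lead to different interpretations of one result.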

New cards

what does developing a test plan entail?

specifying the construct’s operational definition, the content to be measured, the question format, and how the test will be administered and scored

New cards

what are some scoring models?

cumulative, categorical, and ipsative

New cards

cumulative

assumes that the more a test taker responds in a particular fashion, the more the test taker exhibits the attribute being measured
total number of correct answers becomes raw score
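
A minimal sketch of cumulative scoring for a test with keyed answers. The answer key, the responses, and the function name `cumulative_raw_score` are hypothetical.

```python
def cumulative_raw_score(responses, answer_key):
    """Raw score = number of responses that match the keyed answer."""
    if len(responses) != len(answer_key):
        raise ValueError("responses and answer key must have the same length")
    return sum(1 for given, keyed in zip(responses, answer_key) if given == keyed)

# Hypothetical five-item multiple-choice test.
answer_key = ["b", "d", "a", "c", "a"]
responses = ["b", "d", "c", "c", "a"]

raw_score = cumulative_raw_score(responses, answer_key)
print(raw_score)
```

This test taker matches the key on four of the five items, so the raw score is 4.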

New cards

categorical

used to place test takers in a particular group or class and typically yields nominal data