#separator:tab
#html:true
A type of assessment that yields scores based on responses from test forms. test
"First used the term ""mental tests""" Cattell
Associated with the first modern-day intelligence test (measured higher mental processes)	Binet
Established the first psychological laboratory that used experimental research	Wundt
First use of the term intelligence quotient (IQ); revised Binet Terman
Associated with the Stanford Achievement Test Thorndike
What was the era that first widely used group testing? WWI
Group administration of an intelligence test for the military; required reading literacy	Army Alpha
Used as an intelligence test, but is the language-free version Army Beta
Research on vocational assessments Thorndike
Person involved in occupation selection for large groups of high school students Miner
First widely used interest inventory (Strong Vocational Interest Blank); allowed much more general career counseling than aptitude tests	Strong
First modern personality inventory (WWI); measured susceptibility to mental health problems	Woodworth's Personal Data Sheet
measures whether or not you're ready for something	aptitude
"Defines purpose of test; demographics are considered; what context the test is on
![]()
" Step 1: Determine the goals of your client
Asks the questions: What behaviors, content, or skills is it intended to measure? What theory is the trait based on? What subsets/domains is it based on? Operationalization of test forms.	Step 2: Choose instrument types to reach client goals
Item formats are determined; the test is written and item reviewers make sure it measures what it is intended to measure	Step 3: Access information about possible instruments
Before this step, a pilot test is done to make sure the items are valid, reliable, and fair, among other things. Then this step happens.	Step 4: Examine validity, reliability, cross-cultural fairness, and practicality of the possible instruments
Validation process pilot test
Determines test length, testing time, scoring approaches, and test procedures; administers test materials.	Step 5: Choose an instrument wisely
Tests which can be administered, scored, and interpreted by laypeople Level A
Tests that require a psychology degree or coursework in testing Level B
Tests that require an advanced psychology degree, a license and/or advanced training for that particular test Level C
Knowledge or skill not related to the purpose of the test is required to answer an item correctly. cognitive sources of construct-irrelevant variance
Language or images cause strong emotions that may interfere with the ability to respond to an item correctly (e.g., political opinions, beliefs)	affective sources of construct-irrelevant variance
aspects of tests that interfere with test takers' ability to attend to, see, hear, or sense the items or stimuli (consider test takers with disabilities)	physical sources of construct-irrelevant variance
statistical relationship between two variables correlation coefficient
"Used to visually examine data, especially to discover patterns (such as curvilinear relationships)
![]()
" scatter plot
"an increase in one variable is related to an increase in the other variable
![]()
" positive relationship
"an increase in one variable is related to a decrease in the other variable
![]()
" negative relationship
"two variables that are not related to each other
![]()
" no relationship
"
±0.70 - 1.00
" strong correlation
"
±0.30 \~ 0.69
" moderate strength
"
±.00 \~ 0.29
" no strength
"
Whether
scores from a test is consistent
measure of
individuals’ true scores
" reliability
To measure reliability, we use correlation coefficient
caused by test administrators or the testing environment method error
Error associated with the test takers (subjects) themselves	trait error
Relationship between scores on the same test administered twice with a time interval between the administration test-retest reliability
e.g., subjects may do better at the second testing, or may remember how they answered on a similar test form	practice effects
Correlation between scores on two equivalent test forms (with a time interval between administrations)	alternate-forms reliability
obtaining a reliability coefficient by assessing how items are correlated as a group internal consistency
internal consistency; correlation between scores from even-numbered items and scores from odd-numbered items split-half reliability
whether a test measures what it is supposed to measure (accuracy)	validity
Does the ______ ______ cover a representative sample of behaviors to be measured in its entirety? (Evaluated by content experts)	content validity
Does a test predict the target trait it is intended to measure? criterion validity
Focuses on the prediction of current performance or psychological behavior concurrent validity
Focuses on the prediction of future performance or psychological behavior predictive validity
Does an assessment measure a theoretical construct that it is designed to measure (e.g., intelligence)? construct validity
Are two assessments measuring the same (or similar) construct related?	convergent validity
Are two assessments measuring different constructs unrelated?	discriminant validity
Finds the constructs you want to measure from the test scores	factor analysis
whether an individual's score is not affected by potential bias inherent in a test, test procedure and interpretation fairness
Fairness did not get much attention until the 1960s (civil rights movement)
Equal testing conditions + proctors	fairness in testing process
Idea that all items should behave equally across all examinees fairness as lack of measurement bias
accessibility in testing; test takers can show their standing on the target construct without being advantaged or disadvantaged by their individual characteristics or opportunity to learn	fairness in access to the construct as measured
Statistical approach to examine test fairness by identifying items that perform differentially across subgroups of test takers while controlling for test takers' ability differential item functioning
examining response processes through probing questions cognitive interview
tests that measure what one has learned; e.g., high school exit exams	achievement testing
measure what one is capable of learning; e.g., intelligence tests	aptitude testing
used to assess habits, temperament, likes and dislikes, character, and similar behaviors personality assessment
tests that assess problem areas of learning; often used to assess learning disabilities diagnostic tests
tests that measure a broad range of cognitive ability; e.g., SATs	cognitive ability tests
tests that measure a broad range of cognitive functioning: general intelligence, intellectual disabilities, giftedness, changes in overall cognitive functioning	intellectual and cognitive functioning
tests that measure one aspect of ability; likelihood of success in a vocation special aptitude tests
tests that measure many aspects of ability; likelihood of success in multiple vocations multiple aptitude tests
tests that measure likes and dislikes as well as one's personality orientation toward the world of work; career counseling interest inventories
a tool whereby an individual identifies whether he or she has, or does not have, specific attributes or characteristics classification methods
tests that measure one's readiness for moving ahead in school; e.g., used to assess readiness to enter first grade	readiness tests
How do you calculate IQ (use / as a division sign)? mental age/chronological age x 100
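A quick worked instance of the ratio IQ formula on the card above; this sketch is not from the deck, and the ages are invented.

```python
# Ratio IQ = mental age / chronological age x 100
mental_age = 10          # hypothetical mental age in years
chronological_age = 8    # hypothetical chronological age in years
iq = mental_age / chronological_age * 100
print(iq)  # 125.0 -- performing above age level gives an IQ over 100
```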
What formula is used for split-half reliability due to the test being cut in half? Spearman-Brown formula
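As a sketch of why the Spearman-Brown formula is needed (not part of the original cards; the correlation value is invented): the odd-even correlation describes a test only half as long as the real one, so it is stepped up to estimate the reliability of the full-length test.

```python
def spearman_brown(r_half):
    # Step up a split-half correlation to estimate full-test reliability
    return 2 * r_half / (1 + r_half)

r_half = 0.70  # hypothetical correlation between odd-item and even-item scores
print(round(spearman_brown(r_half), 2))  # 0.82 -- estimated reliability of the whole test
```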
"
visual for a categorical, discrete variable" "bar graph
![]()
"
visual for continuous variables "histogram
![]()
"
used to see the distributional shape of data "frequency polygon
![]()
"
"(Type of curve)
![]()
" positively skewed
"(Type of curve)
![]()
" negatively skewed
"Left to right, how are measures of central tendency distributed in positively skewed distributions?
![]()
" Mode < Median < Mean
"Left to right, how are measures of central tendency distributed in negatively skewed distributions?
![]()
" Mode > Median > Mean
average of the squared distances from the mean	variance
the difference between an individual score and the mean deviation score
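A brief sketch tying the variance and deviation-score cards together; the raw scores are made up for illustration.

```python
scores = [4, 6, 8, 10, 12]                  # hypothetical raw scores
mean = sum(scores) / len(scores)            # M = 8.0
deviations = [x - mean for x in scores]     # deviation score = X (raw score) - M (mean score)
variance = sum(d ** 2 for d in deviations) / len(scores)  # average squared distance from the mean
print(deviations)  # [-4.0, -2.0, 0.0, 2.0, 4.0]
print(variance)    # 8.0
```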
scores that are compared to a set of test scores called the norm group	norm-referenced scores
scores are compared to a predetermined standard; i.e. mastering a certain level of knowledge, used for diagnoses criterion-referenced scores
proportion of people falling at or below a score in a standard normal distribution	percentile
µ = 50, σ = 10; used for personality tests T-scores
µ = 100, σ = 15; used for tests of intelligence deviation IQ
µ = 5, σ = 2, round to nearest whole number; used for achievement testing Stanines
µ = 5.5, σ = 2, round to nearest whole number; used for personality inventories and questionnaires Sten scores
µ = 50, σ = 21.06; used for educational tests NCE scores
µ = 500, σ = 100 SAT scores
µ = 21, σ = 5 ACT scores
µ and σ are arbitrarily set by the publisher	Publisher type scores
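All of the scales above re-express the same z-score with a different mean (µ) and standard deviation (σ). A minimal sketch using the conversion (z-score x σ) + µ that appears later in the deck; the z-score here is invented.

```python
def to_standard(z, mean, sd):
    # Convert a z-score to a scale with the given mean and standard deviation
    return mean + z * sd

z = 1.0  # hypothetical score one standard deviation above the mean
print(to_standard(z, 50, 10))       # T-score        -> 60.0
print(to_standard(z, 100, 15))      # deviation IQ   -> 115.0
print(round(to_standard(z, 5, 2)))  # stanine        -> 7
print(to_standard(z, 50, 21.06))    # NCE score      -> 71.06
print(to_standard(z, 500, 100))     # SAT-type score -> 600.0
```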
σ of test scores x √(1 - reliability of a test)	SEM
Tells us how much error there is in the test and ultimately how much any individual's score might fluctuate due to this error standard error of measurement
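A sketch of the two SEM cards (the standard deviation and reliability values are invented): SEM = σ x √(1 - reliability), and the observed score ±1 SEM gives a rough 68% band for where the true score falls.

```python
import math

sd = 15             # hypothetical standard deviation of the test scores
reliability = 0.91  # hypothetical reliability coefficient
sem = sd * math.sqrt(1 - reliability)   # standard error of measurement
observed = 110                          # hypothetical observed score
print(round(sem, 2))                    # 4.5
print(observed - sem, observed + sem)   # 105.5 114.5 -- approximate 68% band for the true score
```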
problems with the _____ of questions comprehension
failure in retrieving the information needed to answer (related to background characteristics)	information retrieval
low motivation, or the intention of faking or impression enhancement	decision process
mismatch in the choice of response option; difference in interpretation of option meanings response process
"
![]()
" interquartile range formula
Deviation score X (raw score) - M (mean score)
Average of the squared deviation scores	Variance
σ x √(1 - reliability of a test)	standard error of measurement
Converting into a standard score (z-score x σ) + µ
"involved in ""faking"", telling the truth" decision process
"mapping the response; what is the time of ""always"" or ""almost never""?" response process