PSY Tests & Measurements Exam 1


127 Terms

1
New cards

What is a psychological test?

a procedure or instrument that measures a construct or behavior in order to make inferences about human attributes, traits, or characteristics

2
New cards

why might items on two IQ tests be quite different?

because the test developers defined IQ differently based on their varying theories of intelligence

3
New cards

what were the first published tests of mental ability?

the Binet-Simon scale

4
New cards

What is the Flynn Effect?

the trend that the average IQ score increases with each new generation

5
New cards

what do self report tests require test takers to do? 

to report or describe their feelings, beliefs, opinions and/or mental states. 

6
New cards

Key assumptions of Psychological Tests 

  • An individual’s behavior, and therefore test scores, will typically remain stable over time

  • Psychological tests measure what they say they measure

  • Test takers will report accurately about themselves

  • Test takers understand the test items in the same way

7
New cards

what is not an assumption that test users make about psychological tests?

test scores are 99.9% accurate with little or no error

8
New cards

What is race norming?

ranking a minority test taker higher than a White test taker with the same test score

9
New cards

What type of tests are the Rorschach Inkblot Test and the Thematic Apperception Test?

Projective tests 

10
New cards

Binet’s psychological tests were designed to evaluate

Children

11
New cards

What do all psychological tests have in common? 

Using evidence to reach conclusions 

12
New cards

Three defining characteristics of good tests 

  1. Representatively sample the behaviors thought to measure a construct 

  2. Behavior samples are obtained under standardized conditions (test must be administered the same way to all people) 

  3. Have rules for scoring to ensure consistency 

13
New cards

Maximal Performance

  • classification of test by behavior performed

  • test takers perform a well-defined task (e.g., IQ or driving tests) and try to do their best

14
New cards

Behavior Observation

  • classification of test by behavior performed

  • involves observing people’s behavior in a particular context, often without them knowing

15
New cards

Self report 

  • classification of test by behavior performed 

  • test takers describe their own feelings, beliefs, or opinions

16
New cards

Standardized tests

  • classification of test by standardization

  • administered to a large group (a standardization sample) to create norms for score comparison

    • specific directions for administration and scoring

17
New cards

Nonstandardized tests

  • classification of tests by standardization

  • more informal, often for a single administration (i.e., they do not have a standardization sample)

18
New cards

Objective tests

  • classification of test by scoring method

  • have predetermined correct answers and require little subjective judgment to score.

  • structured formats like MC, T/F, or rating scales

19
New cards

Projective tests

  • classification of test by scoring method

  • test takers respond to ambiguous stimuli (e.g., Rorschach inkblots, Thematic Apperception Test)

  • scores involve subjective judgment

20
New cards

Achievement tests

  • by dimension measured

  • measures previous learning in a specific academic area

21
New cards

Aptitude tests

  • by dimension measured

  • assess potential for learning or ability to perform in a new situation

22
New cards

Intelligence tests

  • by dimension measured

  • assess the ability to cope with the environment at a broad level

23
New cards

Personality tests

  • classification of test by dimension measured

  • measures human character or disposition

24
New cards

Interest inventories

  • classified by dimension measured

  • assess interests to help with career decisions

25
New cards

Psychological assessment

  • broad process of gathering information about an individual using multiple methods, including interviews, observations, and psychological tests

  • one tool in this process: psychological test

26
New cards

Measurement

process of assigning numbers to attributes according to specific rules

  • broader concept than a test

27
New cards

Survey

focuses on group outcomes and reports results at the question level (such as percentages)

28
New cards

Psychological test 

focuses on individual outcomes and provides an overall derived score or scaled scores 

29
New cards

What are some key historical developments in the creation of psychological tests?

  • tests created to screen for emotional instability during war

  • IQ test for children (Binet-Simon Scale)

  • Army Alpha (for literate recruits) and Army Beta (for non-literate or non-English-speaking recruits), developed during WWI

30
New cards

What are some major controversies of psychological tests during their development?

  • discrimination against racial, economic or cultural groups

  • nature vs. nurture and IQ: IQ tests differ because developers defined IQ differently based on their theories of intelligence

  • Within-group norming: race norming

  • Flynn Effect

31
New cards

Flynn Effect 

observation that average IQ scores have been increasing with each new generation 

  • due to changes in how new generations think (“mental artillery”) 

32
New cards

Race Norming

  • within group norming

  • practice of administering the same test to every test taker but scoring test differently according to race of the test taker

  • Outlawed by Civil Rights Act of 1991

33
New cards

Nominal measurement

  • numbers are used as labels for categories of data; just naming

  • statistical analysis to use: Frequency, Mode, Chi-square

  • e.g., 1 = Democrat, 2 = Republican
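A minimal sketch of why only counting-based statistics make sense for nominal data. The coded responses are made up for illustration; the codes follow the card's example.

```python
from collections import Counter

# Hypothetical nominal codes: 1 = Democrat, 2 = Republican (labels only, no order).
party_codes = [1, 2, 2, 1, 2, 2, 1, 2]

# Frequency: how many cases fall in each category.
frequencies = Counter(party_codes)

# Mode: the most frequently occurring category.
mode_code = frequencies.most_common(1)[0][0]

print(dict(frequencies))
print("mode:", mode_code)
```

A mean of these codes would be a number with no interpretation, because the codes are only names.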

34
New cards

Ordinal 

  • numbers are used to rank order data, but the intervals between ranks ARE NOT necessarily equal

  • statistical analysis to use: median, percentile, rank-order correlation

  • e.g., class rank, Likert scales, grade equivalents

35
New cards

Likert scales are viewed and treated as what level of measurement, and why?

  • They can be viewed as ordinal or interval, but they are usually treated as interval scales, assuming that each point on the rating scale represents an equal distance or amount of the construct being measured

36
New cards

Interval measurement

  • numbers are rank ordered with equal distances between them, but there is no absolute zero

  • statistical analysis to use: mean, standard deviation, correlation, t-test, ANOVA

37
New cards

Ratio measurement

  • numbers are rank ordered with equal distances between them, and there is a true, meaningful zero point

  • statistical analysis to use: all parametric analyses

38
New cards

Frequency distributions

orderly arrangement of scores showing the number or percentage of observations within a range/category

  • displayed as histogram sometimes
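A small sketch of building a frequency distribution, assuming made-up scores and a 10-point class interval (both are illustrative choices, not from the card).

```python
from collections import Counter

scores = [52, 67, 71, 74, 78, 81, 83, 85, 88, 94]  # hypothetical test scores
bin_width = 10                                      # assumed class interval

# Map each score to the lower bound of its class interval, then count.
counts = Counter((score // bin_width) * bin_width for score in scores)

# Print a text histogram with the percentage of observations per interval.
for lower in sorted(counts):
    upper = lower + bin_width - 1
    pct = 100 * counts[lower] / len(scores)
    print(f"{lower}-{upper}: {'#' * counts[lower]} ({pct:.0f}%)")
```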

39
New cards

Normal (Bell) Curve 

symmetrical bell shaped theoretical distribution where most scores cluster near the middle (mean) 

  • shape determined by mean and SD

40
New cards

With a smaller standard deviation what would the normal curve then look like?

narrow and tall
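One way to check the "narrow and tall" answer numerically, using two hypothetical distributions with the same mean (100) and assumed SDs of 5 and 15.

```python
from statistics import NormalDist

# Same mean, different standard deviations (values chosen for illustration).
narrow = NormalDist(mu=100, sigma=5)
wide = NormalDist(mu=100, sigma=15)

# Height of each curve at the mean: the smaller SD gives the taller peak.
peak_narrow = narrow.pdf(100)
peak_wide = wide.pdf(100)

# Height far from the mean: the smaller SD gives the thinner tail.
tail_narrow = narrow.pdf(130)
tail_wide = wide.pdf(130)

print(f"peak heights: SD=5 -> {peak_narrow:.4f}, SD=15 -> {peak_wide:.4f}")
print(f"heights at 130: SD=5 -> {tail_narrow:.6f}, SD=15 -> {tail_wide:.6f}")
```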

41
New cards

measures of central tendency

  • describes middle of a distribution

  • mean, median, mode

42
New cards

mean 

  • μ or x̄

  • average; best for symmetrical distributions, but is impacted by outliers

    • unusually high or low scores

43
New cards

median

  • middle score when all scores are ordered

  • not impacted by outliers and better for skewed distributions

44
New cards

mode 

most frequently occurring score in a distribution 
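A short sketch comparing the three measures of central tendency on made-up scores, and showing how one outlier pulls the mean but not the median.

```python
from statistics import mean, median, mode

scores = [70, 75, 80, 80, 85]     # hypothetical scores
with_outlier = scores + [200]     # same scores plus one unusually high score

print("mean:", mean(scores), "median:", median(scores), "mode:", mode(scores))

# The outlier drags the mean upward; the median stays where it was.
print("mean with outlier:", round(mean(with_outlier), 2))
print("median with outlier:", median(with_outlier))
```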

45
New cards

measures of variability

  • describes how spread out the scores are

  • range, variance, standard deviation

46
New cards

range

highest score in a distribution minus the lowest score

47
New cards

variance 

  • σ² 

  • indicates whether individual scores tend to be similar to or substantially different from the mean 

48
New cards

standard deviation

  • σ

  • most commonly used measure of variability

  • square root of variance

  • allows us to understand how scores are distributed around the mean in a normal curve
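A sketch of why variability matters: two hypothetical groups share the same mean but differ widely in spread. Population formulas (`pvariance`, `pstdev`) are used here as an assumption.

```python
from statistics import mean, pvariance, pstdev

group_a = [48, 49, 50, 51, 52]   # scores clustered tightly around the mean
group_b = [30, 40, 50, 60, 70]   # scores spread far from the mean

for name, scores in (("A", group_a), ("B", group_b)):
    score_range = max(scores) - min(scores)   # highest minus lowest
    print(
        f"group {name}: mean={mean(scores)}, range={score_range}, "
        f"variance={pvariance(scores)}, SD={pstdev(scores):.2f}"
    )
```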

49
New cards

when the tail of a bell curve is to the right side

it is positively skewed

  • median is smaller than mean

50
New cards

when the tail of a bell curve is to the left side

it is negatively skewed

  • median is higher than mean
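The mean/median relationship in skewed distributions can be checked with two small made-up data sets, one with a long right tail and one with a long left tail.

```python
from statistics import mean, median

positively_skewed = [1, 2, 2, 3, 3, 3, 4, 15]        # tail to the right
negatively_skewed = [1, 12, 13, 13, 13, 14, 14, 15]  # tail to the left

# Positive skew: the extreme high score pulls the mean above the median.
print("positive:", mean(positively_skewed), median(positively_skewed))

# Negative skew: the extreme low score pulls the mean below the median.
print("negative:", mean(negatively_skewed), median(negatively_skewed))
```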

51
New cards

approx. 68% of scores fall within ± __ SD of the mean

± 1 SD

52
New cards

approx. 95% of scores fall within ± _ SD of the mean 

± 2 SD 

53
New cards

approx. 99.7% of scores fall within ± _ SD of the mean

± 3 SD
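The 68/95/99.7 figures in the three cards above can be reproduced from the standard normal curve. This sketch uses a hypothetical helper, `proportion_within`, which is not from the flashcards.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def proportion_within(k):
    """Proportion of a normal distribution within ±k SD of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within ±{k} SD: {proportion_within(k):.1%}")
```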

54
New cards

measure of relationship

  • describes the relationship between distributions of test scores

  • must have at least two sets or distributions of scores to calculate this

  • correlation coefficient

55
New cards

correlation coefficient

  • describes the relationship between two distributions of scores

  • whether the same individuals scored similarly on two different tests

  • measured on interval or ratio scale

  • -1.0 to +1.0

56
New cards

positive correlation coefficient 

  • r > 0 

    • as one score increases, the other tends to increase

57
New cards

negative correlation coefficient

  • r < 0

    • as one score increases, the other tends to decrease

58
New cards

zero correlation coefficient 

  • r = 0 

  • no relationship
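A minimal sketch of positive, negative, and zero correlation using a hand-written Pearson r. The function name `pearson_r` and all scores are illustrative assumptions. Note that r = 0 means no linear relationship.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-products of deviations from each mean.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)

test_a = [1, 2, 3, 4, 5]
rises_together = [2, 4, 6, 8, 10]   # increases as test_a increases
moves_opposite = [10, 8, 6, 4, 2]   # decreases as test_a increases
unrelated = [1, 4, 0, 4, 1]         # no linear trend with test_a

print(pearson_r(test_a, rises_together))
print(pearson_r(test_a, moves_opposite))
print(pearson_r(test_a, unrelated))
```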

59
New cards

perfect positive correlation 

knowt flashcard image
60
New cards

strong positive correlation

knowt flashcard image
61
New cards

weak positive correlation

knowt flashcard image
62
New cards

weak negative correlation 

knowt flashcard image
63
New cards

strong negative correlation

knowt flashcard image
64
New cards

perfect negative correlation

knowt flashcard image
65
New cards

what is the formula for standard deviation (for a population)?

  1. find the deviation of each score from the mean and square it: (x-µ)²

  2. sum the squared deviations ∑(x-µ)² 

  3. divide by N to get variance (σ²) 

  4. take square root of variance 
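The four steps above can be followed literally in code. The scores are made up and chosen so the arithmetic comes out to whole numbers.

```python
from math import sqrt

scores = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical population of scores
N = len(scores)
mu = sum(scores) / N                 # population mean

# Step 1: squared deviation of each score from the mean.
squared_deviations = [(x - mu) ** 2 for x in scores]

# Step 2: sum the squared deviations.
sum_of_squares = sum(squared_deviations)

# Step 3: divide by N to get the population variance.
variance = sum_of_squares / N

# Step 4: take the square root of the variance.
sigma = sqrt(variance)

print(f"mean={mu}, variance={variance}, SD={sigma}")
```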

66
New cards

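what is the formula for standard deviation (for a sample)?

  1. find the deviation of each score from the sample mean and square it: (x-x̄)²

  2. sum the squared deviations ∑(x-x̄)²

  3. divide by n-1 to get the sample variance (s²)

  4. take square root of variance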
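A sketch of the sample version, assuming the usual sample formula, which divides the sum of squared deviations by n-1 rather than N. The data are made up; the result is compared with `statistics.stdev` from the standard library.

```python
from math import sqrt
from statistics import stdev

sample = [2, 4, 4, 4, 5, 5, 7, 9]    # hypothetical sample of scores
n = len(sample)
x_bar = sum(sample) / n               # sample mean

# Sum of squared deviations from the sample mean.
sum_of_squares = sum((x - x_bar) ** 2 for x in sample)

# Divide by n - 1 (not n) to get the sample variance, then take the root.
sample_variance = sum_of_squares / (n - 1)
s = sqrt(sample_variance)

print(f"sample SD = {s:.3f}")
print(f"statistics.stdev = {stdev(sample):.3f}")
```

Dividing by n-1 gives a slightly larger value than dividing by n would for the same scores.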

67
New cards

reliability

  • consistency of test scores 

  • an essential standard for determining how trustworthy the data derived from a psychological test are

  • a reliable test can be trusted to measure each person and construct in approximately the same way every time it is used

    • scores still contain some error

68
New cards

what can impact a person’s measured score?

measurement error, such as mistakes by the test taker or test administrator, response bias, changes in environmental conditions, or a flaw or inaccuracy in the measuring instrument

69
New cards

what makes a test reliable?

it measures each person in approximately the same way each time it is used

  • produces consistent results when applied multiple times or in different circumstances

70
New cards

Classical Test Theory

  • every observed score (X) is composed of a true score (T) and a random error score (E)

  • X = T + E

  • random errors are assumed to form a normal distribution
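A small simulation of X = T + E. The true-score mean and SD (100, 15), the error SD (5), and the fixed seed are all illustrative assumptions. The ratio of true-score variance to observed-score variance is the standard classical-test-theory definition of reliability and is not stated on the card.

```python
import random
from statistics import mean, pvariance

random.seed(42)  # fixed seed so the simulation is repeatable
n_people = 5000

# T: each person's true score. E: random error centered on zero.
true_scores = [random.gauss(100, 15) for _ in range(n_people)]
errors = [random.gauss(0, 5) for _ in range(n_people)]

# X = T + E: the observed score is the true score plus random error.
observed = [t + e for t, e in zip(true_scores, errors)]

# Share of observed-score variance that comes from true scores.
reliability = pvariance(true_scores) / pvariance(observed)

print(f"mean error: {mean(errors):.2f}")
print(f"estimated reliability: {reliability:.2f}")
```

With these settings the estimate should land near 225 / 250 = 0.90, and a larger error SD would lower it.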

71
New cards

what are the two types of error score (measurement error)?

  • random error

  • systematic error

72
New cards

random error 

  • variability in test scores that is due to unpredictable and uncontrollable factors which lowers reliability of test 

  • normally distributed & uncorrelated with true score 

    • environmental conditions, temporary distractions, fluctuations in individuals’ performance 

73
New cards

systematic error

  • when a single source of error consistently increases or decreases the true score by the same amount

  • can be difficult to identify, and it distorts the observed score

    • e.g., a bathroom scale that always reads 3 lbs too high
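The bathroom-scale example as a sketch, with made-up true weights. A constant bias shifts every reading by the same amount, so the average moves while the spread of readings is unchanged.

```python
from statistics import mean, pstdev

true_weights = [120, 135, 150, 165, 180]   # hypothetical true weights (lbs)
BIAS_LBS = 3                                # the scale always reads 3 lbs high

# Systematic error: the same amount is added to every measurement.
readings = [w + BIAS_LBS for w in true_weights]

print("mean shift:", mean(readings) - mean(true_weights))
print("SD of true weights:", pstdev(true_weights))
print("SD of readings:", pstdev(readings))
```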

74
New cards

three main categories of methods to estimate reliability/precision of the test

  1. test-retest method

  2. alternate forms method

  3. internal consistency methods

75
New cards

test-retest method

  • test developers give the same test to the same group of test takers on two different occasions; scores from the first and second administrations are correlated to examine the stability of test scores over time

  • limitations: practice effects
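A sketch of a test-retest estimate: correlate scores from two administrations of the same test. The scores and the helper `pearson_r` are illustrative assumptions. The second-administration scores are slightly higher on average, as a practice effect might produce.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x, mean_y = mean(x), mean(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)

# Same five hypothetical test takers, in the same order, on two occasions.
first_administration = [10, 12, 15, 18, 20]
second_administration = [11, 13, 14, 19, 21]

test_retest_r = pearson_r(first_administration, second_administration)
print(f"test-retest reliability estimate: r = {test_retest_r:.2f}")
print("average gain:", mean(second_administration) - mean(first_administration))
```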

76
New cards

practice effects 

test takers benefit from taking the test the first time due to practice, which enables them to solve problems more quickly and correctly the second time

77
New cards

alternate-forms method

  • test developers create two forms of the test that are as alike as possible and administer both to the same people to measure the equivalence of the forms

  • scores are compared using correlation

  • overcomes practice effects but has order effects

78
New cards

order effects

  • changes in scores resulting from the order in which the tests were taken

  • avoid this by having half of the test takers take Form A first and the other half take Form B first

79
New cards

internal consistency methods

  • A single test administration is used to see how related the items (or group of items) on the test are to one another

  • How a person answered one item on the test would give you information that would help you correctly predict how they answered another item on the test

  • coefficient alpha

80
New cards

coefficient alpha

  • Cronbach’s alpha = internal consistency coefficient

  • if items truly measure the same construct, they should naturally be correlated with one another

  • only appropriate for homogeneous tests (measuring one trait or characteristic)

  • ranges from 0.00 to 1.00 (perfectly reliable)

  • higher value = greater consistency

  • median: .85
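A minimal sketch of coefficient alpha using the standard formula α = (k / (k − 1)) · (1 − Σ item variances / total-score variance), which is not given on the card. The function name `cronbach_alpha` and the response matrix are illustrative assumptions.

```python
from statistics import pvariance

def cronbach_alpha(item_scores):
    """Coefficient alpha for rows of test takers and columns of items."""
    k = len(item_scores[0])  # number of items
    # Variance of each item across test takers.
    item_variances = [
        pvariance([row[i] for row in item_scores]) for i in range(k)
    ]
    # Variance of each test taker's total score.
    total_variance = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Five hypothetical test takers answering three 1-5 rating items.
responses = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 1],
    [3, 3, 3],
]

alpha = cronbach_alpha(responses)
print(f"coefficient alpha = {alpha:.2f}")
```

The items here rise and fall together across test takers, so alpha comes out high.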

81
New cards

how can the test itself influence reliability? 

  • being poorly designed 

  • ambiguous questions 

  • poorly written questions 

  • require a higher reading level than the level of test takers 

82
New cards

how can the test administration influence reliability? 

  • when directions are not followed

  • misread instruction for length of time

  • answer participant questions incorrectly

  • allow test environment to be hot, cold or noisy

  • display a negative or uncomfortable attitude

83
New cards

how can the test scoring influence reliability? 

  • scoring is not conducted accurately

  • e.g., a WAIS Similarities item: what do the words apple and orange have in common?

84
New cards

how can the test takers influence reliability? 

  • contribute to test error

  • fatigue

  • illness

  • exposure to the test questions or research questions before the test

  • social desirability

85
New cards

what are the steps of test development? 

  1. Define the testing universe, target audience, and test purpose 

  2. develop a test plan 

  3. compose test items 

  4. write administration instructions 

  5. conduct a pilot test 

  6. conduct item analysis 

  7. revise the test 

  8. validate the test 

  9. develop norms and identify cut scores 

  10. compile test manual 

86
New cards

Testing universe

body of knowledge or behaviors that the test represents

  • developer prepares working operational definition of the construct the test will measure

87
New cards

target audience

group of individuals who will take the test

88
New cards

purpose

what the test will measure and how scores will be used

  • normative

  • criterion approach

89
New cards

normative approach 

  • compares test taker’s performance to other test takers 

  • eg: academic achievement test where the highest score gets a scholarship 

90
New cards

criterion approach

  • approach that compares a test taker’s performance to a specific set of criteria or a standard
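A sketch contrasting the two approaches on the same made-up scores. The names, scores, and the cut score of 70 are all hypothetical.

```python
scores = {"Ana": 72, "Ben": 85, "Cam": 91, "Dee": 64}  # hypothetical test takers
CUT_SCORE = 70                                          # hypothetical standard

# Normative approach: compare test takers with one another.
ranking = sorted(scores, key=scores.get, reverse=True)
top_scorer = ranking[0]   # e.g., the highest scorer gets the scholarship

# Criterion approach: compare each test taker with a fixed standard.
passed = {name: score >= CUT_SCORE for name, score in scores.items()}

print("ranking:", ranking)
print("top scorer:", top_scorer)
print("met the standard:", passed)
```

Under the normative approach only one person can be first; under the criterion approach everyone, or no one, could meet the standard.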

91
New cards

what does developing a test plan entail?

specific construct’s operational definition, content to be measured, question format and administration and scoring of test

92
New cards

what are some scoring models? 

cumulative, categorical, and ipsative 

93
New cards

cumulative

  • assumes that the more a test taker responds in a particular fashion, the more the test taker exhibits the attribute being measured

  • total number of correct answers becomes raw score
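A minimal sketch of cumulative scoring, assuming a hypothetical five-item answer key and one test taker's responses.

```python
answer_key = ["b", "d", "a", "c", "a"]   # hypothetical correct answers
responses = ["b", "d", "c", "c", "a"]    # one test taker's answers

# Raw score: the total number of responses that match the key.
raw_score = sum(given == correct for given, correct in zip(responses, answer_key))

print("raw score:", raw_score)
```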

94
New cards

categorical

used to place test takers in a particular group or class and typically yields nominal data

  • e.g., a personality test

95
New cards

Ipsative

forced-choice format where a test taker’s preferences are compared with one another rather than with a normative group.

  • total score will be exactly the same for everyone
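A sketch of why ipsative totals are identical for everyone. Each forced-choice item gives exactly one point to whichever scale the test taker picks, so the points always add up to the number of items. The scale names and choices are hypothetical.

```python
from collections import Counter

# Six forced-choice items; each choice credits one hypothetical interest scale.
taker_1 = ["social", "artistic", "social", "investigative", "social", "artistic"]
taker_2 = ["investigative"] * 4 + ["artistic"] * 2

profile_1 = Counter(taker_1)
profile_2 = Counter(taker_2)

# The profiles differ, but the total points are the same for both people.
print(dict(profile_1), "total:", sum(profile_1.values()))
print(dict(profile_2), "total:", sum(profile_2.values()))
```

Only the pattern within a person is interpretable, which is why these scores are not compared with a normative group.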

96
New cards

how many items should test developers write when developing a test? 

twice as many as the final version 

97
New cards

objective formats

  • one response that is designated as correct

  • MC, T/F,

  • incorrect MC options = distractors

98
New cards

subjective formats

  • do not have single responses designated as correct and require judgment to score

    • essay, interviews, projective techniques

99
New cards

Response set/bias

patterns of responding that can result in false or misleading information

100
New cards

social desirability 

tendency for some test takers to provide or choose answers that are socially accepted or present them in a favorable light