Assess Individual Exam 2

Last updated 6:42 PM on 4/1/26

64 Terms

1

Random error

causes a person's test score to change from one administration of a test to the next

2

Which type of error is more relevant for reliability?

Random error lowers the reliability of a test

3

Classical test theory

observed score = true score + error
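The classical test theory equation can be illustrated with a small simulation (the true score, error size, and seed below are assumptions for illustration, not from the cards): random error differs on each administration, so observed scores scatter around the true score and average out toward it.

```python
import random

random.seed(0)

true_score = 80  # hypothetical examinee's true score (assumed value)

# observed score = true score + error; the random error changes on
# each administration, so observed scores fluctuate around the truth
observed = [true_score + random.gauss(0, 3) for _ in range(1000)]

mean_observed = sum(observed) / len(observed)
# Over many administrations the random errors cancel out, so the
# average observed score converges on the true score
```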

4

Systematic error

When a single source of error always increases or decreases the true score by the same amount

5

Name 3 reliability types

1. Test-retest method

2. Alternate-forms method

3. Internal consistency method

6

What does the test-retest method tell you about the test?

Allows us to examine the stability of test scores over time and provides an estimate of the test's reliability/precision

7

What does the alternate-forms method tell you about the test?

It shows whether two equivalent forms of the test yield consistent scores - an estimate of reliability that avoids the practice and memory effects of readministering the identical test

8

What does the internal consistency method tell you about the test?

How well the items within a test work together to measure the same underlying construct

9

What is test-retest reliability?

Administer the same test to the same people at two points in time

10

What needs to be done in terms of test administration to be able to calculate it? (test-retest reliability)

Pearson product-moment correlation
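As a sketch of the calculation (the five examinees' scores below are made up), the Pearson product-moment correlation between the two administrations is the test-retest reliability estimate:

```python
from statistics import mean, stdev

# Hypothetical scores for five examinees at two administrations
time1 = [10, 12, 14, 16, 18]
time2 = [11, 11, 15, 15, 19]

def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / (stdev(x) * stdev(y))

r_test_retest = pearson_r(time1, time2)  # close to 1 = stable scores
```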

11

What is internal consistency reliability?

Give the test in one administration, and then compare all possible split halves

12

What needs to be done in terms of test administration to be able to calculate it? (internal consistency reliability)

Coefficient alpha or KR-20
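Coefficient alpha can be sketched from a single administration's item scores (the data below are hypothetical; rows are examinees, columns are items):

```python
from statistics import variance

# Hypothetical item scores: 5 examinees on a 4-item test
scores = [
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [4, 4, 5, 4],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
]

k = len(scores[0])                     # number of items
items = list(zip(*scores))             # item-wise score columns
totals = [sum(row) for row in scores]  # each examinee's total score

# Coefficient alpha: when items covary strongly, total-score variance
# is large relative to the sum of item variances, pushing alpha toward 1
alpha = (k / (k - 1)) * (1 - sum(variance(col) for col in items) / variance(totals))
```

KR-20 is the special case of this formula for items scored 0/1.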

13

What is interrater reliability?

Give the test once, and have it scored by two scorers or two methods

14

What needs to be done in terms of test administration to be able to calculate it? (interrater reliability)

Pearson product-moment correlation

15

What is scorer reliability?

how consistently a test is scored when human judgment is involved

16

What needs to be done in terms of test administration to be able to calculate it? (scorer reliability)

Set up the test administration so that multiple scorers can independently score the same set of responses

17

How does the test-retest interval influence test reliability?

Test-retest reliability will decline because the number of opportunities for the test takers or the testing situation to change increases over time

18

Given this influence, how should a test developer decide how long to wait to do retesting?

The interval should be long enough to forget the answers, but short enough that the trait hasn't changed

19

What is the standard error of measurement and why is it useful?

An estimate of how much the individual's observed test score might differ from the individual's true test score

20

How is the standard error of measurement useful?

It tells you how much trust you can put in a test score
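A standard formula is SEM = SD × √(1 − r), where r is the test's reliability coefficient. A minimal sketch with assumed values (SD = 15, reliability = .91, observed score = 100):

```python
import math

sd = 15.0           # assumed standard deviation of test scores
reliability = 0.91  # assumed reliability coefficient

# Standard error of measurement: typical spread of observed scores
# around a person's true score
sem = sd * math.sqrt(1 - reliability)  # 15 * 0.3 = 4.5

# Rough 95% confidence band around an observed score of 100
observed = 100
band = (observed - 1.96 * sem, observed + 1.96 * sem)
```

The narrower the band, the more trust a score warrants; a perfectly reliable test (r = 1) would have SEM = 0.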

21

List and describe two things about test administration that can influence reliability

-Consistency of testing conditions: variations in noise, lighting, temperature, or time limits across administrations add error to scores

-Standardization of instructions: every examinee should receive the same directions, delivered the same way, so scores are comparable

22

List and describe two things about test scoring that can influence reliability.

-Using the correct scoring key: applying the wrong key or making clerical scoring mistakes adds error to scores

-Consistency of scorers: when human judgment is involved, scorers must apply the same criteria in the same way

23

List and describe two things about test takers that can influence reliability

-Effort and motivation: test takers who are unmotivated or fatigued respond less consistently

-Understanding and attention: misreading instructions or lapses in attention introduce random error

24

How would you explain to someone that while reliability is largely a function of the test itself, validity is not?

Reliability is about the test's consistency - something built into the test

Validity is about whether the test scores are meaningful for a specific purpose - something that depends on how and why the test is used

25

What are the three traditional types of validity?

Content validity, criterion-related validity, and construct validity

26

What is content validity?

Whether the test's items accurately and completely represent the material or skills it is supposed to measure

27

What is criterion-related validity?

Tells you whether test scores actually connect to real-world outcomes or established measures in the way they should

28

What is construct validity?

the degree to which a test measures what it claims, or purports, to be measuring

29

List and describe four of the sources of information for evidence of validity

-Evidence based on test content

-Evidence based on the response process

-Evidence based on internal structure

-Evidence based on relations with other variables

30

Evidence based on test content

Looks at what is actually on the test - the questions, tasks, wording, and format

-Ask whether the test content fully represents the construct it is supposed to measure and avoids including irrelevant material

31
New cards

Evidence based on response process

Examine how test takers think, behave, or respond while completing the test.

- It checks whether the mental processes used by test takers match what the test is intended to measure

32

Evidence based on internal structure

How the items on the test relate to each other and whether the test behaves the way the underlying theory says it should

33

Evidence based on relations to other variables

This examines how test scores relate to other measures

- expected relationships

-lack of relationships with unrelated constructs

34

Describe the process for assessing content validity

Assessing content validity involves having a panel of experts review each test item and rate whether it is essential, useful but not essential, or not necessary for measuring the construct. Using Lawshe's method, a Content Validity Ratio (CVR) is calculated for each item to determine the level of expert agreement. Items that meet the minimum CVR value are kept as evidence of validity, while items that do not meet the standard are revised or removed. This process ensures the test content accurately represents the construct being measured.
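Lawshe's Content Validity Ratio for a single item can be sketched as follows (the panel size and ratings are hypothetical):

```python
def content_validity_ratio(n_essential, n_panelists):
    """Lawshe's CVR: (n_e - N/2) / (N/2), ranging from -1 to +1."""
    half = n_panelists / 2
    return (n_essential - half) / half

# Hypothetical item: 9 of 10 experts rate it "essential"
cvr = content_validity_ratio(9, 10)  # 0.8
# CVR = 0 when exactly half the panel says "essential"; items below the
# minimum CVR for the panel size are revised or removed
```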

35

Explain what information about test validity this assessment of content validity provides

Content validity evidence shows whether the test items are essential, relevant, and representative of the construct

-supports the argument that test scores meaningfully reflect what the test claims to measure

36

What is face validity?

Whether a test looks like it measures what it says it measures

37

Give an example of when a test would be face valid

A math test that contains only math problems has high face validity because it appears to measure math skills.

38

Explain both why face validity is desirable and why it might be a problem for a psychological test.

Tests that look relevant increase cooperation and perceived fairness. However, it can be a problem because it is based only on appearance, not scientific evidence, and highly face‑valid tests can be easier to fake.

39

What is the predictive method?

Shows a relationship between test scores and future behavior

40

Describe the process for assessing the predictive method of validation.

Give a test (predictor), wait a set time, then measure later performance (criterion). Correlate the test scores with the future performance scores. A strong correlation shows the test predicts future behavior.

41

When is it most common to use this method? (predictive method)

When a test needs to forecast future performance, especially in employment settings (predicting job performance), educational settings (predicting academic success), and clinical settings (predicting future outcomes)

42

What is the concurrent method?

Test administration and criterion measurement happen at approximately the same time

43

Describe the process for assessing the concurrent method of validation.

Administer the test and a valid criterion measure to the same group at the same time, then correlate the two sets of scores. A strong correlation provides concurrent evidence of validity.

44

When is it most common to use this method? (concurrent method of validation)

When researchers need immediate evidence of validity, especially in employment, educational, and clinical settings where test scores can be compared to current performance or diagnoses

45

What is a validity coefficient in criterion-oriented validation?

It is the correlation between test scores and a criterion measure, showing how well the test predicts or reflects the outcome.

46

What is restriction of range?

Occurs when a sample's data is limited to a narrow subset of the total population, weakening observed correlations

47

What is the typical influence of range restriction on validity coefficients?

Range restriction usually lowers the validity coefficient because reduced variation in test scores weakens the correlation between the test and the criterion.
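The effect can be demonstrated with made-up selection data: the correlation computed only on applicants above a cutoff is weaker than the correlation in the full applicant group.

```python
from statistics import mean, stdev

def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / (stdev(x) * stdev(y))

# Hypothetical predictor (test) and criterion scores for 10 applicants
test = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
criterion = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9]

r_full = pearson_r(test, criterion)

# Simulate selection: only applicants scoring 6 or higher are hired,
# so the criterion is observed over a restricted range of test scores
hired = [(t, c) for t, c in zip(test, criterion) if t >= 6]
r_restricted = pearson_r([t for t, _ in hired], [c for _, c in hired])
# r_restricted comes out lower than r_full even though the underlying
# test-criterion relationship is unchanged
```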

48

Why do we care about this influence? (range restriction on validity coefficient)

Because range restriction lowers the validity coefficient, it can make a test appear less predictive than it truly is. This may lead to incorrect conclusions about the test's usefulness in hiring, admissions, or clinical decisions.

49

How is evidence of validity from relationships with external criteria different than validity from content?

Content validity evaluates whether the test items represent the construct's domain, while criterion‑related validity examines how test scores relate to an external measure, using correlations to show prediction or real‑world performance.

50

What is the difference between objective and subjective criteria?

Objective criteria are measurable and based on observable facts, while subjective criteria rely on personal judgments or opinions.

51

Describe one objective and one subjective criterion in educational settings?

Objective criterion - standardized test score

Subjective criterion - teacher's rating of class participation

52

Why is the choice of criterion measures important in interpreting validity coefficients and test validity?

Because the validity coefficient depends on the quality of the criterion. If the criterion is unreliable or poorly matched to what the test measures, the correlation will be inaccurate and may underestimate or misrepresent the test's true validity

53

What do tests of significance and the coefficient of determination tell us about validity coefficients?

Tests of significance show whether the validity coefficient is statistically meaningful, while the coefficient of determination (r²) shows how much of the variation in the criterion is explained by the test scores.
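The coefficient of determination is simply the squared validity coefficient. For instance, with an assumed validity coefficient of .40:

```python
r = 0.40  # assumed validity coefficient (test vs. criterion)

# Coefficient of determination: the proportion of criterion variance
# accounted for by test scores
r_squared = r ** 2  # 0.16, i.e., 16% of criterion variance explained
```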

54

How are these methods different in the questions they answer? (test of significance and coefficient of determination)

Tests of significance ask whether the validity coefficient is statistically real or due to chance, while the coefficient of determination asks how much of the criterion's variation is explained by the test scores.

55

What does linear regression allow us to do that validity coefficients do not when making inferences about validity?

Linear regression allows us to make specific predictions of criterion scores from test scores and estimate prediction accuracy, while validity coefficients only show the strength of the relationship.
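A minimal ordinary-least-squares sketch (the test and performance data below are hypothetical) showing how the regression line turns a new test score into a predicted criterion score, which a bare correlation cannot do:

```python
from statistics import mean

# Hypothetical predictor (test) and criterion (performance) data
test = [50, 60, 70, 80, 90, 100]
performance = [3.0, 3.4, 3.8, 4.2, 4.6, 5.0]

mx, my = mean(test), mean(performance)

# Ordinary least squares: slope = sum((x-mx)(y-my)) / sum((x-mx)^2)
slope = (sum((x - mx) * (y - my) for x, y in zip(test, performance))
         / sum((x - mx) ** 2 for x in test))
intercept = my - slope * mx

# Predict the criterion score for a new applicant with a test score of 85
predicted = slope * 85 + intercept
```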

56

How is multiple regression different from linear regression?

Linear regression uses one predictor to estimate a criterion, while multiple regression uses several predictors at once and shows how much each one uniquely contributes to predicting the outcome.

57

Why is multiple regression useful in determining how many tests to use in a selection battery?

Because multiple regression shows the unique contribution each test makes to predicting job performance, helping identify which tests add value and which are redundant so the selection battery can be efficient and effective.

58

Describe the process for assessing construct validity.

Define the construct, make predictions about how the test should relate to other variables, collect data to test those predictions (e.g., convergent, discriminant, factor‑analytic, and group‑difference evidence), and evaluate whether the results support the test as a measure of the intended construct.

59

Explain what kind of information about test validity an assessment of construct validity provides

It shows whether the test truly measures the intended psychological construct by examining how the test relates to theory, other measures, group differences, and its internal structure. This reveals what the test actually measures and how meaningful its scores are.

60

List and describe four ways to establish quantitative evidence about construct validity

-Convergent evidence: The test correlates strongly with measures of related constructs.

-Discriminant evidence: The test shows low correlations with unrelated constructs.

-Factor‑analytic evidence: The test's internal structure matches the theoretical structure of the construct.

-Group‑difference evidence: The test differentiates between groups that theory predicts should score differently.

61

What is factor analysis?

An advanced procedure based on the concept of correlation that helps investigators explain why items within a test are correlated or why two different tests are correlated

62

Why is factor analysis useful for construct validity and testing in general?

Because it reveals the test's underlying structure, shows whether items measure the intended construct, identifies subscales, and helps detect weak or misfitting items—providing strong evidence that the test measures what it claims to measure.

63

What is the difference between confirmatory and exploratory factor analysis?

Exploratory factor analysis (EFA) is used to discover the underlying factor structure without prior assumptions, while confirmatory factor analysis (CFA) tests whether a hypothesized factor structure fits the data.

