Phase 3: Planning Standardization


34 Terms

1
New cards

Steps in Phase 3

  • Describe the reference or norm group and sampling plan for standardization

  • Describe your choice of scaling methods and the rationale for this choice

  • Outline the reliability studies to be performed and their rationale

  • Outline the validity studies to be performed and their rationale

  • Include any special studies that may be needed for development of this test or to support proposed interpretations of performance

  • List the components of the test, e.g., manual, record forms, test booklets, stimulus materials, etc.

2
New cards

Sampling Plan

  • Defining the target population for comparison (age range, special needs, etc.)

    • the first group that everyone is compared to

  • Looking at other norm groups

    • other groups you want to compare against

  • You want a true random sample, but this is often not possible

  • The sample has to be representative

    • the best method is to use a population-proportionate stratified random sampling plan

  • Determine the appropriate size of the sample overall

3
New cards

Standardized Sample

  • a sample of test takers who represent the population for whom the test is intended

    • determines the norms and forms the reference group to which all examinees are compared

4
New cards

Population

all members of the target audience

5
New cards

Sample

a representative subset of the population to which the survey is administered

6
New cards

Types of Sampling (Selecting the Appropriate Respondents)

  • Probability Sampling

    • Simple Random Sampling

    • Systematic Random Sampling

    • Stratified Random Sampling

    • Cluster Sampling

  • Nonprobability Sampling

    • Convenience Sampling

7
New cards

Probability Sampling

  • Uses statistics to ensure that a sample is representative of a population

8
New cards

Simple Random Sampling

  • every member of a population has an equal chance of being chosen as a member of the sample
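
A minimal Python sketch, assuming a hypothetical list of member IDs; `random.sample` draws a simple random sample in which every member has an equal chance of selection.

```python
import random

# Hypothetical population of 1,000 member IDs.
population = list(range(1, 1001))

# Simple random sample of 100: every member has an equal chance of being chosen.
sample = random.sample(population, k=100)
```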

9
New cards

Systematic Random Sampling

  • Choosing every nth person (e.g., every third person)
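
A small sketch of systematic random sampling, using the same hypothetical population as above; a random starting point keeps the first pick from always being member 1.

```python
import random

population = list(range(1, 1001))  # hypothetical member IDs
n = 3                              # take every 3rd person

# Systematic random sampling: random starting point, then every nth member.
start = random.randrange(n)
sample = population[start::n]
```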

10
New cards

Stratified Random Sampling

  • Population is divided into subgroups (strata), and members are randomly sampled from each subgroup
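
A hedged sketch of the population-proportionate stratified plan mentioned in the sampling-plan card, using hypothetical strata and sizes.

```python
import random

# Hypothetical strata (subgroups) and their sizes in the target population.
strata = {"urban": 6000, "suburban": 3000, "rural": 1000}
total = sum(strata.values())
sample_size = 500

# Proportionate stratified sampling: sample each subgroup in proportion
# to its share of the population, randomly within each stratum.
sample = {}
for name, size in strata.items():
    k = round(sample_size * size / total)
    members = [f"{name}_{i}" for i in range(size)]  # stand-in member IDs
    sample[name] = random.sample(members, k)
```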

11
New cards

Cluster Sampling

  • used when it is not possible to list all of the individuals who belong to a particular population; a method often used with surveys that have large target populations
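
A brief sketch of cluster sampling with hypothetical schools as clusters: whole clusters are drawn at random, then everyone in the chosen clusters is included.

```python
import random

# Hypothetical clusters (e.g., schools) when no full list of individuals exists.
clusters = {f"school_{i}": [f"student_{i}_{j}" for j in range(30)] for i in range(50)}

# Cluster sampling: randomly select whole clusters, then include all their members.
chosen = random.sample(list(clusters), k=5)
sample = [person for school in chosen for person in clusters[school]]
```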

12
New cards

Nonprobability Sampling

  • Is a type of sampling in which not everyone has an equal chance of being selected from the population

13
New cards

Convenience Sampling

  • The survey researcher uses any available group of participants to represent the population

14
New cards

Sample Size

  • number of people needed to represent the target population accurately

  • Depends on factors in the test plan; in general, the larger the sample, the better

15
New cards

Homogeneity of the Population

  • how similar the people in your population are to one another

16
New cards

Sampling Error

  • a statistic that reflects how much error can be attributed to the lack of representation of the target population by the sample of respondents chosen
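
One common way to put a number on sampling error is the margin of error for a sample proportion; a minimal sketch, not taken from the card itself, with hypothetical figures.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a sample proportion p from a sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

# e.g., 60% of a sample of 400 respondents endorse an item
print(margin_of_error(0.60, 400))  # ~0.048, i.e., about +/- 5 percentage points
```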

17
New cards

Distributing the Survey

  • how will the instrument/test be given to the respondent

  • mail, phone, weblink, in person

18
New cards

Specifying Administration and Scoring Methods

  • Determine such things as how the test will be administered, which will influence the format and content of the test items

    • orally, written, computer, groups, individual

  • Choose the method of scoring, e.g., scored by hand by the test administrator, by scoring software, or sent to the test publisher for scoring

19
New cards

Types of Raw Scoring Methods

  • Cumulative/Summative Model

  • Ipsative Model

  • Categorical Model

20
New cards

Cumulative/Summative Model

  • most common

  • assumes that the more a test taker responds in a particular fashion, the more they have of the attribute being measured

  • using this model, the test taker receives 1 point for each correct answer, and the total number of correct answers becomes the raw score on the test

  • correct responses or responses on the Likert scale are summed

    • data can be interpreted with reference to norms

  • Semantic Differential: adjective pairs at each end of the continuum (e.g., rich/poor)

  • Visual analog: the researcher assigns scores along the continuum (e.g., rating pain levels, with each number a different level of pain)
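
A minimal sketch of cumulative/summative scoring with a hypothetical answer key and response set; correct answers earn 1 point and the sum is the raw score, while Likert-type ratings are summed directly.

```python
# Hypothetical answer key and one test taker's responses.
key       = ["a", "c", "b", "d", "a"]
responses = ["a", "c", "d", "d", "a"]

# Cumulative/summative model: 1 point per correct answer; the total is the raw score.
raw_score = sum(1 for r, k in zip(responses, key) if r == k)  # 4

# For Likert-scale items, the ratings themselves are summed instead.
likert_ratings = [4, 5, 3, 4]
likert_raw_score = sum(likert_ratings)  # 16
```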

21
New cards

Ipsative Model

  • test takers are given 2 or more options to choose from - mostly uses forced choice items

  • most used in personality testing - test taker indicates which items are most like them and least like them

  • measures an individual's personal growth, strengths, or preferences relative to themselves over time, rather than comparing them to others (normative) or external standards (criterion-referenced)

  • all items are chosen to be equally desirable

  • typically yields nominal data because it places test takers in categories, e.g., the number of true/false, yes/no, or agree/disagree responses
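
A hedged sketch of ipsative, forced-choice scoring with hypothetical scales: each "most like me" pick is tallied to the scale it belongs to, and the highest and lowest tallies are compared within the same test taker.

```python
from collections import Counter

# Hypothetical "most like me" picks; each picked option belongs to one scale.
picks = ["extraversion", "agreeableness", "extraversion", "openness", "extraversion"]

# Ipsative scoring: tally picks per scale and compare the scales within this
# single test taker rather than against other test takers.
tallies = Counter(picks)
print(tallies.most_common())  # [('extraversion', 3), ('agreeableness', 1), ('openness', 1)]
```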

22
New cards

Categorical Model

  • Is used to put the test taker in a particular group or class

  • test takers' scores are not compared with those of other test takers; instead, the scores on various scales are compared within the test taker (which scores are highest and lowest)

  • Typically yields nominal data because it places test takers in categories

    • counts the number of true and false answers, agree and disagree

23
New cards

Piloting and Revising Tests

  • can't assume the test will perform as expected

  • test developers conduct studies to determine how well a new test performs

  • A pilot test is a scientific investigation that gathers evidence suggesting that the test scores are reliable and valid for their specified purpose

  • involves administering the test to a sample from the target audience

  • analyze data and revise test to fix any problems uncovered

    • many aspects to consider

24
New cards

Setting up the Pilot Test

  • Test situation should match actual circumstances in which test will be used

    • e.g., if the test is designed to diagnose emotional disabilities in adolescents, the participants for the pilot study should be adolescents.

  • The sample should be large enough to provide the power to conduct statistical tests to compare the responses of each group

  • the test setting of the pilot test should mirror the planned test setting.

    • e.g., if school psychologists will use the test, the pilot test should be conducted in a school setting using school psychologists as administrators.

  • developers must follow the American Psychological Association's code of ethics

    • strict rules for confidentiality and publish only combined results

    • test takers understand that they're in a research study and that scores are used for research purposes

25
New cards

Conducting the Pilot Test

  • a scientific evaluation of the test's performance

  • depth and range depends on the size and complexity of the target audience and the construct being measured

    • e.g., tests designed for use in a single company or college program require less extensive studies than do tests designed for large audiences, such as students applying for graduate school

  • adhere strictly to test procedures outlined in test administration instructions

  • generally requires a large sample

  • may also ask participants about the testing experience

  • pilot studies often require gathering extra data, such as a criterion measure and the length of time needed to complete the test

  • important to recognize the problems with the test administration, make all necessary revisions before continuing, and conduct a new pilot test that yields appropriate results.

26
New cards

Analyzing the Results

  • Can gather both quantitative and qualitative information on things like item characteristics, internal consistency, test-retest and inter-rater reliability, convergent and discriminant validity, and sometimes predictive validity

  • Qualitative data may be used to help make decisions

27
New cards

Conducting Quantitative Item Analysis

  • Item analysis: how developers evaluate the performance of each test item

  • Item difficulty: the proportion of test takers who respond to the item correctly out of the total number of test takers, expressed as the p value (percentage/probability value)

    • understand how difficult an item is

  • a p value of .5 is optimal (higher means the item is too easy; lower means it is too hard)

    • 0 to .2 (too difficult) and .9 to 1 (too easy)
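
A small sketch of computing p values from a hypothetical pilot-data matrix (rows = test takers, columns = items, 1 = correct), using the cutoffs from this card.

```python
# Hypothetical pilot data: rows = test takers, columns = items (1 = correct, 0 = incorrect).
scores = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [1, 1, 1, 1],
]

n_takers = len(scores)
# Item difficulty (p value) = proportion of test takers who answered the item correctly.
for item in range(len(scores[0])):
    p = sum(row[item] for row in scores) / n_takers
    flag = "too easy" if p >= 0.9 else "too difficult" if p <= 0.2 else "ok"
    print(f"item {item + 1}: p = {p:.2f} ({flag})")
```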

28
New cards

Discrimination Index

  • Compares the performance of those who obtained very high test scores (Upper group) with the performance of those who obtained very low test scores (lower group) on each item

  • D = U − L, where U = (number in the upper group who responded correctly ÷ total number in the upper group) × 100 and L = (number in the lower group who responded correctly ÷ total number in the lower group) × 100

  • A discrimination index of 30 and above is desirable

  • Negative numbers: those who scored low on the test overall responded to the item correctly and those who scored high on the test responded incorrectly.

  • The upper group and lower group are formed by ranking the final test scores from lowest to highest and then taking the upper third and the lower third to use in the analysis.
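
A sketch of the discrimination index with hypothetical (total score, item correct) pairs: rank by total score, take the upper and lower thirds, and compute D = U − L.

```python
# Hypothetical data for one item: (total test score, answered this item correctly?).
takers = [(95, 1), (90, 1), (88, 1), (70, 0), (65, 1), (60, 0),
          (40, 0), (35, 1), (30, 0)]

# Rank by total score, then form the upper third and lower third.
takers.sort(key=lambda t: t[0], reverse=True)
third = len(takers) // 3
upper, lower = takers[:third], takers[-third:]

# U and L are the percentages of each group answering the item correctly.
U = 100 * sum(correct for _, correct in upper) / len(upper)
L = 100 * sum(correct for _, correct in lower) / len(lower)
D = U - L
print(round(D, 1))  # 66.7; values of 30 or above are considered desirable
```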

29
New cards

Item-Total Correlation

  • a measure of the strength and direction of the relation between the way test takers responded to one item and the way they responded to all of the items as a whole

  • Items that have little or no correlation with the total item score may measure a different construct from that being measured by the other items.
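
A minimal sketch of item-total correlations on a hypothetical 0/1 item matrix; a "corrected" version would remove each item from its own total before correlating.

```python
import numpy as np

# Hypothetical item matrix: rows = test takers, columns = items (1 = correct, 0 = incorrect).
X = np.array([
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 1, 0],
])

totals = X.sum(axis=1)
# Item-total correlation: relate each item's responses to the total score.
for i in range(X.shape[1]):
    r = np.corrcoef(X[:, i], totals)[0, 1]
    print(f"item {i + 1}: r = {r:.2f}")
```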

30
New cards

Interitem Correlation Matrix

  • displays the correlation of each item with every other item

  • Usually each item has been coded as a dichotomous variable (correct (1) or incorrect (0))

    • Therefore, the interitem correlation matrix will be made up of phi coefficients

  • provides important information for increasing the test’s internal consistency.

    • drop items that don't correlate with other items measuring the same construct
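
A short sketch of an interitem correlation matrix on hypothetical dichotomous items; Pearson correlations between 0/1 variables are phi coefficients, so `.corr()` yields the phi matrix.

```python
import pandas as pd

# Hypothetical dichotomous item data (1 = correct, 0 = incorrect).
items = pd.DataFrame({
    "item1": [1, 1, 0, 1, 0],
    "item2": [1, 0, 1, 1, 0],
    "item3": [0, 0, 0, 1, 1],
})

# Pearson correlations between 0/1 items are phi coefficients,
# so this matrix is the interitem (phi) correlation matrix.
print(items.corr())
```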

31
New cards

Phi Coefficients

  • The results of correlating two dichotomous (having only 2 values) variables

32
New cards

Item Response Theory

  • provides estimates of test takers' ability that are independent of the difficulty of the items presented, as well as estimates of item difficulty and discrimination that are independent of the ability of the test takers

  • relates the performance of each item to a statistical estimate of the test taker’s ability on the construct being measured

33
New cards

Item Characteristic Curves

  • Part of Item response theory

  • the line that results when we graph the probability of answering an item correctly against the level of ability on the construct being measured

  • provides a picture of the item’s difficulty and how well it discriminates high performers from low performers
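
A hedged sketch of an item characteristic curve using the common two-parameter logistic model (an assumption; the cards don't name a specific IRT model), where a is item discrimination and b is item difficulty.

```python
import numpy as np

def icc(theta, a, b):
    """Two-parameter logistic ICC: probability of a correct response at ability theta,
    for an item with discrimination a and difficulty b."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

# Hypothetical item with moderate discrimination and average difficulty.
abilities = np.linspace(-3, 3, 7)
print(icc(abilities, a=1.2, b=0.0))  # probability of success rises with ability
```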

34
New cards

Item Bias

  • when an item is easier for one group than for another group.