Beta Error / Type II Error
False negative
Beta Error / Type II Error
Failing to reject the null hypothesis when it is false
Beta Error / Type II Error
You ignored it, but it was actually there.
You said there was none, but there actually was.
Beta Error / Type II Error
You retained the null hypothesis because your result showed no relationship or significant difference, but you should actually have rejected it.
Ex. You said you didn't like him, but you actually did.
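A minimal Python sketch of what beta means in practice, not from the deck: we simulate many t-tests where the null hypothesis is actually false and count how often we still fail to reject it. The effect size, alpha, and sample size are arbitrary illustrative choices.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha = 0.05          # significance level (Type I error rate)
n, effect = 20, 0.4   # small sample + modest effect -> noticeable beta
trials = 5000

failures_to_reject = 0
for _ in range(trials):
    a = rng.normal(0.0, 1.0, n)     # control group
    b = rng.normal(effect, 1.0, n)  # treatment group; the null is false
    _, p = stats.ttest_ind(a, b)
    if p >= alpha:                  # retained H0 even though it is false
        failures_to_reject += 1

beta = failures_to_reject / trials
print(f"estimated beta (Type II error rate): {beta:.2f}")
print(f"estimated power (1 - beta): {1 - beta:.2f}")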
Construct Validity
The extent to which the test may be said to measure a theoretical construct or trait
Construct Validity
The judgment about the appropriateness of inferences drawn from test scores regarding individual standings on a construct
Construct Validity
The definitions of validity and _____ are highly similar.
umbrella validity
Some references say that construct validity is the "______" under which all other evidence of validity falls.
Homogeneity
Evidences of changes with age
Evidences of pretest and posttest
Evidences of distinct groups
Convergent Evidence
Discriminant Evidence
Evidences of Construct Validity
Homogeneity
If your test is homogeneous, each item in your test is measuring the same thing; the items all measure the same psychological construct, and they are uniform.
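A minimal Python sketch of how homogeneity is often checked, on invented data (the item count, sample size, and noise level are arbitrary): if the items measure the same construct, inter-item correlations should be uniformly positive and Cronbach's alpha reasonably high.

import numpy as np

rng = np.random.default_rng(1)
trait = rng.normal(size=200)                # latent construct per test taker
items = trait[:, None] + rng.normal(scale=0.8, size=(200, 5))  # 5 items

r = np.corrcoef(items, rowvar=False)        # 5x5 inter-item correlation matrix
avg_r = r[np.triu_indices(5, k=1)].mean()   # mean of the off-diagonal r's

# Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of total)
k = items.shape[1]
alpha = k / (k - 1) * (1 - items.var(axis=0, ddof=1).sum()
                       / items.sum(axis=1).var(ddof=1))
print(f"average inter-item r = {avg_r:.2f}, Cronbach's alpha = {alpha:.2f}")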
Evidences of changes with age
There are psychological constructs, or constructs in general, that we expect will change over time.
If our test claims that it is measuring something that will change over time, then we should expect that the scores of those taking the exam will also change as they get older; there will be progressive changes as they age.
Evidences of pretest and posttest
The evidence that test scores have changed as a result of an experience, intervention, or program administered between the pretest and the posttest.
Evidences of distinct groups
Also known as known groups validity
Evidences of distinct groups
Also known as known groups validity; we can provide such proof or evidence when test scores vary in a predictable way depending on the test takers' membership in a particular group.
Example: You created a Depression Inventory. To establish evidence of distinct groups, you administer it to clinically depressed individuals and to individuals who are not depressed.
Since the test takers belong to two groups (clinically depressed and not depressed), you can expect the depression level of the clinically depressed group to be high and that of the non-depressed group to be low. If the results reflect this, then there is evidence of distinct groups.
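A minimal Python sketch of the distinct-groups check on fabricated scores (the group means, spreads, and sizes are invented; an independent-samples t-test stands in for whatever comparison you would actually run):

import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
depressed = rng.normal(32, 6, 40)      # clinically depressed: expected high
not_depressed = rng.normal(14, 6, 40)  # not depressed: expected low

t, p = stats.ttest_ind(depressed, not_depressed)
print(f"mean depressed = {depressed.mean():.1f}, "
      f"mean not depressed = {not_depressed.mean():.1f}")
print(f"t = {t:.2f}, p = {p:.4f}")  # significant, in the predicted direction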
Convergent Evidence
It is shown when scores on the test tend to correlate highly in the predicted direction with scores on older, more established, already validated tests designed to measure the same or a similar construct
Convergent Evidence
We establish this when the test scores we are validating correlate highly in a predicted direction (either positive or negative) with another test that is measuring the same construct or a similar one.
There should be theoretical or empirical evidence saying that they are actually similar.
Convergent Evidence
Example: You created a test for sensation seeking. In order to establish _____, you decided to correlate it with openness to experience, and your prediction was that they would correlate positively, because there is empirical evidence that openness to experience and sensation seeking are similar constructs.
Then you conducted the validation and found that the two are positively correlated, which provides evidence of ____, and therefore of construct validity.
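A minimal Python sketch of the convergent check on hypothetical scores (the shared latent variable and noise levels are invented so the prediction comes out): correlate the new sensation-seeking scores with the established openness scores and check that r is high and positive, as predicted.

import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
latent = rng.normal(size=150)                  # shared underlying disposition
sensation_seeking = latent + rng.normal(scale=0.7, size=150)  # new test
openness = latent + rng.normal(scale=0.7, size=150)           # established test

r, p = stats.pearsonr(sensation_seeking, openness)
print(f"r = {r:.2f}, p = {p:.4f}")  # a high positive r supports convergence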
Discriminant Evidence
Is shown when the validity coefficient shows little (insignificant) relationship between the test scores and other variables to which the test scores should not theoretically be related
Discriminant Evidence
Also known as divergent
Discriminant Evidence
Also known as divergent.
It is established when the validity coefficient shows little or insignificant relationship between the test scores we are validating and another variable.
Discriminant Evidence
You are correlating scores from two tests that are not theoretically related.
You should have a reason to say that they are not correlated, such as empirical evidence showing that they are not related.
So, because the empirical evidence says they are not related, they should not correlate highly or significantly.
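A minimal Python sketch of the discriminant check, again on invented data: the same kind of correlation, but against a variable the construct should not be related to (the "math ability" score here is an arbitrary stand-in); the goal is a near-zero, nonsignificant r.

import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
sensation_seeking = rng.normal(size=150)
math_ability = rng.normal(size=150)   # unrelated to the construct by design

r, p = stats.pearsonr(sensation_seeking, math_ability)
print(f"r = {r:.2f}, p = {p:.4f}")  # little or no relationship is the goal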
Severity Error
This is a rating error in which the rater's ratings are consistently overly negative due to his/her tendency to be too strict
Rating Errors
Leniency Error
Severity Error
Central Tendency Error
Halo Effect
Horn Effect
Leniency Error
lenient in scoring, marking or grading; rates too positively
Severity Error
tendency of the rater to be too strict or negative; rates too negatively
Central Tendency Error
ratings tend to cluster in the middle of the rating continuum
Halo Effect
tendency to give a particular ratee a higher rating than he objectively deserves because of the rater's failure to discriminate among conceptually distinct aspects of a ratee's behavior.
Horn Effect
The tendency for a single negative attribute to cause raters to mark everything on the low end of the scale
Leniency Error
You evaluate everything as very high.
Severity Error
You're too strict, you rate everything as very low.
Central Tendency Error
You tend to rate them in the middle.
Ex. On a Likert scale from 1 to 5, you always choose 3.
Halo Effect
You tend to give a higher rating to a particular person than they actually deserve.
You don't really distinguish the person's actual performance from a particular trait or characteristic of theirs.
Ex. You gave a high grade to a student who is polite and kind, even though their answer in the recitation was wrong.
Horn Effect
Ex. You gave a low grade to a student who is smart and knows the answers to your questions, simply because they are naughty and disruptive.
Test Construction
Involves writing the items and making decisions with regard to the scaling and scoring methods that will be utilized
Item pool
a reservoir from which items will or will not be drawn for the final version of the test
at least twice
The item pool should contain ____ the number of items expected to be included in the final version of the test.
Test Construction
It involves 3 things: writing the items, deciding what scaling (e.g., Likert) will be used, and deciding what scoring method will be utilized.
Item Pool
We create a pool of items that can be used in the test, and these items may or may not make it to the actual test.
According to Cohen, it is recommended that the item pool should have at least twice the number of items expected to be included in the final version of the test.
If you plan for your test to have 50 items, your item pool should have 100 items.
Item Pool
It is recommended because, as you go along the test development process and reach the test tryout and item analysis stages, you may find items that need to be rejected, retained, or revised. Having an ____ that contains more than what you expect to use in the final version of the test is helpful, because you no longer have to write new items; you can simply select from your ____.
Item Pool
Content validity should be kept in mind.
Item Pool
In writing an ____, make sure it covers the relevant aspects or parts of what you are measuring.
Test Construction
Jay-F has already decided the scaling and scoring method that he will use, and is now writing the test items for the depression scale that he is developing. Jay-F is in what stage of test development?
Item Analysis
The stage in which the item difficulty index, item reliability index, item validity index, and item discrimination index are analyzed.
Item Analysis
We do this after the test tryout; once the data comes back to you, you must analyze each item.
You have to analyze each item's difficulty index, reliability index, validity index, and discrimination index.
40
If Jay-F's target number of items is 20, he should have at least how many items in his item pool?
Easier
The larger the item difficulty index, the _____ the item is.
Components of Item Analysis
Item Difficulty Index
Item Discrimination Index
Item Reliability
Item Validity Index
Item Difficulty Index
It answers the question: What proportion of test takers answered each item correctly?
Item Difficulty Index
How many of our test takers are getting the correct answer?
Ex. For item number 1, how many of our test takers got the correct answer?
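A minimal Python sketch of the item difficulty index on fabricated 0/1 responses (the per-item pass rates are invented): for each item, p is the number of test takers who answered it correctly divided by the total number of test takers, so a larger p means an easier item.

import numpy as np

rng = np.random.default_rng(5)
# rows = 100 test takers, columns = 5 items (1 = correct, 0 = wrong)
responses = (rng.random((100, 5)) < [0.9, 0.7, 0.5, 0.3, 0.1]).astype(int)

p = responses.mean(axis=0)   # proportion who answered each item correctly
for j, pj in enumerate(p, start=1):
    print(f"item {j}: p = {pj:.2f}")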