Mean
The sum of all values in a group divided by the number of values in that group
The most common type of average _______
The mean
What are the measures of Central tendency?
Mean, median, and mode
Median
The 50th percentile; the point below which 50% of the scores fall and above which 50% fall
Mode
The value that occurs most frequently
What is the value that is the most general and is the least precise?
The mode
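To make the three measures of central tendency concrete, here is a minimal sketch using Python's standard `statistics` module. The `scores` list is made-up data for illustration only.

```python
from statistics import mean, median, mode

# Hypothetical set of test scores (not from any real test)
scores = [70, 80, 80, 90, 100]

avg = mean(scores)     # sum of the scores divided by the number of scores
mid = median(scores)   # middle value once the scores are sorted
most = mode(scores)    # value that occurs most frequently

print(avg, mid, most)
```

Here the median and mode happen to coincide while the mean is pulled slightly higher by the score of 100.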
Variability
The amount of spread or dispersion in a set of scores
Variance
How much each score differs from the mean, on average, in squared units; the square of the standard deviation
Standard deviation
Roughly the average deviation of scores from the mean; the square root of the variance
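The relationship between variance and standard deviation can be sketched as follows. This uses the population formulas (`pvariance`, `pstdev`), which is an assumption; a textbook may instead divide by n - 1 for a sample. The `scores` list is hypothetical.

```python
from statistics import pvariance, pstdev

# Hypothetical scores with a mean of 5
scores = [2, 4, 4, 4, 5, 5, 7, 9]

var = pvariance(scores)  # average of the squared deviations from the mean
sd = pstdev(scores)      # square root of the variance

print(var, sd)
```

Squaring `sd` gives back `var`, which is the sense in which the variance is the square of the standard deviation.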
Norm referenced scores
scores that have meaning when compared to each other
Norms
A set of scores that represents a collection of individual performances
Criterion-referenced scores
Scores whose interpretation as “good” or not is based on whether they meet a predetermined standard
Cut scores
A score on a test that has been predetermined as a standard or criterion. People above or below a cut score are placed in some category of performance, like pass or fail.
Percentile
a point in a distribution of scores below which a given percentage of scores falls
What does a percentile tell us?
It tells us the percentage of scores that are below a certain point.
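A percentile can be computed directly from that definition. The helper name `percentile_rank` and the data are hypothetical, and the sketch counts only scores strictly below the given point (some textbooks also count half of the tied scores).

```python
def percentile_rank(scores, value):
    """Return the percentage of scores that fall strictly below `value`."""
    below = sum(1 for s in scores if s < value)
    return 100 * below / len(scores)

# Hypothetical distribution of ten scores
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]

rank = percentile_rank(scores, 85)
print(rank)
```

Six of the ten scores fall below 85, so a score of 85 sits at the 60th percentile of this small group.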
Normal curve (bell-shaped curve)
a distribution of scores that is symmetrical about the mean, median, and mode and has asymptotic tails
Characteristics of a bell curve
The normal curve represents a distribution of values where the mean, median, and mode are equal to one another. If the mean and median differ, the distribution is skewed (normal curves are not skewed)
The normal curve is perfectly symmetrical about the mean
The tails of the curve are asymptotic (they never touch the horizontal axis)
Z score
The number of standard deviations between a raw score and the mean
How is a z score calculated?
A raw score is transformed by subtracting the mean from the raw score and then dividing that difference by the standard deviation of the set of scores.
How is a t-score calculated?
T= 50 + 10z
What is t-score?
It is a standard score that has a mean of 50 and a standard deviation of 10
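The z-score and T-score calculations described above can be sketched in a few lines. The function names and the example numbers (a raw score of 85 on a test with mean 75 and standard deviation 5) are hypothetical.

```python
def z_score(raw, mean, sd):
    """Subtract the mean from the raw score, then divide by the standard deviation."""
    return (raw - mean) / sd

def t_score(z):
    """Rescale a z score to a scale with mean 50 and standard deviation 10."""
    return 50 + 10 * z

z = z_score(85, 75, 5)   # raw score is two standard deviations above the mean
t = t_score(z)

print(z, t)
```

A z score of 0 always maps to a T score of 50, and each standard deviation adds or subtracts 10 points.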
Why are z scores useful?
They allow us to calculate the probability of a score occurring within a normal distribution
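As an illustration of how a z score leads to a probability, the sketch below uses `statistics.NormalDist` from the Python standard library and assumes the scores are normally distributed. The chosen z values are arbitrary examples.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Proportion of scores expected to fall below z = 1.0
below_one = standard_normal.cdf(1.0)

# Proportion of scores expected within one standard deviation of the mean
within_one = standard_normal.cdf(1.0) - standard_normal.cdf(-1.0)

print(round(below_one, 4), round(within_one, 4))
```

Roughly 84% of scores fall below z = 1, and roughly 68% fall between z = -1 and z = 1.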
SEM (standard error of measurement)
A simple measure of how much observed scores vary from a true score
How are the SEM and reliability related?
The larger the SEM, the lower the reliability of the test and the less precision there is of the measures taken and scores obtained
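The inverse relationship between SEM and reliability can be seen in the usual classical test theory formula, SEM = SD × sqrt(1 − reliability). That formula is not stated in these cards, so treat it as an assumption of this sketch; the numbers are hypothetical.

```python
import math

def sem(sd, reliability):
    """Standard error of measurement from a test's SD and reliability coefficient."""
    return sd * math.sqrt(1 - reliability)

# Same standard deviation, two hypothetical reliability coefficients
sem_high_reliability = sem(10, 0.91)  # about 3 score points
sem_low_reliability = sem(10, 0.64)   # about 6 score points

print(sem_high_reliability, sem_low_reliability)
```

With the standard deviation held constant, the less reliable test has the larger SEM, so observed scores are less precise estimates of the true score.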
What are stanine scores?
Standardized scores from 1 to 9 that are commonly used in psychometric testing; a way to convert scores onto a nine-point scale
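One common way to place a score on the nine-point scale, assuming the scores are roughly normal, is to start from a z score: the stanine scale has a mean of 5 and a standard deviation of 2, with each band half a standard deviation wide. The function name `stanine` is hypothetical, and this is a sketch of that convention rather than a procedure taken from these cards.

```python
import math

def stanine(z):
    """Convert a z score to a stanine (1-9): mean 5, SD 2, half-SD-wide bands."""
    raw = math.floor(2 * z + 5.5)  # round 2z + 5 to the nearest whole number
    return max(1, min(9, raw))     # clamp extreme scores into the 1-9 range

print(stanine(0.0), stanine(1.0), stanine(-3.0), stanine(3.0))
```

An average score (z = 0) lands in stanine 5, and very extreme scores are folded into the end categories 1 and 9.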
Item Response Theory
An advance over Classical test theory that extends the definition of reliability as a function of the interaction between an item and the characteristics of the individual responding to the item
What do the x- and y-axes represent in IRT?
The x-axis represents the construct, the latent or underlying trait that the individual test taker brings to the item itself
the y-axis represents the probability of getting the item correct.
What is an advantage of IRT over Classical Theory?
It focuses on and estimates the ability of the test taker independent of the difficulty of the items
How do we determine the worthiness of an item in IRT?
Using the a and b values. Items are evaluated and can be refined, placed once again in the pool of test items, and reevaluated until they meet the criteria we use to define good items
What are the two characteristics that distinguish items from one another and allow us to pass judgment on whether an item is a “good” one?
Difficulty level and Discrimination level
Difficulty level (b)
It is the point on the ability (θ) scale at which a test taker has a 50% probability of getting the item correct; the higher the b value, the harder the item
Discrimination level (a)
It is how well an item distinguishes between test takers at different ability levels
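The roles of a and b are easiest to see in an item characteristic curve. The sketch below assumes the two-parameter logistic (2PL) model, which these cards do not name explicitly; the function name `p_correct` and all parameter values are hypothetical.

```python
import math

def p_correct(theta, a, b):
    """2PL item characteristic curve: probability of a correct response at ability theta."""
    return 1 / (1 + math.exp(-a * (theta - b)))

# When ability equals the item's difficulty, the probability is exactly 0.5
at_difficulty = p_correct(theta=1.0, a=1.5, b=1.0)

# Compare a steep item (a = 2.0) with a flat item (a = 0.5), both with b = 0
steep_gap = p_correct(1.0, 2.0, 0.0) - p_correct(-1.0, 2.0, 0.0)
flat_gap = p_correct(1.0, 0.5, 0.0) - p_correct(-1.0, 0.5, 0.0)

print(at_difficulty, round(steep_gap, 3), round(flat_gap, 3))
```

Under this model, b shifts the curve left or right along the θ axis, while a larger a makes the curve steeper, so test takers at different ability levels differ more in their probability of a correct response.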
What tool led to more use of IRT?
Computers, as they made the calculations much easier
When does the test developer consider the test development process complete?
When each item fits the difficulty and discrimination levels that the test author feels are adequate.
What does a steep curve on an item characteristic curve indicate?
It indicates that the discrimination level is high: there is a large difference in the probability of a correct response for test takers whose theta values differ.
The slope of the curve tells us _____
the discrimination level
What is the vertical straight line?
It is the y-axis, which represents the probability of getting the item correct, P(θ)
What is the horizontal straight line?
the x-axis that represents the latent trait.
What do achievement tests measure?
They measure how much someone knows or has learned.
What is the most common type of test taken?
Achievement tests
What process does standardized testing undergo?
It undergoes a process to determine psychometric value including validity, reliability, and standard error
What are teacher made tests?
Tests constructed by a teacher
What can be said about teacher-made tests?
They are very situation specific and defined to suit a particular need
What process have standardized tests undergone?
Extensive test development: writing and rewriting of items; hundreds of administrations; development of reliability and validity data; norming with what are sometimes very large groups of test takers; and development of consistent directions, administration procedures, and very clear scoring instructions
Norm-referenced tests
Tests where an individual’s test performance is compared with the test performance of other individuals
Criterion-referenced tests
Tests where there is a predefined level of performance used for evaluation
The ABCs of creating a standardized test
Preliminary ideas- the stage where the test developer considers the possible topics that might be covered, the level of coverage, and every other factor that relates to what may be on the finished test.
Test specifications- a complex process that allows the test developer to understand the relationship between the level of items and the content of the items.
Items are written
Items are used in a trial setting- instructions are drafted, participants are located, actual preliminary tests are constructed, items are tried, and items are analyzed
Test developers rewrite items
Final tests are assembled
An extensive national standardization effort
Preparation of all the necessary materials
What is the table of specifications?
It is a grid that serves as a guide to the construction of an achievement test
How many levels of abstraction in Bloom’s taxonomy?
Six levels
Knowledge- focuses on the recall of information: knowledge of dates, events, and places, as well as certain major ideas.
Comprehension- focuses on the understanding of information and requires the test taker to interpret facts, compare and contrast different facts, infer cause and effect, and predict the consequences of a certain event
Application- requires the use of information, methods, and concepts, as well as problem solving
Analysis- requires the test taker to look for and see patterns among parts, recognize hidden meanings, and identify the parts of a problem
Synthesis- requires the test taker to use old ideas to create new ones and to generalize from given facts
Evaluation- requires the test taker to compare and discriminate between ideas and make choices based on a reasonable and well-thought-out argument
Words that knowledge-level questions might use
list, define, tell, describe, identify, show, label, collect, examine, tabulate, quote, name, who, when and where (in question format)
Words that comprehension questions might use
summarize, describe, interpret, contrast, and predict
Words that application level questions might use
apply, demonstrate, calculate, complete, illustrate, and show
Words that analysis-level questions might use
analyze, separate, order, explain, and connect
Words that synthesis level questions might use
combine, integrate, modify, rearrange, and substitute
Words that evaluation level questions might use
assess, decide, rank, recommend, and convince
What is an aptitude test?
It measures an individual’s potential
What do aptitude tests measure?
They assess cognitive skills and knowledge
They also assess psychomotor performance
What are the types of aptitude tests?
Mechanical aptitude tests
Artistic aptitude tests
Readiness aptitude tests
Clerical aptitude tests
Mechanical Aptitude test
Focus on a variety of abilities that fall into the psychomotor domain
Artistic aptitude test
Tests that evaluate artistic talent: music, drawing, and other forms of creative expression
Readiness aptitude test
A test to assess the developmental condition of an individual to determine whether or not a person is able to move on to the next phase of their education.
What did Gesell do?
He developed readiness tests that assess whether a child is ready for school based on developmental periods
Clerical aptitude test
Tests that measure how well an individual performs tasks associated with administrative or clerical office work
What is the Differential Aptitude test?
A test that focuses on abilities and skills, such as verbal, numerical ability, abstract reasoning, mechanical reasoning, and space relations
used primarily in educational counseling and personnel assessment
What does the term “high stakes” refer to in regard to testing?
It refers to tests that have a high-risk, high-reward impact, such as admission into medical school or receiving certification for a job.
Which validity is most used when validating an achievement test?
Content-based validity
What is the GRE?
A test designed to assess the verbal, quantitative, and analytical reasoning abilities of graduate school applicants
Who: college juniors and seniors making applications for graduate school
What is the TerraNova?
Designed to measure achievement in the basic skills taught in schools throughout the nation
Used for K-12
Important: The teacher assesses not only achievement but also the process that goes into thinking about the items on the test
What are the Iowa Assessments (ITBS)?
Tests designed to provide a comprehensive assessment of student progress in the basic skills
K-8
What is the GED?
The GED was designed to “assess skills representative of the typical outcomes of a traditional high school education”
No level specified
Note: first developed to assist veterans who didn’t have time to complete high school
What is the Denver developmental screening?
It is designed to screen for developmental delays
When: Birth to age 6
Note: Is now called Denver II
What do plus and minus mean regarding correlation?
No test can have a modicum of reliability with a coefficient less than .00, so we just dispense with that idea and consider reliability coefficients to be worth considering only when they are positive