EDU 423: Assessment & Evaluation - Standardized Tests

Types of Tests

Tests can be categorized into several types:

Published Achievement Tests: These are standardized and empirically documented.
- Multilevel survey batteries: Annually administered tests that survey students' general educational growth or basic skill development in several curricular areas. "Multilevel" means the test content spans several grade levels and "battery" means that several curricular areas are assessed by different subtests.
- Multilevel criterion-referenced tests: They provide detailed information about students' status for a well-defined domain of performance in a single subject area (e.g., mathematics). The test spans several grade levels.
- Non-criterion-referenced tests: They assess students in a broader way than subtests in a survey battery.
- Single-level standardized tests: Developed for assessing achievement at only one educational level or for one course (e.g., Algebra I). Usually, they are stand-alone tests.
State-mandated customized tests: Developed by publishers of standardized multilevel survey batteries for use only in a particular state. These tests are customized and aligned with the state's standards for accountability purposes. They typically cover grades 3 through 12 and assess reading, language arts, mathematics, and science.
Non-standardized tests: Estimate students' status with respect to a well-defined domain of performance (usually specified by specific behavioral objectives), but they lack standardization and empirical documentation of worth. Textbook or curricular accompaniments fall into this category.
Teacher-Made Tests: Crafted by teachers to measure the specific learning targets that the curriculum emphasizes and help in making day-to-day instructional decisions.

State-Mandated and Customized Tests

State-mandated and customized tests must align with standardized exams and track students' improvement over time.

Non-Standardized Exams

When curriculum materials contain appealing assessment tasks, it's important not to accept their quality at face value. Review any tests or quizzes carefully for correctness, importance, and match to standards and learning targets, because scores on them feed into students' grades.

Standardized Tests Definition

A test is standardized when all students taking the test respond to the same set of carefully selected questions, allowing for comparisons among groups of students. Questions are typically multiple-choice or true-false to ensure fairness and objectivity in scoring. Creating and interpreting standardized test results requires expertise in curriculum, child development, cultural and linguistic differences, statistics, and psychometrics.

Examples of Standardized Tests

Common Core State Testing: Assesses students in math, reading, foreign language, economics, the arts, and physical education throughout the K-12 experience. Results influence teacher evaluation, education policy, and funding.
TerraNova: A series of student achievement tests by McGraw-Hill for students in K-12 to assess their understanding of reading, language arts, math, science, social studies, vocabulary, spelling, and other areas.
International Baccalaureate (IB) Exam: Important internationally and can play a role for students in IB schools in the U.S. It can replace the SAT and give students college credit in languages and other subjects.
Stanford Achievement Test (SAT 10): Produced by Pearson and used for assessing skills in reading comprehension, mathematics, problem-solving, language, spelling, listening comprehension, science, and social science. Less common as states develop their own standardized tests.
STAR Tests: Created by Renaissance Learning for use in K-12 education. They are computer-based and use adaptive technology to evaluate students in reading, early literacy, or math. Used to monitor student progress and prepare students for state and high-stakes tests.
PARCC (Partnership for Assessment of Readiness for College and Careers) and SBAC (Smarter Balanced Assessment Consortium) Tests: New computer-based tests that may rely on computer skills and require research, writing, and problem-solving, though they still include multiple-choice questions.
New York Performance Standards Consortium Test: Performance-based assessments are used instead of standardized tests, asking students to write essays and research papers, do science experiments, and create applied math problems. This approach has shown positive impacts, such as reduced dropout rates and increased college attendance.
Learning Record and Work Sampling System: Innovative assessments that draw on a student's work in class to measure progress, with teachers attaching scores to writing samples or science experiments.

Benefits of Standardized Tests

Standardization is necessary for results to be comparable from time to time, place to place, and person to person. A standardized assessment procedure allows for better interpretation of students' scores.

Empirical data document how well an assessment procedure actually works, providing test developers with a basis for:

- Improving and selecting tasks
- Establishing reliability and validity
- Describing how well the assessment works in the target population of students
- Creating scales to measure growth
- Equating scores (making scores comparable from grade to grade and from one form of the assessment to another)
- Developing a variety of norm-referenced scores
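
To make the idea of equating concrete, here is a minimal sketch of linear equating, one common textbook method and not necessarily the one any particular publisher uses. It places a raw score from one form on the scale of another form by matching z-scores. The function name and the score lists are hypothetical, invented for illustration only.

```python
from statistics import mean, pstdev

def linear_equate(score_x, form_x_scores, form_y_scores):
    """Map a Form X raw score onto the Form Y scale by matching z-scores.

    Assumes both forms were given to comparable groups of students and that
    Form X scores are not all identical (standard deviation > 0).
    """
    mean_x, sd_x = mean(form_x_scores), pstdev(form_x_scores)
    mean_y, sd_y = mean(form_y_scores), pstdev(form_y_scores)
    # A score that is k standard deviations from the Form X mean is placed
    # k standard deviations from the Form Y mean.
    return mean_y + (sd_y / sd_x) * (score_x - mean_x)

# Hypothetical raw scores from two forms of the same test.
form_x = [10, 12, 14, 16, 18]
form_y = [20, 23, 26, 29, 32]

for raw in (12, 14, 16):
    print(raw, "on Form X is about", round(linear_equate(raw, form_x, form_y), 1), "on Form Y")
```

Operational equating by test publishers relies on much larger samples and more elaborate designs; the sketch only shows why equated scores from different forms can be compared.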

Features of Standardized Tests

Test development features: Manuals and materials describe for each subtest the content and learning targets covered, types of norms and how they were developed, type of criterion-referencing provided, reliability data, and techniques used to screen items for offensiveness and possible gender, ethnic, and racial bias.
Test administration features: Tests generally have two equivalent forms, require a total administration time of 2 to 3 hours spread among several testing sessions over several school days, provide practice booklets for students to use before being tested, have separate, machine-scorable answer sheets for upper grades, and permit both in-level and out-of-level testing.
Test norming features: Tests generally use broadly representative national sampling for norms development and provide both fall and spring individual student norms. Sometimes special norms such as large-city norms, norms for students in special government entitlement programs, norms for high-income communities, norms for nonpublic schools, regional norms, and norms for school building averages are provided.
Test score features: Tests provide raw scores for each subtest and norm-referenced scores such as percentile ranks, normal curve equivalents, extended normalized standard scores, and grade-equivalents (or some similar grade-level indicator score).
Test score reporting and interpretation features: Tests generally have interpretive manuals for teachers, school administrators, and/or counselors. Most group tests provide computer-prepared narrative reports that contain summaries of district, school building, and classroom test results.
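
Two of the norm-referenced scores named under test score features can be illustrated with a short sketch. It computes a percentile rank using the usual midpoint definition (percentage of the norm group scoring below, plus half of those scoring exactly at the score) and converts it to a normal curve equivalent with NCE = 50 + 21.06z, where z is the normal deviate for that percentile rank. The function names and the ten-student norm group are hypothetical; real norm groups contain thousands of students.

```python
from statistics import NormalDist

def percentile_rank(score, norm_scores):
    """Percent of the norm group below the score, plus half of those at it."""
    below = sum(s < score for s in norm_scores)
    equal = sum(s == score for s in norm_scores)
    return 100 * (below + 0.5 * equal) / len(norm_scores)

def normal_curve_equivalent(pr):
    """Convert a percentile rank to a normal curve equivalent (NCE)."""
    # NCEs run from 1 to 99, so the percentile rank is clamped to that range;
    # this also keeps the inverse normal function away from 0 and 1.
    pr = min(max(pr, 1), 99)
    z = NormalDist().inv_cdf(pr / 100)
    return 50 + 21.06 * z

# Hypothetical raw scores for a tiny norm group.
norm_group = [12, 15, 15, 18, 20, 22, 25, 27, 30, 33]

pr = percentile_rank(20, norm_group)
print("Percentile rank:", pr)
print("NCE:", round(normal_curve_equivalent(pr), 1))
```

Percentile ranks and NCEs coincide at 1, 50, and 99 but differ elsewhere, because NCEs form an equal-interval scale while percentile ranks do not.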

Uses of Standardized Tests

Describe the educational developmental levels of each student and use this information to modify or adapt teaching to accommodate individual students' needs.
Describe specific qualitative strengths and weaknesses in students and use this information to remediate deficiencies and capitalize on strengths.
Describe the extent to which a student has achieved the prerequisites needed to go on to new or advanced learning, combining these results with the student's classroom performance.
Describe students' achievement of specific learning targets and use students' performance on clusters of items to make immediate teaching changes.
Provide students with operational descriptions of what kinds and levels of performances are expected of them. Discuss these expectations with students and how you can work with them to fulfill them.
Provide students and parents with feedback about students' progress toward learning goals and use this information to establish a plan for home and school to work together.
Help school officials make decisions about needed curriculum or instructional changes. The results provide one important piece of information if school officials judge the tests to be relevant and important to the goals of the local community.
Help educational evaluators compare the relative effectiveness of alternate methods of instruction and describe some of the factors mitigating their effectiveness.
Help educational researchers describe the relative effectiveness of innovations or experiments in education.
Help school superintendents describe to school boards and other stakeholders the relative effectiveness of the local educational enterprise.

Criticisms of Standardized Tests

Standardized tests draw criticism, much of it directed at how their results are used.

Misuse of Standardized Assessments

Placing a student in a special instructional program solely on results from a standardized achievement test.
Retaining a student in a grade solely on the results from a standardized test.
Judging an entire school program's quality solely on the basis of the results from a standardized achievement test.
Attributing a student's poor assessment results to only one cause.
School officials or parents trying to blame the teacher if the class does poorly on a standardized test.
Using a survey achievement battery to prescribe specific content teachers should teach at certain grade levels.

How to Administer Standardized Tests

You must prepare yourself and your students for taking a standardized assessment, whether this assessment is a paper-and-pencil test or a performance test.

Prepare your students by making sure they know that they will be assessed, what they will be assessed on, why they are being assessed, and how their results will be used. Students should be prepared to do their best.
You must also be prepared to administer, and perhaps to mark, the assessments.