Assessing Reading – Comprehensive Notes

Reading in Language Assessment

  • Written language remains crucial for information transfer, entertainment, social/legal codification, despite growth of visual & auditory media.
  • In literate societies most children read by age 5–6; reading often taken for granted.
  • Foreign-language programs presume reading; most standardized tests use written stimuli; even oral exams embed reading segments.
  • Reading is paramount for academic success; therefore assessment of reading is central to general language ability testing.
  • Two hurdles for L2 readers:
    • Master bottom-up decoding (letters → words → phrases) AND top-down comprehension strategies.
    • Build content & formal schemata (background + genre knowledge).
  • Assessment must go beyond comprehension products to include strategic pathways; failure may stem from weak strategies (e.g. not noticing the discourse conventions of technical reports).
  • Reading is unobservable; all evaluation is inferential—can only assess via external tasks.

Genres (Types) of Reading

  • Each genre has its own conventions; anticipating these boosts efficiency.
  • Abridged list (forms part of test specifications):
    • Academic: journal articles, lab reports, reference works, textbooks/theses, essays, test directions, editorials.
    • Job-related: messages, emails, memos, evaluations, schedules, forms, financial docs, manuals.
    • Personal: newspapers, letters/greeting cards, lists & notes, travel schedules, recipes/menus/maps, ads, fiction & poetry, financial docs, medical/immigration forms, comics.
  • Content validity hinges on genre choice (e.g. tourism students → guides, maps, schedules).

Micro- & Macro-Skills + Strategies

  • Microskills (1-7): discriminate graphemes, retain chunks in short-term memory, read at an efficient rate, recognize word classes/systems, recognize alternative grammatical realizations of a meaning, recognize cohesive devices.
  • Macroskills (8-14): recognize rhetorical forms & communicative functions, infer implicit context, link events/ideas (cause-effect, main/supporting), distinguish literal vs implied meaning, decode cultural references, deploy strategy battery (skimming, scanning, discourse markers, lexical inference, schemata activation).
  • Strategy taxonomy (sample): identify purpose, apply spelling rules, lexical analysis (prefix/root/suffix), guess meaning, skim for gist, scan for specifics, rapid silent reading, use graphic organizers, parse literal vs implied, exploit discourse markers.

Four Types of Reading Performance

Type         Length            Focus                        Processing Emphasis
Perceptive   very short        form (letters/words)         bottom-up
Selective    short-medium      lexico-grammatical features  mix
Interactive  paragraph-page    meaning + form               top-down > bottom-up
Extensive    >1 page to books  global meaning               top-down
(See Figure 8.1: dots showing strong ⚫ vs moderate emphasis.)

Designing Assessment Tasks

Perceptive Reading

  • Goal: literacy fundamentals—recognize letters, punctuation, grapheme-phoneme links.
  • Typical tasks:
    • Reading aloud (letters/words/sentences).
    • Written reproduction.
    • Multiple-choice/same-different/minimal-pairs; grapheme recognition.
    • Picture-cued word or sentence identification; T/F, matching, MC picture selection.
  • Cautions: separate reading vs writing errors in written response.

Selective Reading

  • Focus on form (vocabulary, grammar, some discourse).
  • Common formats:
    • MC vocabulary/grammar (context-free or contextualized; can use rational cloze).
    • Matching word ↔ definition, or sentence-fill matching; pragmatic label matching.
    • Editing single-sentence errors (MC underlined-part selection).
    • Picture-cued lexical tests at higher complexity.
    • Gap-fill / sentence-completion (beware writing confound; hard to score).

Interactive Reading

  • Text length: paragraph(s); tasks combine meaning & form.
Cloze
  • Deletion every 7±2 words (fixed-ratio) OR rational deletions (grammar/discourse).
  • Scoring: exact-word vs appropriate-word; trade-off reliability vs face validity.
  • Variants: C-test (obliterate 2nd ½ of every other word); cloze-elide (insert intruders).
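The cloze variants above are mechanical enough to sketch in code. The Python below is a minimal illustration, not a published procedure. The function names, the `skip` parameter (words left intact at the start of the passage), the five-underscore blank, the `acceptable` answer map, and the rule that odd-length words lose their larger half in the C-test are all assumptions made for this example.

```python
PUNCT = ".,;:!?\"'()"  # punctuation kept outside the blank (assumed set)

def make_fixed_ratio_cloze(text, n=7, skip=0):
    """Blank out every n-th word after `skip` intact words.

    Returns the mutilated text and the list of deleted words (the key).
    """
    answers, out = [], []
    for i, token in enumerate(text.split()):
        core = token.strip(PUNCT)          # word without edge punctuation
        position = i - skip + 1            # 1-based count after the lead-in
        if core and position > 0 and position % n == 0:
            answers.append(core)
            out.append(token.replace(core, "_____", 1))
        else:
            out.append(token)
    return " ".join(out), answers

def make_c_test(text, skip=0):
    """Obliterate the second half of every other word.

    Assumption: for odd-length words the larger half is removed, and
    one-letter words are left intact.
    """
    answers, out = [], []
    for i, token in enumerate(text.split()):
        core = token.strip(PUNCT)
        position = i - skip + 1
        if len(core) > 1 and position > 0 and position % 2 == 0:
            keep = len(core) // 2
            damaged = core[:keep] + "_" * (len(core) - keep)
            answers.append(core)
            out.append(token.replace(core, damaged, 1))
        else:
            out.append(token)
    return " ".join(out), answers

def score_cloze(responses, answers, acceptable=None):
    """Count correct blanks.

    Exact-word scoring: leave `acceptable` empty, so only the original
    word counts. Appropriate-word scoring: pass a dict mapping a blank's
    index to extra words the test designer has agreed to accept.
    """
    acceptable = acceptable or {}
    score = 0
    for idx, (response, key) in enumerate(zip(responses, answers)):
        allowed = {key.lower()} | {w.lower() for w in acceptable.get(idx, ())}
        score += response.strip().lower() in allowed
    return score
```

Calling `score_cloze` once without and once with an `acceptable` map on the same responses shows the trade-off noted above: the exact-word key needs no rater judgment, while the appropriate-word key credits sensible answers but depends on who compiled the list.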
Impromptu reading + comprehension Qs
  • Classic "read passage, answer MC"; TOEFL™ specs test: main idea, vocab-in-context, inference, detail, exclusion, grammar reference, supporting ideas.
  • Computer-based extras: click on reference word, place sentence, choose graphic.
Short-answer (open-ended)
  • Easier to write than MC; need rubrics & consistent scoring.
Contextualized Editing (multi-sentence)
  • E.g. Imao (2001): 32–56 items, one error per sentence; MC underlined error; diagnostic sub-scores (sentence structure, verb tense); r = .76 correlation to other skills.
Scanning
  • Provide article/table etc.; locate names, dates, stats (e.g. p<.05).
  • Timing may be part of score.
Ordering / Strip-story
  • Arrange sentences logically; multiple valid orders possible—better as formative.
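Because more than one order can be valid, a strip-story key may need to hold several acceptable sequences. The sketch below shows one possible way to score against such a key. The partial-credit rule (share of adjacent sentence pairs that match the closest acceptable order) and the function names are assumptions for illustration, not a method taken from the chapter.

```python
def adjacent_pairs(order):
    """Return the set of (earlier, later) neighbours in a sequence."""
    return set(zip(order, order[1:]))

def score_ordering(response, acceptable_orders):
    """Score a learner's ordering against several acceptable orders.

    Returns a value from 0.0 to 1.0: the best share of adjacent pairs
    that the response has in common with any acceptable order.
    """
    response_pairs = adjacent_pairs(response)
    best = 0.0
    for key in acceptable_orders:
        key_pairs = adjacent_pairs(key)
        if not key_pairs:          # a one-sentence key has nothing to order
            continue
        best = max(best, len(response_pairs & key_pairs) / len(key_pairs))
    return best

# Hypothetical four-strip story in which B and C may be swapped.
acceptable = [["A", "B", "C", "D"], ["A", "C", "B", "D"]]
full_credit = score_ordering(["A", "C", "B", "D"], acceptable)
partial_credit = score_ordering(["A", "B", "D", "C"], acceptable)
```

Listing every defensible order in advance is itself a judgment call, which is one reason the task sits more comfortably in formative use.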
Information-transfer (graphics)
  • Interpret charts, maps, family trees, menus, stock tables.
  • Task types: identify info, elaborate, infer, match passage ↔ graphic, create graphic from reading.

Extensive Reading

  • Texts: journal articles, essays, reports, short stories, books.
  • Assessments integrate reading with writing/speaking; emphasize global meaning.
Skimming Tests
  • Learner skims long text rapidly; questions on main idea, genre, purpose, difficulty, utility.
  • Mostly formative, high washback.
Summarizing
  • Write 100-150-word synopsis (main + supporting ideas).
  • Imao rubric: accuracy, own words, organization, language clarity.
  • Holistic reading-comprehension scale (3-0).
Responding / Critique
  • Write opinion essay agreeing/disagreeing with article; judge accuracy of content reflection.
Note-taking / Outlining
  • Evaluate learners’ marginal notes or outlines; informal diagnostic of strategies.

Validity, Reliability, Practicality, Washback & Ethics

  • Reading tests are inferential; must triangulate tasks to cover micro & macro skills.
  • Content validity via relevant genres and objectives; construct validity via strategic components.
  • Reliability increases with clear scoring procedures (e.g. exact-word cloze, MC grid scoring).
  • Practicality: MC & computer-scored formats efficient; open-ended require trained raters.
  • Washback: tasks like contextualized editing or strip-story promote classroom discussion & strategy training.
  • Ethical considerations:
    • Avoid cultural bias in texts & graphics.
    • Provide accommodations for learning disabilities.
    • Be transparent on scoring criteria; respect formative vs summative stakes.

Connections & Implications

  • Builds on previous chapters: the reading typology (perceptive → extensive) parallels the listening/speaking typologies.
  • Reinforces foundational psycholinguistic principles (bottom-up vs top-down, schemata theory).
  • Mirrors real-world literacy demands: academic study, workplace documents, personal survival.
  • Highlights interdisciplinary relevance (e.g. statistical tables require numeracy).
  • Supports holistic curriculum design tying reading tasks to writing, speaking, critical thinking.

Key Numerical & Statistical References

  • TOEFL study: correlation between error-detection and listening r = .58; error-detection and reading r = .76.
  • Example significance level in tables: p<.05, indicating statistical significance in research graphics.
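For readers unsure what the r values above measure, the following sketch computes a Pearson correlation coefficient by hand and converts r to shared variance (r squared). It assumes the reported figures are Pearson coefficients, which the notes do not state explicitly. The score lists are invented purely for illustration and are not data from the study.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return covariation / (spread_x * spread_y)

def shared_variance(r):
    """Proportion of variance two measures share, given their correlation."""
    return r * r

# Hypothetical scores for five learners (made up for this example).
error_detection = [12, 15, 9, 20, 17]
reading = [30, 34, 25, 41, 33]
r = pearson_r(error_detection, reading)

# Shared variance implied by the two coefficients reported in the notes.
reading_overlap = round(shared_variance(0.76), 2)    # 0.58
listening_overlap = round(shared_variance(0.58), 2)  # 0.34
```

Read this way, an r of .76 implies that error-detection and reading scores share roughly 58% of their variance, against roughly 34% for listening.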