Foreign Language Teaching – Errors, Assessment & Feedback
Definitions & Core Terminology
- Assessment
- Ongoing measurement (German: "Leistungsmessung") that supplies information on how teaching/learning processes are progressing and how they can be improved.
- Diagnostic in nature → informs next instructional steps.
- Evaluation
- Decides how well something met a standard → grading / judging ("Leistungsbeurteilung").
- Summative snapshot of learning results.
- Testing ("Leistungsüberprüfung")
- Concrete, often time-bound instruments (exams, quizzes) that generate data for evaluation.
- Relationship (cf. Finkbeiner 2012)
- Testing feeds data ➜ Assessment interprets ➜ Evaluation judges.
Achievement as a Social & Pedagogical Parameter
- Performance Principle (Leistungsprinzip) – societal functions
- Regulates the distribution of rewards: status and money.
- Encourages productivity, prosperity.
- Sorts individuals into social / professional positions.
- Pedagogical Performance Concept (pädagogischer Leistungsbegriff)
- Student’s right to individual care & holistic support.
- Focus on learner as person within a community, not a grade-producing entity (Jürgens & Sacher 2008).
- Tension: societal sorting vs. educational nurturing.
Functions of Assessment / Evaluation
- Diagnosis – identify strengths/weaknesses, prior knowledge, misconceptions.
- Information – for students, parents, teachers, system.
- Differentiation – place learners, group formation, streaming.
- Education / Motivation – foster metacognition, goal orientation.
- Reflective prompt in lecture: Recall memorable school tests → which functions dominated? Which functions will the tests you set as a future teacher serve?
Reference Norms & Grading Criteria
- Norm-referenced (group/social referencing)
- Learner is compared with peers ⇒ results are typically spread along a bell curve.
- Criterion-referenced (objective/task referencing)
- Fixed descriptors or standards (rubrics, CEFR can-do statements).
- Self/Individual-referenced
- Progress measured against learner’s own previous performance.
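The three reference norms above can be contrasted with a small calculation. The following is a minimal sketch, not a prescribed grading method: the function names, the pass mark, and all scores are invented for illustration, and the z-score is only one common way to express a peer comparison.

```python
from statistics import mean, pstdev

def norm_referenced(score, class_scores):
    """Peer comparison: distance from the class mean in standard deviations."""
    sd = pstdev(class_scores)
    return 0.0 if sd == 0 else (score - mean(class_scores)) / sd

def criterion_referenced(score, pass_mark):
    """Fixed standard: True if the descriptor threshold is met, whatever peers scored."""
    return score >= pass_mark

def individual_referenced(score, previous_score):
    """Own progress: gain relative to the learner's earlier result."""
    return score - previous_score

# Invented data: one class, one learner, a hypothetical pass mark of 60.
class_scores = [40, 50, 60, 70, 80]
z = norm_referenced(50, class_scores)      # below the class mean
passed = criterion_referenced(50, 60)      # standard not yet met
gain = individual_referenced(50, 35)       # but clear personal progress
```

The same score of 50 thus reads as "below average", "not yet at the standard", and "improved by 15 points", depending on the reference norm chosen.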
Quality Parameters of a Good Test
- Validity (Gültigkeit) – measures what it claims to measure.
- Reliability (Zuverlässigkeit) – produces stable & consistent results (e.g., a reliability coefficient r_{xx'} > 0.8 is desirable).
- Objectivity (Objektivität) – scoring independent of examiner.
- Feasibility – realistic time, resources.
- Positive Washback – the test has a beneficial influence on teaching/learning.
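One common way to estimate the reliability coefficient mentioned above is the test-retest method: the same group sits the test twice and the two score lists are correlated. The sketch below assumes that method and uses invented scores; `pearson_r` is a hypothetical helper name, and other estimates (parallel forms, internal consistency) would be computed differently.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(var_x * var_y)

# Invented scores of five learners on two sittings of the same test.
first_sitting = [12, 15, 9, 18, 14]
second_sitting = [13, 14, 10, 17, 15]

r = pearson_r(first_sitting, second_sitting)
meets_rule_of_thumb = r > 0.8  # the 0.8 threshold is the one quoted in the notes
```

With only five learners the estimate is very unstable; the example illustrates the arithmetic, not a realistic sample size.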
Human / Subjective Factors Affecting Assessment
- Teacher’s prior knowledge & implicit bias.
- Student personality & cultural background.
- Interrelation of instruction & tested material.
- Test format & situational anxiety.
- Grading consequences → motivation / demotivation.
- Extenuating circumstances (illness, socio-emotional events).
Summative vs. Formative Assessment
- Summative
- When: end of unit/term.
- Purpose: certify mastery against a benchmark.
- Examples: final exams, research paper.
- Formative
- When: continuously, during learning.
- Purpose: provide feedback to adjust teaching & learning strategies.
- Examples: concept maps, exit tickets, teacher comments.
- Complementarity: Both needed; formative assessments feed summative success.
Designing Language Tests – Cycle
- Preparation – identify objectives, standards, learner profiles.
- Design – choose item types, rubrics, weightings.
- Administration – logistics, instructions, security.
- Assessment/Scoring – apply criteria, ensure reliability.
- Reflection / Follow-up – analyse efficiency, appropriateness, washback; revise.
Aligning With Objectives (WHOOD model)
- What is to be tested? Skills (listening, speaking…) & competences (intercultural, strategic…).
- How will results be used? (diagnosis, certification).
- Objectives: general educational goals (critical thinking, autonomy).
- Objects: task types, language functions, text genres.
- Design constraints: time, resources, validity.
Task Typology (Closed → Open)
- Closed Tasks (minimal freedom)
- MCQs, true/false, matching, simple cloze.
- Pros: objective scoring; Cons: taps only superficial competence.
- Semi-Open Tasks
- Guided gap-fill, guided dialogue, structured summary.
- Balance of objectivity & productive language.
- Open Tasks
- Essays, mediation, role-plays, projects ➜ simulate authentic communication.
Oral Task Bank (Eisenmann & Summer 2012)
- Presentation, mini-debate, free discussion, improvisation, interview, expert groups, storytelling, picture description, info-gap, interpreting.
- Variation across task types fosters fluency, interaction, pragmatic skills.
Self- & Peer-Assessment Instruments
- Self-Monitoring Sheets
- Checklists on eye contact, body language, structure, visual aids, time management.
- Peer Feedback Rubrics (Likert 0–4)
- Content knowledge, organization, delivery, language accuracy, timing.
- Can-Do Descriptors (CEFR)
- Students state: “I can understand familiar words…” (A1) ➜ fosters goal clarity & autonomy.
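Peer feedback rubrics on a 0–4 scale become more useful when several sheets are combined per criterion. The following is a minimal sketch of that aggregation; the criteria are the five listed above, while the function name, the dictionary layout, and the ratings are assumptions made for illustration.

```python
CRITERIA = (
    "content knowledge",
    "organization",
    "delivery",
    "language accuracy",
    "timing",
)

def average_peer_ratings(sheets):
    """Average each criterion across all peer sheets (ratings on a 0-4 scale)."""
    if not sheets:
        raise ValueError("at least one peer sheet is required")
    for sheet in sheets:
        for criterion in CRITERIA:
            if not 0 <= sheet[criterion] <= 4:
                raise ValueError(f"rating out of range for {criterion!r}")
    return {c: sum(s[c] for s in sheets) / len(sheets) for c in CRITERIA}

# Invented ratings from two peers for one presentation.
sheets = [
    {"content knowledge": 4, "organization": 3, "delivery": 2,
     "language accuracy": 3, "timing": 4},
    {"content knowledge": 3, "organization": 3, "delivery": 3,
     "language accuracy": 2, "timing": 4},
]

averages = average_peer_ratings(sheets)
# Lowest-rated criterion; on a tie, the one listed first in CRITERIA wins.
weakest = min(averages, key=averages.get)
```

The lowest average points to a possible focus for the next feedback conversation; it is a prompt for discussion, not a grade.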
Feedback – Evidence & Theory (Hattie & Timperley 2007)
- Meta-analysis effect size d=0.75 → one of the top ten influences on student achievement.
- Purpose: shrink gap between current & desired performance.
- Three Questions
- Where am I going? (Feed-Up: goals)
- How am I going? (Feed-Back: progress)
- Where to next? (Feed-Forward: strategies)
- Four Feedback Levels
- Task – correctness, completeness.
- Process – methods, strategies.
- Self-Regulation – planning, monitoring, evaluating one’s learning.
- Self – personal praise/affect (least effective alone).
- Sandwich Technique
- Positive comment
- Constructive critique (specific)
- Closing positive motivator.
Error Treatment in FL Classrooms
- Categories
- Slip – performance glitch, easily self-corrected if noticed.
- Mistake – lapse; learner knows the rule but momentarily fails.
- Error – competence gap; rule unknown.
- Timing & Focus
- Presentation/Practice phase → priority on form & accuracy.
- Production/Communication phase → priority on meaning & fluency; delayed correction.
- Oral Correction Techniques
- Recasts – teacher reformulates correctly.
- Clarification Requests – “Pardon?” prompts repair.
- Elicitation – teacher pauses, raises intonation; student supplies form.
- Postponed Correction – note errors, address after task.
- Written Correction
- Use error codes (VT = verb tense, WW = wrong word…).
- Distinguish gravity (Hughes & Lascaratou 1982) – global vs. local errors.
- Avoid “bleeding papers to death” ➜ selective, prioritized marking.
- Risk of No / Poor Correction
- Fossilization, demotivation, "imprecise unsystematic grading".
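Selective, prioritized marking presupposes knowing which error types dominate in a text. One way to find out is to tally the error codes written in the margin, as in the minimal sketch below. Only the two codes named above (VT, WW) are used; the function name and the list of codes are invented for illustration.

```python
from collections import Counter

def tally_error_codes(margin_codes):
    """Return (code, count) pairs, most frequent error type first."""
    return Counter(margin_codes).most_common()

# Invented margin codes from one learner text.
codes = ["VT", "WW", "VT", "VT", "WW"]
ranking = tally_error_codes(codes)
most_frequent_code, occurrences = ranking[0]
```

Here verb tense would be the natural focus of feedback, while the remaining codes could be left for a later round.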
Practical Design Considerations (Grade 7 Examples)
- Closed / Semi-open / Open balance affects validity & learner expression.
- Compute overall grade transparently: weight content, language, form.
- Provide rubric-based comments rather than cryptic marks.
- Consider alternative grading schemes (e.g., analytic rubric: Content 40% + Language 40% + Layout 20%) for fairness.
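A transparent overall grade can be computed directly from the 40/40/20 analytic rubric mentioned above. This is a minimal sketch under stated assumptions: each criterion is scored as a percentage from 0 to 100, the names `WEIGHTS` and `weighted_score` are hypothetical, and the sample scores are invented. Converting the result to a school grade is left out because no grading scale is given in the notes.

```python
WEIGHTS = {"content": 0.40, "language": 0.40, "layout": 0.20}

def weighted_score(points, weights=WEIGHTS):
    """Combine per-criterion percentages (0-100) into one overall percentage."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    if set(points) != set(weights):
        raise ValueError("points and weights must cover the same criteria")
    return sum(points[criterion] * w for criterion, w in weights.items())

# Invented scores for one piece of written work.
overall = weighted_score({"content": 80, "language": 65, "layout": 90})
# 80 * 0.4 + 65 * 0.4 + 90 * 0.2 gives 76, up to floating-point rounding
```

Showing learners the weights together with the per-criterion scores makes it visible why, for example, weak language accuracy lowers the result more than weak layout.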
Key Terms to Revisit
- Assessment, Evaluation, Testing
- Validity, Reliability, Objectivity
- Summative vs. Formative
- Norm-, Criterion-, Self-Referencing
- Task Typology (closed → open)
- Feedback (Feed-Up/Back/Forward; levels)
- Error vs. Mistake vs. Slip
Ethical / Philosophical Implications
- Balancing meritocratic sorting with pedagogical duty of care.
- Assessment literacy as teacher responsibility → equitable classrooms.
- Feedback not merely corrective but transformative ➜ empowers learner autonomy.
Real-World Connections
- CEFR widely used for workplace language certification.
- Hattie’s evidence base shapes educational policies worldwide (Visible Learning).
- Performance principle echoes debates on standardized testing & social mobility.
Study Checklist
- Be able to define & differentiate assessment vs. evaluation.
- Illustrate functions of tests with personal examples.
- Apply validity / reliability / objectivity to critique a test.
- Design a balanced test blueprint including open & closed tasks.
- Draft feedback at task, process, and self-regulation levels.
- Identify error categories & choose appropriate correction strategies.
Suggested Further Reading
- Grimm, Meyer & Volkmann (2015), Teaching English – task design cases.
- Hattie & Timperley (2007) – seminal feedback framework.
- Peñaflorida (2002) – learner autonomy via nontraditional assessment.