SiR - Lecture 7

34 Terms

1

Why do we evaluate HRI systems?

To ensure robots work, are effective, socially appropriate, and produce trustworthy scientific results.

2

What is explorative research? (definition)

Early-stage research to understand the problem space by “just putting a robot there” and observing.

3

What is formative evaluation?

Testing prototypes to shape and improve the design; done in the middle of development.

4

What is summative evaluation?

Rigorous evaluation at the end of a project to measure long-term or final effects.

5

What is system testing?

In-lab testing of system performance (latency, robustness, crashes).

6

What is a pilot study? (definition)

A small test run with real users to check feasibility, methodology, and fix issues.

7

What is feature testing?

Testing the contribution of one feature (e.g., with vs without emotional feedback).

8

What is implementation testing?

Testing long-term in real-world settings with high ecological validity.

9

Give examples of quantitative data.

Task time, accuracy, Likert scales, physiological signals.

10

What are descriptive statistics?

Means, medians, standard deviations.
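As a minimal sketch of these three statistics, the standard-library `statistics` module is enough. The task times below are made-up illustrative values, not data from the lecture.

```python
import statistics

# Hypothetical task completion times in seconds (illustrative values only).
task_times = [42.0, 38.5, 51.0, 45.5, 40.0]

mean_time = statistics.mean(task_times)
median_time = statistics.median(task_times)
sd_time = statistics.stdev(task_times)  # sample standard deviation (divides by n - 1)

print(f"mean={mean_time:.2f} median={median_time:.2f} sd={sd_time:.2f}")
```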

11

What are inferential statistical methods?

t-tests, ANOVA, and chi-square tests, plus power analysis to plan the sample size those tests need.
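To make one of these tests concrete, here is a rough sketch of the statistic behind an independent-samples t-test in its Welch form, which does not assume equal variances. The function name `welch_t` and the ratings are assumptions for illustration. Turning the statistic into a p-value additionally needs the t distribution, which a statistics library would normally supply.

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Return Welch's t statistic for two independent samples."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    # Standard error of the difference between the two sample means.
    standard_error = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean_a - mean_b) / standard_error

# Hypothetical 7-point trust ratings from two separate groups of participants.
with_feedback = [5, 6, 6, 7, 5]
without_feedback = [4, 5, 4, 5, 3]

t_stat = welch_t(with_feedback, without_feedback)
print(f"t = {t_stat:.3f}")
```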

12

What data is analyzed qualitatively?

Interviews, open-ended responses, video notes.

13

What is thematic coding? (definition)

Identifying recurring ideas and patterns in qualitative data.

14

What is narrative analysis?

Analyzing participants’ stories to extract insights.

15

What is rigor in qualitative research?

Ensuring transcription accuracy and inter-rater reliability.
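Inter-rater reliability is often reported as Cohen's kappa, which corrects raw agreement for agreement expected by chance. The card does not name a specific statistic, so kappa is an assumption here, and the codes below are invented for illustration.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    n = len(rater_a)
    # Proportion of items on which both raters chose the same code.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Agreement expected if both raters coded at random with their own label rates.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    # Note: undefined (division by zero) when expected agreement is exactly 1.
    return (observed - expected) / (1 - expected)

# Hypothetical codes assigned by two raters to six interview excerpts.
rater_1 = ["trust", "trust", "fear", "fear", "trust", "fear"]
rater_2 = ["trust", "trust", "fear", "trust", "trust", "fear"]

kappa = cohens_kappa(rater_1, rater_2)
print(f"kappa = {kappa:.3f}")
```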

16

What are mixed methods?

Combining quantitative and qualitative data for a more complete picture.

17

What is an observational study? (definition)

Watching natural interactions without manipulating anything.

18

What is the Hawthorne effect?

People change behavior when they know they are being observed.

19

What is a between-group design?

Different participants in each condition.
Pros: no order effects
Cons: needs larger sample

20

What is a within-group design?

Same participants in all conditions.
Pros: controls individual differences
Cons: carryover and fatigue effects
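One common way to limit carryover in a within-group design is to counterbalance the order of conditions across participants. The card does not mention this technique, so treat the sketch below as an illustrative assumption. The condition labels and participant IDs are hypothetical.

```python
from itertools import permutations

# Hypothetical condition labels and participant IDs.
conditions = ["A", "B", "C"]
participants = [f"P{i}" for i in range(1, 7)]

# Full counterbalancing: every possible order of the conditions (3! = 6 orders).
orders = list(permutations(conditions))

# Cycle through the orders so each one is used equally often.
assignment = {
    participant: orders[index % len(orders)]
    for index, participant in enumerate(participants)
}

for participant, order in assignment.items():
    print(participant, " -> ".join(order))
```

With many conditions the number of orders grows factorially, so a Latin square is usually preferred over full counterbalancing.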

21

What is an RCT? (definition)

A randomized controlled trial (RCT) randomly assigns participants to an intervention or a control condition and compares the outcomes; it is the gold standard for establishing causality.
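The randomization step can be sketched in a few lines. The function name `randomize`, the fixed seed, and the participant IDs are assumptions for illustration; a real trial would follow a pre-registered allocation procedure.

```python
import random

def randomize(participants, seed=0):
    """Randomly split participants into intervention and control groups."""
    rng = random.Random(seed)  # fixed seed so the allocation can be reproduced
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"intervention": shuffled[:half], "control": shuffled[half:]}

# Hypothetical participant IDs.
groups = randomize([f"P{i}" for i in range(1, 9)])
print(groups)
```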

22

What is a longitudinal study?

Study conducted for weeks, months, or years.
Measures adaptation, long-term acceptance, novelty fade.

23

What are self-assessments?

Surveys measuring user feelings (trust, valence, usability).

24

What are behavioral observations?

Video/live coding of gaze, gesture, engagement.

25

What are psychophysiological measures?

Heart rate variability, skin conductance, respiration.
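Heart rate variability is commonly summarized with RMSSD, the root mean square of successive differences between heartbeat (RR) intervals. The card does not specify a metric, so RMSSD is an assumption, and the intervals below are invented.

```python
import math

def rmssd(rr_intervals_ms):
    """Root mean square of successive differences between RR intervals (ms)."""
    diffs = [later - earlier for earlier, later in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

# Hypothetical RR intervals in milliseconds.
rr_intervals = [800, 810, 790, 805]
hrv = rmssd(rr_intervals)
print(f"RMSSD = {hrv:.2f} ms")
```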

26

What are task performance metrics?

Completion time, errors, success rate.
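These three metrics can be computed directly from a trial log. The record layout and values below are hypothetical.

```python
# Hypothetical trial log: one record per attempt at the task.
trials = [
    {"seconds": 30.0, "errors": 0, "success": True},
    {"seconds": 45.0, "errors": 2, "success": True},
    {"seconds": 60.0, "errors": 3, "success": False},
    {"seconds": 25.0, "errors": 1, "success": True},
]

mean_completion_time = sum(t["seconds"] for t in trials) / len(trials)
errors_per_trial = sum(t["errors"] for t in trials) / len(trials)
success_rate = sum(t["success"] for t in trials) / len(trials)

print(mean_completion_time, errors_per_trial, success_rate)
```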

27

What is pipeline performance?

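Measuring the speech pipeline, ASR (automatic speech recognition) → LLM (large language model) → DM (dialogue manager) → TTS (text-to-speech), for latency, stability, and cost.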
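Per-stage latency can be measured by wrapping each stage in a timer. The four stage functions below are placeholder stubs, not real ASR, LLM, or TTS calls, and DM is read here as a dialogue manager, which is an assumption.

```python
import time

def timed(stage_fn, payload):
    """Run one stage and return its output plus elapsed wall-clock seconds."""
    start = time.perf_counter()
    result = stage_fn(payload)
    return result, time.perf_counter() - start

# Placeholder stubs standing in for the real components.
def asr(audio):
    return "hello robot"

def llm(text):
    return f"reply to: {text}"

def dm(text):
    return text

def tts(text):
    return f"<audio for '{text}'>"

pipeline = [("ASR", asr), ("LLM", llm), ("DM", dm), ("TTS", tts)]
latencies = {}
payload = "raw-audio"
for name, stage in pipeline:
    payload, latencies[name] = timed(stage, payload)

total_latency = sum(latencies.values())
print(latencies, total_latency)
```

Logging the latency per stage, rather than only the total, shows which component dominates the response time.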

28

What is perplexity?

Metric showing how well a model predicts next words; lower = better.
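Perplexity is the exponential of the average negative log-probability the model assigned to each actual next token. A minimal sketch, using made-up token probabilities:

```python
import math

def perplexity(token_probs):
    """Perplexity from the probabilities a model gave to the observed tokens."""
    avg_neg_log_prob = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log_prob)

# Hypothetical per-token probabilities from two models on the same text.
confident_model = perplexity([0.5, 0.5, 0.5, 0.5])
uncertain_model = perplexity([0.1, 0.1, 0.1, 0.1])

print(confident_model, uncertain_model)
```

A model that assigns probability 0.5 to every token behaves as if it were choosing among 2 equally likely words, which is why the lower value indicates the better predictor.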

29

What is BLEU score?

Measures n-gram overlap between output and reference text.
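The core of BLEU is clipped n-gram precision: the share of the output's n-grams that also occur in the reference, with repeats capped at the reference count. The sketch below computes only that component. Full BLEU also combines several n-gram orders and applies a brevity penalty, both omitted here. The function names and sentences are illustrative.

```python
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def clipped_precision(candidate, reference, n):
    """Share of candidate n-grams found in the reference (counts clipped)."""
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    total = sum(cand.values())
    return overlap / total if total else 0.0

output = "the robot waves at the user"
reference = "the robot greets the user"

unigram_precision = clipped_precision(output, reference, 1)
bigram_precision = clipped_precision(output, reference, 2)
print(unigram_precision, bigram_precision)
```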

30

What is ROUGE?

Recall-focused metric measuring how much key content is matched in summaries.
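The recall focus can be illustrated with ROUGE-1 recall: the share of the reference's words that the output recovers. This is a simplified sketch with whitespace tokenization and invented sentences; published ROUGE implementations add stemming and further variants.

```python
from collections import Counter

def rouge1_recall(candidate, reference):
    """Share of reference unigrams that also appear in the candidate."""
    cand = Counter(candidate.split())
    ref = Counter(reference.split())
    overlap = sum(min(count, cand[token]) for token, count in ref.items())
    total = sum(ref.values())
    return overlap / total if total else 0.0

summary = "the robot waves at the user"
reference_summary = "the robot greets the user"

recall = rouge1_recall(summary, reference_summary)
print(recall)
```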

31

What is simulated user testing?

Testing robustness using persona scripts, happy path, and edge cases.
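A rough sketch of the idea: scripted utterances for a happy path and for edge cases are replayed against the system, and every reply is checked. `toy_bot` is a stand-in stub, not a real dialogue system, and the scripts are invented.

```python
def toy_bot(utterance):
    """Stand-in for the system under test (a stub, not a real dialogue system)."""
    text = utterance.strip().lower()
    if not text:
        return "Sorry, I did not catch that."
    if "hello" in text:
        return "Hello! How can I help?"
    return "Could you rephrase that?"

# Hypothetical scripts: one normal flow and several inputs that often break systems.
scripts = {
    "happy_path": ["Hello", "hello robot"],
    "edge_cases": ["", "   ", "x" * 10_000, "HELLO!!!"],
}

results = {
    name: [toy_bot(utterance) for utterance in utterances]
    for name, utterances in scripts.items()
}

# Robustness check: every input, however odd, must yield a non-empty reply.
all_answered = all(reply for replies in results.values() for reply in replies)
print(all_answered)
```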

32

What do humans judge LLM responses on?

Usability, relevance, satisfaction.

33

Why are human-centric metrics important?

They capture insights automated metrics cannot.

34

What are the steps in an evaluation plan?

  1. Select capability
  2. Define outcome
  3. Choose metrics
  4. Choose instruments
  5. Define protocol
  6. Execute
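The steps above can be captured as a small checklist structure that flags what is still undefined before execution. The class name, field names, and example values are hypothetical, chosen only to mirror the first five steps.

```python
from dataclasses import dataclass, field

@dataclass
class EvaluationPlan:
    """One field per planning step; execution starts once none are empty."""
    capability: str
    outcome: str
    metrics: list[str] = field(default_factory=list)
    instruments: list[str] = field(default_factory=list)
    protocol: str = ""

    def missing_steps(self):
        """Return the names of planning steps that are still empty."""
        return [name for name, value in vars(self).items() if not value]

# Hypothetical plan that is only partly filled in.
plan = EvaluationPlan(
    capability="greeting visitors",
    outcome="visitors feel welcomed",
    metrics=["Likert trust score"],
)
print(plan.missing_steps())
```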