SiR - Lecture 7

Socially Intelligent Robotics

34 Terms

Why do we evaluate HRI systems?

To ensure that robots work, are effective and socially appropriate, and that studies of them produce trustworthy scientific results.

What is explorative research? (definition)

Early-stage research to understand the problem space by “just putting a robot there” and observing.

What is formative evaluation?

Testing prototypes to shape and improve the design; done in the middle of development.

What is summative evaluation?

Rigorous evaluation at the end of a project to measure long-term or final effects.

What is system testing?

In-lab testing of system performance (latency, robustness, crashes).

What is a pilot study? (definition)

A small test run with real users to check feasibility, methodology, and fix issues.

What is feature testing?

Testing the contribution of one feature (e.g., with vs without emotional feedback).

What is implementation testing?

Testing long-term in real-world settings with high ecological validity.

Give examples of quantitative data.

Task time, accuracy, Likert scales, physiological signals.

What are descriptive statistics?

Means, medians, standard deviations.
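As a minimal sketch of these three statistics, the snippet below uses Python's standard `statistics` module. The `task_times` values are made up for illustration and do not come from any study.

```python
import statistics

# Hypothetical task completion times in seconds (illustrative only).
task_times = [42.0, 38.5, 51.0, 47.5, 40.0]

mean_time = statistics.mean(task_times)
median_time = statistics.median(task_times)
sd_time = statistics.stdev(task_times)  # sample standard deviation (divides by n - 1)

print(f"mean={mean_time:.2f} median={median_time:.2f} sd={sd_time:.2f}")
```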

What are inferential statistical methods?

t-tests, ANOVA, chi-square tests, power analysis.
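To make the t-test concrete, here is a rough sketch that computes Welch's t statistic (the unequal-variance variant, chosen here as an assumption) by hand. The function name `welch_t` and the two rating lists are hypothetical. Turning the statistic into a p-value needs the t-distribution, which a statistics library would normally provide.

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # Squared standard error of each sample mean (sample variance / n).
    se2_a = statistics.variance(sample_a) / len(sample_a)
    se2_b = statistics.variance(sample_b) / len(sample_b)
    t = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(sample_a) - 1) + se2_b ** 2 / (len(sample_b) - 1)
    )
    return t, df

# Hypothetical Likert ratings from two independent groups (illustrative only).
with_feedback = [5, 6, 7, 6, 6]
without_feedback = [4, 5, 4, 5, 4]

t_stat, dof = welch_t(with_feedback, without_feedback)
print(f"t={t_stat:.2f}, df={dof:.2f}")
```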

What data is analyzed qualitatively?

Interviews, open-ended responses, video notes.

What is thematic coding? (definition)

Identifying recurring ideas and patterns in qualitative data.

What is narrative analysis?

Analyzing participants’ stories to extract insights.

What is rigor in qualitative research?

Ensuring transcription accuracy and inter-rater reliability.
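Inter-rater reliability can be quantified in several ways. The sketch below assumes Cohen's kappa for two coders, which the card itself does not name. The function `cohens_kappa` and the example codes are hypothetical.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for chance agreement."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the raters chose the same code.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Agreement expected if each rater assigned codes independently.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical code throughout
    return (observed - expected) / (1 - expected)

# Hypothetical codes assigned to four interview excerpts (illustrative only).
coder_1 = ["trust", "trust", "fear", "fear"]
coder_2 = ["trust", "trust", "fear", "trust"]

kappa = cohens_kappa(coder_1, coder_2)
print(f"kappa={kappa:.2f}")
```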

What are mixed-methods?

Combining quantitative and qualitative data for a more complete picture.

What is an observational study? (definition)

Watching natural interactions without manipulating anything.

What is the Hawthorne effect?

People change behavior when they know they are being observed.

What is a between-group design?

Different participants in each condition.
Pros: no order effects
Cons: needs larger sample

What is a within-group design?

Same participants in all conditions.
Pros: controls individual differences
Cons: carryover and fatigue effects

What is an RCT? (definition)

Randomized Controlled Trial: a study that randomly assigns participants to an intervention or a control condition and compares the two; considered the gold standard for establishing causality.
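The randomization step can be sketched as a simple shuffle-and-split. The function `randomize`, the participant IDs, and the equal two-way split are assumptions for illustration. Real trials often use more careful schemes.

```python
import random

def randomize(participants, seed=None):
    """Randomly split participants into intervention and control groups."""
    rng = random.Random(seed)  # a fixed seed makes the allocation reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"intervention": shuffled[:half], "control": shuffled[half:]}

# Hypothetical participant IDs (illustrative only).
participant_ids = [f"P{i:02d}" for i in range(1, 9)]
groups = randomize(participant_ids, seed=7)
print(groups)
```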

What is a longitudinal study?

Study conducted for weeks, months, or years.
Measures adaptation, long-term acceptance, novelty fade.

What are self-assessments?

Surveys measuring user feelings (trust, valence, usability).

What are behavioral observations?

Video/live coding of gaze, gesture, engagement.

What are psychophysiological measures?

Heart rate variability, skin conductance, respiration.

What are task performance metrics?

Completion time, errors, success rate.
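A minimal sketch of how these three metrics might be aggregated over logged trials. The `Trial` record, its field names, and the sample values are hypothetical.

```python
from dataclasses import dataclass
import statistics

@dataclass
class Trial:
    completion_time_s: float  # time the participant needed, in seconds
    errors: int               # number of errors made during the trial
    succeeded: bool           # whether the task was completed

def summarize(trials):
    """Aggregate completion time, error count, and success rate."""
    return {
        "mean_completion_time_s": statistics.mean([t.completion_time_s for t in trials]),
        "total_errors": sum(t.errors for t in trials),
        "success_rate": sum(t.succeeded for t in trials) / len(trials),
    }

# Hypothetical logged trials (illustrative only).
trials = [
    Trial(30.0, 0, True),
    Trial(45.0, 2, True),
    Trial(60.0, 3, False),
    Trial(25.0, 1, True),
]
summary = summarize(trials)
print(summary)
```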

What is pipeline performance?

Measuring the full ASR (speech recognition) → LLM (language model) → DM (dialogue manager) → TTS (text-to-speech) pipeline for latency, stability, and cost.
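Per-stage latency along the ASR → LLM → DM → TTS chain could be measured roughly as below. The four `fake_*` functions are placeholder stand-ins, not real components, and `run_pipeline` is a hypothetical helper.

```python
import time

def time_stage(stage_fn, payload):
    """Run one stage and return its output with the elapsed wall-clock seconds."""
    start = time.perf_counter()
    result = stage_fn(payload)
    return result, time.perf_counter() - start

# Placeholder stages (assumptions): each stands in for a real component.
def fake_asr(audio):
    return "hello robot"

def fake_llm(text):
    return f"reply to: {text}"

def fake_dm(text):
    return text.upper()

def fake_tts(text):
    return text.encode("utf-8")

def run_pipeline(audio):
    """Pass the input through every stage, recording latency per stage."""
    stages = [("ASR", fake_asr), ("LLM", fake_llm), ("DM", fake_dm), ("TTS", fake_tts)]
    latencies = {}
    payload = audio
    for name, stage_fn in stages:
        payload, latencies[name] = time_stage(stage_fn, payload)
    return payload, latencies

output, latencies = run_pipeline(b"raw-audio-bytes")
print(latencies, "total:", sum(latencies.values()))
```

Stability and cost would need separate logging, for example counting failed runs or billed tokens per turn.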

What is perplexity?

Metric showing how well a model predicts next words; lower = better.
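As a small illustration, perplexity is the exponential of the average negative log-probability the model assigned to each actual next token. The function name and the probability lists below are made up for the example.

```python
import math

def perplexity(token_probs):
    """Perplexity from the probabilities a model gave to each observed token."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_neg_log_prob = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log_prob)

# Hypothetical per-token probabilities from two models on the same text.
uncertain_model = [0.25, 0.25, 0.25, 0.25]
confident_model = [0.5, 0.5, 0.5, 0.5]

ppl_uncertain = perplexity(uncertain_model)
ppl_confident = perplexity(confident_model)
print(ppl_uncertain, ppl_confident)
```

The model that assigns higher probability to the observed tokens ends up with the lower perplexity.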

What is BLEU score?

Measures n-gram overlap between output and reference text.
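The core of the n-gram overlap idea can be sketched as clipped n-gram precision. This is a simplification: full BLEU also combines several n-gram orders with a geometric mean and applies a brevity penalty. The function names and sentences are hypothetical, and tokenization is naive whitespace splitting.

```python
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-token tuples in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def clipped_precision(candidate, reference, n):
    """Share of candidate n-grams that also occur in the reference."""
    cand = Counter(ngrams(candidate.split(), n))
    ref = Counter(ngrams(reference.split(), n))
    total = sum(cand.values())
    if total == 0:
        return 0.0
    # Clip each n-gram's count so repeats cannot be credited more than the reference allows.
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    return overlap / total

# Hypothetical system output and reference (illustrative only).
candidate = "the robot waved at me"
reference = "the robot waved at the child"

p1 = clipped_precision(candidate, reference, 1)
p2 = clipped_precision(candidate, reference, 2)
print(p1, p2)
```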

What is ROUGE?

Recall-focused metric measuring how much key content is matched in summaries.
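A rough sketch of the recall idea, assuming the unigram variant (ROUGE-1) and naive lowercase whitespace tokenization. The function name and sentences are hypothetical.

```python
from collections import Counter

def rouge1_recall(summary, reference):
    """Share of reference unigrams that also appear in the summary."""
    summ = Counter(summary.lower().split())
    ref = Counter(reference.lower().split())
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Each reference token is matched at most as often as it occurs in the summary.
    matched = sum(min(count, summ[token]) for token, count in ref.items())
    return matched / total

# Hypothetical reference and generated summary (illustrative only).
reference_summary = "the robot greeted the visitor"
generated_summary = "the robot greeted a visitor"

recall = rouge1_recall(generated_summary, reference_summary)
print(recall)
```

Dividing by the reference length is what makes this recall-oriented; dividing by the summary length instead would give precision.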

What is simulated user testing?

Testing robustness using persona scripts, happy path, and edge cases.

What do humans judge LLM responses on?

Usability, relevance, satisfaction.

Why are human-centric metrics important?

They capture insights automated metrics cannot.

What are the steps in an evaluation plan?

Select capability
Define outcome
Choose metrics
Choose instruments
Define protocol
Execute
