Comprehensive Study Notes – Virtual Reality Fundamentals

Historical Progression of Media

  • Human communication has evolved through successive media technologies.
    • Cave wall paintings (e.g. bull-fighting scene and horse scene at Lascaux, 15,000–13,000 B.C.)
    • Purpose: record hunting episodes, share communal history.
    • Medium: natural pigments on stone; enclosed physical space.
  • Each new medium extends the ways ideas can be conveyed and experienced, culminating (for now) in Virtual Reality (VR).

What Is Virtual Reality (VR)?

  • General Definition
    • An artificial environment delivered via computer-generated sensory stimuli (visual, auditory, etc.) where the user’s actions partially determine what happens (interactive). — Merriam-Webster 2015.
    • A means for humans to visualise, manipulate, and interact with complex data through multiple sensorial channels (multimodal interface).
  • Core Characteristics
    • Users sense and interact with three-dimensional content rather than static pictures or movies.
    • Strong sensation of spatial presence relative to the user.
    • Environment can replicate reality or be entirely imaginary.
  • Objective of VR
    • Create a believable environment that does not physically exist.
    • Hide traditional computer interfaces; users act on objects “inside” the world rather than controlling them from the outside.

VR vs. Conventional Multimedia (MM)

  • Interface Visibility
    • VR: sophisticated, hidden.
    • MM: basic, visible.
  • Sensory Channels
    • VR: multimodal (visual, audio, haptics, etc.).
    • MM: mainly audio & video.
  • User Perception
    • VR: believable, immersive.
    • MM: often unconvincing, shallow.

Multimodal Interfaces

  • VR effectiveness rises with the number & quality of sensory modalities stimulated.
  • Four principal modalities and associated technologies/challenges:
    • Visual: stereoscopy, wide FOV HMDs, real-time lighting & shading for realism.
    • Auditory: spatialised audio, sound design cues for material, distance, and ambience.
    • Tactile: haptic gloves, vests, exoskeletons; convey force, texture, temperature.
    • Olfactory: scent emitters; challenges include latency, lingering odours, personal sensitivity.

Classic VR System Architecture

  • Five foundational components
    1. VR Engine – real-time simulation & rendering core.
    2. Software & Databases – 3-D object modelling, physics, AI, scripting languages.
    3. I/O Devices – displays, trackers, haptics, audio, smell generators.
    4. User – physiological & psychological factors influencing comfort & safety.
    5. Tasks/Applications – medical, education, arts & entertainment, military, etc.
  • Architecture must satisfy high I/O bandwidth and low-latency computation demands.
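
A minimal sketch of how the components above might meet in a per-frame loop. The functions `read_tracker`, `simulate`, and `render` are hypothetical stand-ins rather than any real engine's API, and the 90 Hz refresh rate is assumed purely for illustration. The loop samples input, advances the simulation, renders, and counts frames that overrun their time budget.

```python
import time

TARGET_HZ = 90                      # assumed display refresh rate (illustrative)
FRAME_BUDGET_S = 1.0 / TARGET_HZ    # time available per frame


def read_tracker(t):
    """Hypothetical I/O device: returns a head pose (here only a yaw angle)."""
    return {"yaw_deg": (t * 30.0) % 360.0}


def simulate(world, pose, dt):
    """Hypothetical engine step: advance world state using the latest pose."""
    world["time"] += dt
    world["head_yaw_deg"] = pose["yaw_deg"]
    return world


def render(world):
    """Hypothetical renderer: returns a text description of the frame."""
    return f"frame at t={world['time']:.3f}s, yaw={world['head_yaw_deg']:.1f}"


def run_frames(n_frames):
    """Run a fixed number of frames; return final state and budget overruns."""
    world = {"time": 0.0, "head_yaw_deg": 0.0}
    overruns = 0
    for _ in range(n_frames):
        start = time.perf_counter()
        pose = read_tracker(world["time"])              # I/O devices
        world = simulate(world, pose, FRAME_BUDGET_S)   # engine, fixed timestep
        render(world)                                   # output stage
        if time.perf_counter() - start > FRAME_BUDGET_S:
            overruns += 1                               # frame missed its budget
    return world, overruns
```

A real system would replace the stand-ins with device drivers and a renderer; the point of the sketch is only that every stage has to fit inside one frame period.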

Taxonomy of VR Systems

1. Augmented Reality (AR)
  • Combines live views of the real world with computer-generated overlays.
  • Head-tracked graphics maintain correct perspective; GPS/IMUs locate user.
  • Applications: gaming, education, maintenance, medicine, defence.
2. Telepresence
  • Technologies enabling users to feel or act at a remote physical site via telerobotics.
  • Uses high-quality video, haptics, & control loops; fields: tele-surgery, remote security, conferencing.
3. Fish-Tank VR (FTVR)
  • Small-scale immersive display where head position relative to a monitor defines the rendered perspective.
  • Can be monocular or stereoscopic.
  • Variants: Workbench, CAVE.
    • CAVE (Cave Automatic Virtual Environment): room-sized, stereo projection on walls/floor; full-body tracking, collaborative use.
4. Desktop VR (Window-on-World)
  • Simplest form: VR content shown on a standard monitor with mouse/keyboard, or on an HMD tethered to a PC.
  • Lower cost, limited immersion.
Comparative Highlights
  • AR vs. Telepresence: AR augments surroundings; telepresence transports you to another real site.
  • Fish-Tank vs. Desktop: Fish-Tank adds head tracking so the rendered perspective follows the viewer (more convincing depth, extra tracking hardware); Desktop VR runs on standard PC hardware (lower cost, limited immersion).
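
The head-coupled perspective described under Fish-Tank VR can be sketched as an off-axis (asymmetric) view frustum. The sketch assumes a coordinate frame not given in these notes: the screen lies in the plane z = 0, centred on the origin, and the tracked eye sits at z > 0, with all lengths in metres. The function name `off_axis_frustum` is hypothetical.

```python
def off_axis_frustum(eye, screen_w, screen_h, near):
    """Return (left, right, bottom, top) frustum extents at the near plane.

    eye      -- (x, y, z) eye position relative to the screen centre, z > 0
    screen_w -- physical screen width
    screen_h -- physical screen height
    near     -- distance from the eye to the near clipping plane
    """
    ex, ey, ez = eye
    if ez <= 0:
        raise ValueError("eye must be in front of the screen (z > 0)")
    scale = near / ez  # similar triangles: screen edges projected onto near plane
    left = (-screen_w / 2 - ex) * scale
    right = (screen_w / 2 - ex) * scale
    bottom = (-screen_h / 2 - ey) * scale
    top = (screen_h / 2 - ey) * scale
    return left, right, bottom, top


# Centred eye gives a symmetric frustum; moving the head right skews it left.
centred = off_axis_frustum((0.0, 0.0, 0.6), 0.5, 0.3, 0.1)
shifted = off_axis_frustum((0.1, 0.0, 0.6), 0.5, 0.3, 0.1)
```

The four values have the same meaning as the left/right/bottom/top arguments of an OpenGL-style frustum call; the scene would also need translating by the negated eye position so that the screen stays fixed in the virtual world.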

Key Elements in the VR Experience

Virtual Reality Triangle – “I³”
  • Interaction, Immersion, Imagination are mutually reinforcing foundations.
1. Interaction
  • Goal: natural, intuitive Human-Computer Interaction (HCI) that feels like human-to-human conversation.
  • Requires new interface paradigms: gesture, voice, eye-tracking, brain-computer interfaces.
  • Feedback (visual/haptic/audio) must mirror real-world causality.
2. Imagination
  • Application’s ability to solve real problems via creative world-building and novel affordances.
  • Enables sensations unattainable in physical reality (e.g. flying, time dilation).
3. Immersion (Objective Property of Technology)
  • Degree to which system projects stimuli onto user’s receptors.
  • Six measurable dimensions (Slater’s model)
    1. Extensive – number of modalities engaged.
    2. Matching – sensorimotor congruence (visuals match head motion, body replication).
    3. Surrounding – panoramic coverage; wide FOV, 360° audio.
    4. Vividness – resolution, frame rate, lighting, audio bitrate.
    5. Plot-Performing – coherent unfolding narrative, consistent world behaviour.
    6. Interactability – ability to effect change & receive contingent responses.
  • Tech can lead the mind but cannot guarantee user acceptance.
4. Presence (Subjective User State)
  • “Sense of being there” inside the mediated space, with temporary amnesia of the real world.
  • Function of both immersion (tech) and individual user traits.
  • Break-in-Presence (BIP): moment when illusion collapses (e.g. tracking glitch, discomfort).
Illusions Underpinning Presence (4 Components)
  1. Stable Spatial Place – congruent depth cues & low latency sustain belief in a fixed location.
  2. Self-Embodiment – virtual body that reliably mirrors the user’s movements; mismatch causes BIP.
  3. Physical Interaction – tactile/force feedback or convincing substitutes; lack of response ruins illusion.
  4. Social Communication – believable verbal & non-verbal interaction with human or AI agents; efficacy rises with behavioural realism, not necessarily visual fidelity.
5. Reality Trade-Off
  • Balancing realism/immersion against cost, performance, accessibility, dev-time, interaction & artistic style.
  • Key pairwise considerations:
    • Realism vs. Performance – high poly counts ↔ frame-rate.
    • Realism vs. Accessibility – high-end GPU ↔ wider audience.
    • Realism vs. Dev Time – photoreal assets take longer.
    • Realism vs. Interaction Freedom – scripted cinematics vs. emergent gameplay.
    • Realism vs. Artistic Style – stylised art can circumvent uncanny valley.

Uncanny Valley & VR

  • Masahiro Mori (1970): human emotional response rises with human-likeness until a point where small imperfections cause revulsion.
  • In VR:
    • Nearly-human avatars with stiff facial animation, odd eye gaze or mismatched voice disrupt presence.
    • Cartoon or deliberately stylised characters often elicit higher comfort and acceptance.
  • Mitigations: exaggeration, stylisation, quality motion capture, and focusing on behavioural realism.

Fidelity Continua

  • VR presence does not require photorealism; different experiences sit at different points on several continua.
  • Five low-to-high scales often assessed:
    1. Visual Fidelity – abstraction → photoreal.
    2. Audio Fidelity – simple stereo → spatialised, high-bit-rate ambisonics.
    3. Interaction Fidelity – button press → one-to-one physical action.
    4. Behavioral Fidelity – scripted → physics/AI-driven realism.
    5. Temporal Fidelity – high latency → sub-20 ms motion-to-photon.
  • Three compound fidelity axes for designers (Sherman & Craig):
    • Representation Fidelity – realism of place/objects (earth-like ↔ abstract).
    • Interaction Fidelity – similarity between physical & virtual action mapping.
    • Experiential Fidelity – alignment between creator’s intended and user-perceived experience.
  • Choosing positions along continua depends on project goals; extremes are not inherently “better”.

Practical & Ethical Implications

  • Medical therapy (pain diversion, phobia treatment) leverages presence; efficacy declines with BIP.
  • Training simulators (aircraft, dental, military) require high interaction fidelity to avoid negative transfer.
  • Privacy & psychological risks: deeper presence can intensify emotional impact, anxiety, or trauma.
  • Design ethics: avoid deceptive manipulation; ensure accessibility, comfort, and inclusivity (e.g. body diversity, motion-sickness mitigation).

Numerical & Statistical Highlights

  • Lascaux cave paintings: ≈ 15,000–13,000 B.C.
  • Acceptable VR motion-to-photon latency threshold: < 20 ms to minimise nausea.
  • Common CAVE tracking volume: 3 m × 3 m × 3 m (varies by installation).
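
As a worked example of the latency threshold, the sketch below adds up per-stage delays and compares the total with the 20 ms figure. The stage names and their millisecond values are assumptions chosen for illustration, not measurements; only the 20 ms limit comes from these notes.

```python
MOTION_TO_PHOTON_LIMIT_MS = 20.0  # threshold quoted in these notes


def frame_time_ms(refresh_hz):
    """Duration of one display frame in milliseconds."""
    return 1000.0 / refresh_hz


def motion_to_photon_ms(stages_ms):
    """Total latency, modelled as a simple sum of pipeline stage delays."""
    return sum(stages_ms.values())


def within_budget(stages_ms, limit_ms=MOTION_TO_PHOTON_LIMIT_MS):
    """True if the summed latency stays under the limit."""
    return motion_to_photon_ms(stages_ms) < limit_ms


def example_pipeline(refresh_hz):
    """Hypothetical pipeline: fixed overheads plus one frame of rendering."""
    return {
        "tracking": 2.0,                      # assumed sensor + fusion delay
        "simulation": 3.0,                    # assumed update cost
        "render": frame_time_ms(refresh_hz),  # one full frame period
        "scanout": 3.0,                       # assumed display delay
    }


print(within_budget(example_pipeline(90)))  # about 19.1 ms total
print(within_budget(example_pipeline(60)))  # about 24.7 ms total
```

Under these assumed numbers the 90 Hz pipeline fits the budget and the 60 Hz one does not, which illustrates why higher refresh rates leave more room for the other stages.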

Key Take-Home Equations & Heuristics

  • Presence Function (conceptual):
    $\text{Presence} = f(\text{Immersion},\ \text{User Traits})$
  • Greater immersion usually raises potential presence, but is subject to diminishing returns and the Uncanny Valley threshold.
  • Reality Trade-Off curve: as realism $R$ increases, cost $C$ rises non-linearly; a rough rule of thumb is $C \propto R^{2}$.
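
To make the quadratic rule of thumb concrete, the sketch below computes the cost multiplier for a given increase in realism. The function `relative_cost` is a hypothetical helper, and the exponent of 2 is the heuristic from these notes rather than an empirical law.

```python
def relative_cost(realism_ratio, exponent=2.0):
    """Cost multiplier when realism scales by realism_ratio, assuming C ~ R**exponent."""
    if realism_ratio <= 0:
        raise ValueError("realism_ratio must be positive")
    return realism_ratio ** exponent


for ratio in (1.0, 1.5, 2.0, 3.0):
    print(f"realism x{ratio:.1f} -> cost x{relative_cost(ratio):.2f}")
```

Under this assumption, doubling realism quadruples cost, which is one reason stylised art can be the more economical choice.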