Comprehensive Study Notes – Virtual Reality Fundamentals
- Human communication has evolved through successive media technologies.
- Cave wall paintings (e.g. bull-fighting scene and horse scene at Lascaux, 15,000–13,000 B.C.)
- Purpose: record hunting episodes, share communal history.
- Medium: natural pigments on stone; enclosed physical space.
- Each new medium extends the ways ideas can be conveyed and experienced, culminating (for now) in Virtual Reality (VR).
What Is Virtual Reality (VR)?
- General Definition
- An artificial environment delivered via computer-generated sensory stimuli (visual, auditory, etc.) in which the user’s actions partially determine what happens (interactive) (Merriam-Webster, 2015).
- A means for humans to visualise, manipulate, and interact with complex data through multiple sensorial channels (multimodal interface).
- Core Characteristics
- Users sense and interact with three-dimensional content rather than static pictures or movies.
- Strong sensation of spatial presence relative to the user.
- Environment can replicate reality or be entirely imaginary.
- Objective of VR
- Create a believable environment that does not physically exist.
- Hide traditional computer interfaces; users act on objects “inside” the world rather than controlling them from the outside.
- Interface Visibility
- VR: sophisticated, hidden.
- MM (conventional multimedia): basic, visible.
- Sensory Channels
- VR: multimodal (visual, audio, haptics, etc.).
- MM: mainly audio & video.
- User Perception
- VR: believable, immersive.
- MM: often unconvincing, shallow.
Multimodal Interfaces
- VR effectiveness rises with the number & quality of sensory modalities stimulated.
- Four principal modalities and associated technologies/challenges:
- Visual: stereoscopy, wide FOV HMDs, real-time lighting & shading for realism.
- Auditory: spatialised audio, sound design cues for material, distance, and ambience.
- Tactile: haptic gloves, vests, exoskeletons; convey force, texture, temperature.
- Olfactory: scent emitters; challenges include latency, lingering odours, personal sensitivity.
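As one illustration of the visual modality, stereoscopy renders the scene from two slightly offset viewpoints, one per eye. The Python sketch below derives both eye positions from a head position and a unit vector pointing to the user's right. The function name `eye_positions`, the default interpupillary distance of 0.063 m, and the coordinate convention are assumptions made for illustration; they do not come from these notes.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def eye_positions(head: Vec3, right: Vec3, ipd_m: float = 0.063) -> Tuple[Vec3, Vec3]:
    """Return (left_eye, right_eye) positions for stereoscopic rendering.

    head  : midpoint between the eyes, in metres
    right : unit vector pointing to the user's right
    ipd_m : interpupillary distance in metres (assumed default, varies by user)
    """
    half = ipd_m / 2.0
    # Shift the head midpoint half an IPD to each side along the right vector.
    left_eye = tuple(h - half * r for h, r in zip(head, right))
    right_eye = tuple(h + half * r for h, r in zip(head, right))
    return left_eye, right_eye


# Hypothetical standing user, 1.7 m eye height, facing down the -z axis.
left, right_eye = eye_positions((0.0, 1.7, 0.0), (1.0, 0.0, 0.0))
print(left, right_eye)
```

Each eye position would then drive its own camera, so the two rendered images differ by the horizontal disparity that the visual system reads as depth.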
Classic VR System Architecture
- Five foundational components
- VR Engine – real-time simulation & rendering core.
- Software & Databases – 3-D object modelling, physics, AI, scripting languages.
- I/O Devices – displays, trackers, haptics, audio, smell generators.
- User – physiological & psychological factors influencing comfort & safety.
- Tasks/Applications – medical, education, arts & entertainment, military, etc.
- Architecture must satisfy high I/O bandwidth and low-latency computation demands.
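To make the bandwidth and latency demands concrete, the Python sketch below estimates the per-frame time budget and the uncompressed pixel data rate for a display. The headset figures (2000 x 2000 pixels per eye, 24-bit colour, 90 Hz) and both function names are hypothetical values chosen for illustration, not specifications taken from these notes.

```python
def frame_budget_ms(refresh_hz: float) -> float:
    """Time available to simulate and render one frame, in milliseconds."""
    return 1000.0 / refresh_hz


def display_bandwidth_gbps(width: int, height: int, eyes: int,
                           bits_per_pixel: int, refresh_hz: float) -> float:
    """Uncompressed pixel data rate sent to the display, in gigabits per second."""
    bits_per_second = width * height * eyes * bits_per_pixel * refresh_hz
    return bits_per_second / 1e9


# Hypothetical headset: 2000 x 2000 pixels per eye, 24-bit colour, 90 Hz.
print(round(frame_budget_ms(90.0), 2))                             # 11.11
print(round(display_bandwidth_gbps(2000, 2000, 2, 24, 90.0), 2))   # 17.28
```

Under these assumed numbers, the engine has roughly 11 ms per frame for simulation and rendering while the display link carries about 17 Gbit/s of raw pixel data.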
Taxonomy of VR Systems
1. Augmented Reality (AR)
- Combines live views of the real world with computer-generated overlays.
- Head-tracked graphics maintain correct perspective; GPS/IMUs locate user.
- Applications: gaming, education, maintenance, medicine, defence.
2. Telepresence
- Technologies enabling users to feel or act at a remote physical site via telerobotics.
- Uses high-quality video, haptics, & control loops; fields: tele-surgery, remote security, conferencing.
3. Fish-Tank VR (FTVR)
- Small-scale immersive display where head position relative to a monitor defines the rendered perspective.
- Can be monocular or stereoscopic.
- Variants: Workbench, CAVE.
- CAVE (Cave Automatic Virtual Environment): room-sized, stereo projection on walls/floor; full-body tracking, collaborative use.
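The head-coupled perspective used in Fish-Tank VR can be sketched as an off-axis (asymmetric) view frustum computed from the tracked head position. The Python example below assumes a simplified setup that is not described in these notes: the screen is a flat rectangle centred at the origin in the z = 0 plane, the head is in front of it at positive z, and all lengths are in metres. The names `Frustum` and `off_axis_frustum` are hypothetical.

```python
from typing import NamedTuple


class Frustum(NamedTuple):
    """Frustum extents measured on the near clipping plane, in metres."""
    left: float
    right: float
    bottom: float
    top: float


def off_axis_frustum(head_x: float, head_y: float, head_z: float,
                     screen_w: float, screen_h: float, near: float) -> Frustum:
    """Project the screen edges, as seen from the head, onto the near plane."""
    if head_z <= 0:
        raise ValueError("head must be in front of the screen (head_z > 0)")
    scale = near / head_z  # similar triangles: near plane vs. screen plane
    return Frustum(
        left=(-screen_w / 2 - head_x) * scale,
        right=(screen_w / 2 - head_x) * scale,
        bottom=(-screen_h / 2 - head_y) * scale,
        top=(screen_h / 2 - head_y) * scale,
    )


# Hypothetical 0.60 m x 0.34 m monitor viewed from 0.60 m away.
centred = off_axis_frustum(0.0, 0.0, 0.6, 0.6, 0.34, near=0.1)
shifted = off_axis_frustum(0.1, 0.0, 0.6, 0.6, 0.34, near=0.1)
print(centred)
print(shifted)
```

With the head centred the frustum is symmetric; moving the head to the right skews it to the left, which is what makes the monitor behave like a window onto the scene.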
4. Desktop VR (Window-on-World)
- Simplest form: VR content shown on a standard monitor and controlled with mouse/keyboard, or viewed through an HMD tethered to a PC.
- Lower cost, limited immersion.
Comparative Highlights
- AR vs. Telepresence: AR augments surroundings; telepresence transports you to another real site.
- Fish-Tank vs. Desktop: Fish-Tank VR tracks the head so the rendered perspective follows the viewer (more immersion, extra tracking hardware); Desktop VR shows a window on the world on a standard monitor (lower cost, limited immersion).
Key Elements in the VR Experience
Virtual Reality Triangle – “I³”
- Interaction, Immersion, Imagination are mutually reinforcing foundations.
1. Interaction
- Goal: natural, intuitive Human-Computer Interaction (HCI) that feels like human-to-human conversation.
- Requires new interface paradigms: gesture, voice, eye-tracking, brain-computer interfaces.
- Feedback (visual/haptic/audio) must mirror real-world causality.
2. Imagination
- Application’s ability to solve real problems via creative world-building and novel affordances.
- Enables sensations unattainable in physical reality (e.g. flying, time dilation).
3. Immersion (Objective Property of Technology)
- Degree to which system projects stimuli onto user’s receptors.
- Six measurable dimensions (Slater’s model)
- Extensive – number of modalities engaged.
- Matching – sensorimotor congruence (visuals match head motion, body replication).
- Surrounding – panoramic coverage; wide FOV, 360° audio.
- Vividness – resolution, frame rate, lighting, audio bitrate.
- Plot-Performing – coherent unfolding narrative, consistent world behaviour.
- Interactability – ability to effect change & receive contingent responses.
- Tech can lead the mind but cannot guarantee user acceptance.
4. Presence (Subjective User State)
- “Sense of being there” inside the mediated space, with temporary amnesia of the real world.
- Function of both immersion (tech) and individual user traits.
- Break-in-Presence (BIP): moment when illusion collapses (e.g. tracking glitch, discomfort).
Illusions Underpinning Presence (4 Components)
- Stable Spatial Place – congruent depth cues & low latency sustain belief in a fixed location.
- Self-Embodiment – virtual body that reliably mirrors the user’s movements; mismatch causes BIP.
- Physical Interaction – tactile/force feedback or convincing substitutes; lack of response ruins illusion.
- Social Communication – believable verbal & non-verbal interaction with human or AI agents; efficacy rises with behavioural realism, not necessarily visual fidelity.
5. Reality Trade-Off
- Balancing realism/immersion against cost, performance, accessibility, dev-time, interaction & artistic style.
- Key pairwise considerations:
- Realism vs. Performance – high poly counts ↔ frame-rate.
- Realism vs. Accessibility – high-end GPU ↔ wider audience.
- Realism vs. Dev Time – photoreal assets take longer.
- Realism vs. Interaction Freedom – scripted cinematics vs. emergent gameplay.
- Realism vs. Artistic Style – stylised art can circumvent the uncanny valley.
Uncanny Valley & VR
- Masahiro Mori (1970): human emotional response rises with human-likeness until a point where small imperfections cause revulsion.
- In VR:
- Nearly-human avatars with stiff facial animation, odd eye gaze or mismatched voice disrupt presence.
- Cartoon or deliberately stylised characters often elicit higher comfort and acceptance.
- Mitigations: exaggeration, stylisation, quality motion capture, and focusing on behavioural realism.
Fidelity Continua
- VR presence does not require photorealism; different experiences sit at different points on several continua.
- Five low-to-high scales often assessed:
- Visual Fidelity – abstraction → photoreal.
- Audio Fidelity – simple stereo → spatialised, high-bit-rate ambisonics.
- Interaction Fidelity – button press → one-to-one physical action.
- Behavioural Fidelity – scripted → physics/AI-driven realism.
- Temporal Fidelity – high latency → sub-20 ms motion-to-photon.
- Three compound fidelity axes for designers (Sherman & Craig):
- Representation Fidelity – realism of place/objects (earth-like ↔ abstract).
- Interaction Fidelity – similarity between physical & virtual action mapping.
- Experiential Fidelity – alignment between creator’s intended and user-perceived experience.
- Choosing positions along continua depends on project goals; extremes are not inherently “better”.
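One way to reason about the five low-to-high scales is to record a project's target position on each of them. The Python sketch below does this with a small data structure; the class name `FidelityProfile`, the 0.0 to 1.0 scoring convention, and the example values are all hypothetical and serve only to illustrate that a project can sit low on one continuum and high on another.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class FidelityProfile:
    """Position of an experience on each continuum, 0.0 (low) to 1.0 (high)."""
    visual: float
    audio: float
    interaction: float
    behavioural: float
    temporal: float

    def __post_init__(self) -> None:
        # Reject scores outside the assumed 0.0 to 1.0 range.
        for f in fields(self):
            value = getattr(self, f.name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{f.name} must be in [0, 1], got {value}")


# Hypothetical stylised rhythm game: abstract visuals, tight timing, physical input.
stylised_game = FidelityProfile(visual=0.3, audio=0.7, interaction=0.9,
                                behavioural=0.5, temporal=0.95)
print(stylised_game)
```

Keeping the scores separate rather than averaging them reflects the point that the extremes are not inherently better: each value is a design decision tied to the project's goals.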
Practical & Ethical Implications
- Medical therapy (pain diversion, phobia treatment) leverages presence; efficacy declines with BIP.
- Training simulators (aircraft, dental, military) require high interaction fidelity to avoid negative transfer.
- Privacy & psychological risks: deeper presence can intensify emotional impact, anxiety, or trauma.
- Design ethics: avoid deceptive manipulation; ensure accessibility, comfort, and inclusivity (e.g. body diversity, motion-sickness mitigation).
Numerical & Statistical Highlights
- Lascaux cave age: ≈15,000–13,000 B.C.
- Acceptable VR motion-to-photon latency threshold: < 20 ms to minimise nausea.
- Common CAVE tracking volume: 3 m × 3 m × 3 m (varies by installation).
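The 20 ms motion-to-photon threshold listed above can be treated as a budget shared by every stage between a head movement and the updated image. The Python sketch below sums per-stage latencies and checks them against that threshold. The stage names and millisecond values are hypothetical, chosen only to show the arithmetic.

```python
from typing import Dict

MOTION_TO_PHOTON_LIMIT_MS = 20.0  # threshold quoted in these notes


def motion_to_photon_ms(stage_latencies_ms: Dict[str, float]) -> float:
    """Total delay from head movement to updated pixels, in milliseconds."""
    return sum(stage_latencies_ms.values())


# Hypothetical per-stage latencies for one frame.
pipeline = {
    "tracking": 2.0,
    "simulation": 3.0,
    "rendering": 8.0,
    "display_scanout": 5.0,
}

total = motion_to_photon_ms(pipeline)
print(total, total < MOTION_TO_PHOTON_LIMIT_MS)  # 18.0 True
```

In this assumed breakdown only 2 ms of headroom remains, so a small regression in any single stage would push the system over the threshold.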
Key Take-Home Equations & Heuristics
- Presence Function (conceptual):
Presence = f(Immersion, User Traits)
- Greater immersion usually raises potential presence, but is subject to diminishing returns and the Uncanny Valley threshold.
- Reality Trade-Off curve: as realism R increases, cost C rises non-linearly, often C ∝ R².
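As a worked illustration of the quadratic heuristic, the Python sketch below computes how much cost grows when realism is scaled by a given factor. The function name `relative_cost` and the idea of measuring realism as a single number are assumptions for illustration, and the exponent of 2 is the rule of thumb from these notes rather than a measured law.

```python
def relative_cost(realism_ratio: float, exponent: float = 2.0) -> float:
    """Cost multiplier when realism is scaled by realism_ratio,
    assuming cost is proportional to realism raised to `exponent`."""
    if realism_ratio <= 0:
        raise ValueError("realism_ratio must be positive")
    return realism_ratio ** exponent


for ratio in (1.0, 1.5, 2.0, 3.0):
    print(ratio, relative_cost(ratio))
# 1.0 1.0
# 1.5 2.25
# 2.0 4.0
# 3.0 9.0
```

Under this heuristic, doubling realism quadruples cost, which is why modest reductions in target realism can free a large share of the budget.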