Mixed Reality Lecture Notes
Mixed Reality for Immersive Experience
Immersive Technologies
- Immersive technologies include:
- Augmented Reality (AR)
- Virtual Reality (VR)
- Mixed Reality (MR)
- Virtual Environment
Mixed Reality (MR)
- MR merges the real environment with virtual content. It covers the range between the purely real world and a fully virtual environment, which includes augmented reality.
- Reference: Milgram and Kishino, "A Taxonomy of Mixed Reality Visual Displays," 1994, IEICE Transactions on Information and Systems.
Lecture Coverage
- Marker-based Augmented Reality (AR)
- Markerless Augmented Reality (AR)
- Visual SLAM
Marker-based Augmented Reality
- Process:
- Video stream from camera.
- Image converted to binary image.
- Black marker identified.
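The binarisation and marker-identification steps can be sketched as follows. This is only an illustration in plain Python: the fixed threshold of 128, the function names `binarize` and `dark_bounding_box`, and the tiny hand-made frame are assumptions, and a real tracker would go on to fit the marker's square outline rather than stop at a bounding box.

```python
def binarize(gray, threshold=128):
    """Return a binary image: 1 where a pixel is darker than the threshold."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def dark_bounding_box(binary):
    """Return (top, left, bottom, right) enclosing all dark pixels, or None."""
    coords = [(r, c)
              for r, row in enumerate(binary)
              for c, value in enumerate(row) if value]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return min(rows), min(cols), max(rows), max(cols)


# A 5x6 grayscale frame; the dark 3x3 block stands in for the black marker.
frame = [
    [200, 210, 205, 220, 215, 210],
    [205,  20,  25,  30, 210, 200],
    [210,  15,  10,  20, 205, 215],
    [200,  30,  25,  15, 220, 210],
    [215, 205, 210, 200, 205, 220],
]
binary = binarize(frame)
box = dark_bounding_box(binary)
print(box)
```

A fixed threshold is the simplest choice; under uneven lighting an adaptive threshold would usually be preferred.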
Marker Detection
- Why is it easy to detect the marker?
- Simple computation.
- Relies on edge and corner detection.
- Surface color discontinuity.
- Illumination discontinuity.
Marker Pose Calculation
- Position and orientation of the marker relative to the camera are calculated.
- T = {P, R}
- The 3D transformation T consists of the position P and the orientation R.
Virtual Object Rendering
- Use the transformation (T) of the marker to position and orient the 3D virtual object.
- T = {P, R}
- The virtual object is rendered in the video frame.
- Marker registration.
- Augmentation at the marker origin.
- Marker tracking.
- Marker coordinates: (X_m, Y_m, Z_m)
- Camera coordinates: (X_c, Y_c, Z_c)
- Ideal screen coordinates: (x_c, y_c)
- Image distortion function: maps ideal screen coordinates to observed screen coordinates.
- Observed screen coordinates: (x_d, y_d)
- Registration error: incorrect pose (localisation and orientation) estimation during the tracking process.
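The coordinate chain above (marker coordinates to camera coordinates to ideal screen coordinates) can be sketched in a few lines. The sketch assumes a simple pinhole model with a single focal length, ignores the image distortion function, and measures registration error as a pixel distance between an observed and a predicted point; the function names and the sample numbers are illustrative only.

```python
import math


def marker_to_camera(point_m, R, P):
    """Apply T = {P, R}: X_c = R * X_m + P."""
    return tuple(
        sum(R[i][j] * point_m[j] for j in range(3)) + P[i]
        for i in range(3)
    )


def camera_to_ideal_screen(point_c, focal_length):
    """Perspective projection onto the ideal (undistorted) image plane."""
    X, Y, Z = point_c
    if Z <= 0:
        raise ValueError("point is behind the camera")
    return focal_length * X / Z, focal_length * Y / Z


def registration_error(observed, predicted):
    """Pixel distance between observed and predicted screen positions."""
    return math.hypot(observed[0] - predicted[0], observed[1] - predicted[1])


# Assumed pose: marker rotated 90 degrees about Z, 2 units in front of the camera.
R = [[0.0, -1.0, 0.0],
     [1.0,  0.0, 0.0],
     [0.0,  0.0, 1.0]]
P = (0.0, 0.0, 2.0)

corner_m = (0.5, 0.0, 0.0)                  # a point in marker coordinates
corner_c = marker_to_camera(corner_m, R, P)
ideal = camera_to_ideal_screen(corner_c, focal_length=800.0)
error = registration_error((3.0, 204.0), ideal)
print(corner_c, ideal, error)
```

If the estimated pose is wrong, the predicted screen position drifts away from where the marker corner is actually observed, which is what the viewer perceives as misregistration.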
Advantages of Marker-based AR
- Easy to use and implement.
- Efficient and real-time performance (low latency).
- Feature-based tracking, which is very stable.
Disadvantages of Marker-based AR
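- If the camera moves away from the marker so that it is no longer detected, the virtual content disappears.
- Detection fails when light reflecting off the marker (glare) washes out its contrast.
- Marker must have strong borders and contrast.
- Detection fails when the marker is partially occluded.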
Image-based Augmented Reality
- Using the marker as an image.
- Feature detection algorithm.
- Marker-based Revisited: Continuous tracking and tracking stability
Challenges in Image-based AR
- Continuous tracking and tracking stability are challenging.
- The system must keep track of the feature points from each frame to the next.
- It must also track the image pose over time, so that outliers in the pose calculation (pose estimation) can be detected.
- If the frame rate is slow, the pose may change significantly between frames (augmentation “jumps”).
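One way to picture outlier handling and "jump" suppression is a small pose filter. The sketch below is an assumption, not a method given in the lecture: it looks only at the position part of the pose, rejects any frame whose position moves more than a hypothetical `max_jump` (in metres) from the previous one, and otherwise blends old and new positions with a hypothetical smoothing weight `alpha`.

```python
import math


def filter_pose(prev_position, new_position, max_jump=0.05, alpha=0.5):
    """Return (position, rejected).

    A new estimate farther than max_jump from the previous one is treated
    as an outlier and the previous position is kept. Otherwise the two are
    blended to damp frame-to-frame jitter.
    """
    if math.dist(prev_position, new_position) > max_jump:
        return tuple(prev_position), True
    blended = tuple(alpha * n + (1.0 - alpha) * p
                    for p, n in zip(prev_position, new_position))
    return blended, False


previous = (0.0, 0.0, 1.0)
smooth, rejected_small = filter_pose(previous, (0.02, 0.0, 1.0))
held, rejected_large = filter_pose(previous, (0.5, 0.0, 1.0))
print(smooth, rejected_small)
print(held, rejected_large)
```

The trade-off is visible in `max_jump`: at low frame rates genuine motion can exceed the threshold, so a filter like this may hold a stale pose instead of following the target.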
Image-based AR Process
- Video stream from camera.
- Continuous feature detection.
- Pose calculation.
- Use the transformation (T) of the target image to position and orient the 3D virtual object.
- Target image registration.
- The virtual object is rendered in the video frame.
- Augmentation in real world.
- T = {P, R}
Marker-less Augmented Reality
- Optical Tracking
- Marker tracking (e.g. ARToolKit square markers or known features in an image).
- Available for more than 10 years.
Marker-less AR - Optical Tracking Types
- Unprepared tracking: tracking in unknown environment (e.g. visual SLAM tracking).
- SLAM (Simultaneous Localization and Mapping): this is a very important problem in mobile robotics.
Visual SLAM
- Early SLAM system (1986-now).
- Using computer vision and sensors.
- Using cameras only, such as stereo view.
- MonoSLAM (single camera) developed in 2007.
Visual SLAM Steps
- Step 1: Tracking a set of points through camera frames.
- Step 2: Using these tracks to triangulate their 3D position.
- Step 3: Simultaneously use the estimated point locations to calculate the camera pose from which the points could have been observed.
- Observing enough points can solve both structure and motion (camera path and scene structure).
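Step 2 (triangulation) can be illustrated with the midpoint method: given two camera centres and the viewing rays towards the same tracked point, take the midpoint of the shortest segment between the two rays. This is a minimal sketch that assumes the camera centres and ray directions are already known; a full SLAM system estimates them jointly with the points, which is not shown here. The function name `triangulate_midpoint` is illustrative.

```python
def triangulate_midpoint(c1, d1, c2, d2):
    """Estimate a 3D point from two rays c1 + s*d1 and c2 + t*d2."""
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    w0 = tuple(a - b for a, b in zip(c1, c2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w0), dot(d2, w0)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; the point cannot be triangulated")
    s = (b * e - c * d) / denom          # parameter along the first ray
    t = (a * e - b * d) / denom          # parameter along the second ray
    p1 = tuple(ci + s * di for ci, di in zip(c1, d1))
    p2 = tuple(ci + t * di for ci, di in zip(c2, d2))
    return tuple((x + y) / 2.0 for x, y in zip(p1, p2))


# Two camera positions 2 units apart, both observing a point at (0, 0, 4).
point = triangulate_midpoint(
    c1=(-1.0, 0.0, 0.0), d1=(1.0, 0.0, 4.0),
    c2=(1.0, 0.0, 0.0), d2=(-1.0, 0.0, 4.0),
)
print(point)
```

The parallel-ray check reflects a practical constraint: the camera must translate between views, otherwise the depth of the point cannot be recovered.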
Challenges for Visual SLAM
- Assumes the camera moves through an unchanged (static) scene.
- Not suitable for person tracking or gesture tracking.
- Outdoor tracking is difficult.
Mixed Reality Components using HoloLens Example
- See-through display.
- Aspect Ratio: 3:2
- Resolution: 2K
- Display Rate: 120-240Hz
- Depth camera
- Image sensors
- Short and long-throw IR illuminators
- Inertial Measurement Unit (IMU)
- Light Engine
- Color video camera
- 4 gray-scale cameras
- See-through Holographic Lens
Sensors Calibration
- Intrinsic properties (Optical Centre, scaling):
$$\begin{bmatrix} f_x & 0 & C_x \\ 0 & f_y & C_y \\ 0 & 0 & 1 \end{bmatrix}$$
- Estimates the camera parameters.
- Extrinsic properties (Camera Rotation and translation):
$$\begin{bmatrix} r_{11} & r_{12} & r_{13} & t_1 \\ r_{21} & r_{22} & r_{23} & t_2 \\ r_{31} & r_{32} & r_{33} & t_3 \end{bmatrix}$$
$$\begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
- HoloLens Coordinate System (Zw, Yw)
- World Coordinate System (yw)
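As a worked equation, the intrinsic and extrinsic properties are commonly combined in the pinhole projection model. The symbols s (a scale factor), (u, v) (pixel coordinates) and (X_w, Y_w, Z_w) (a point in world coordinates) are assumptions introduced here for illustration; the matrix entries follow the notation above.

```latex
s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
=
\begin{bmatrix} f_x & 0 & C_x \\ 0 & f_y & C_y \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix}
  r_{11} & r_{12} & r_{13} & t_1 \\
  r_{21} & r_{22} & r_{23} & t_2 \\
  r_{31} & r_{32} & r_{33} & t_3
\end{bmatrix}
\begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix}

% Writing (X_c, Y_c, Z_c) for the point after the extrinsic transform:
% s = Z_c, \quad u = f_x \frac{X_c}{Z_c} + C_x, \quad v = f_y \frac{Y_c}{Z_c} + C_y
```

The extrinsic matrix first moves the point into the camera frame, and the intrinsic matrix then scales by the focal lengths and shifts by the optical centre.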
Spatial Mapping
- Definition: the process of a mixed reality device mapping the real space, for the device to create an understanding of it.
- A mesh is created that lays over the real environment. A mesh looks like a series of triangles placed together, like a fishing net.
- This is done through computational geometry and computer vision (visual SLAM).
Spatial Mapping Usage
- Visualisation and navigation: position and display the virtual object correctly, and give a virtual object, agent, or character the ability to navigate around the real space.
- Physics and occlusion: perform physics simulation (e.g. a virtual object can bounce across the floor) and hide virtual content that lies behind real surfaces.
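Both uses can be pictured with very small functions once the spatial mesh supplies a floor height and a real-surface depth per pixel. The sketch below is an assumption-laden illustration rather than how any particular device implements it: `step_ball` is a basic explicit Euler step with a hypothetical restitution factor, and `is_occluded` is a plain depth comparison.

```python
def step_ball(height, velocity, floor_height, dt=0.02,
              gravity=-9.81, restitution=0.6):
    """Advance a falling virtual ball by one time step and bounce on the floor."""
    velocity += gravity * dt
    height += velocity * dt
    if height < floor_height:
        height = floor_height                # clamp to the mapped floor
        velocity = -velocity * restitution   # reverse and damp the velocity
    return height, velocity


def is_occluded(virtual_depth, real_depth):
    """A virtual point is hidden if a real surface is closer to the viewer."""
    return real_depth < virtual_depth


# Drop a ball from 1 m above a floor that the spatial mesh places at height 0.
height, velocity = 1.0, 0.0
heights, bounced = [], False
for _ in range(200):
    height, velocity = step_ball(height, velocity, floor_height=0.0)
    heights.append(height)
    bounced = bounced or velocity > 0.0

print(min(heights), bounced)
print(is_occluded(virtual_depth=2.0, real_depth=1.5))
```

In practice the floor height and real-surface depth would be queried from the reconstructed mesh rather than passed in as constants.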
Mapping Recognition
- The process of mapping, registration, and recognition of non-static elements of the real world, which allows interaction between the real world and virtual objects.
- The user's hands are recognised and interpreted as left-hand and right-hand skeletal models.
- Five colliders are attached to the five fingertips of each hand skeletal model.
- Microsoft HoloLens
Collider Details
- The collider is a sphere collider, which can be visually rendered to provide better cues for near targeting.
- The sphere's diameter should match the thickness of the index finger to increase touch accuracy.
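A fingertip sphere collider touching a virtual object can be sketched as a sphere-versus-box test. The example assumes an axis-aligned box standing in for the object's bounds and a fingertip diameter of 16 mm; both values and the function name `sphere_touches_box` are illustrative, not taken from any device documentation.

```python
def sphere_touches_box(center, radius, box_min, box_max):
    """Return True if a sphere overlaps an axis-aligned box."""
    # Closest point on the box to the sphere centre, found by clamping per axis.
    closest = tuple(min(max(c, lo), hi)
                    for c, lo, hi in zip(center, box_min, box_max))
    dist_sq = sum((c - q) ** 2 for c, q in zip(center, closest))
    return dist_sq <= radius ** 2


FINGER_DIAMETER = 0.016          # assumed index-finger thickness in metres
radius = FINGER_DIAMETER / 2.0

box_min, box_max = (0.0, 0.0, 0.0), (0.1, 0.1, 0.1)   # a 10 cm cube

touching = sphere_touches_box((0.105, 0.05, 0.05), radius, box_min, box_max)
apart = sphere_touches_box((0.12, 0.05, 0.05), radius, box_min, box_max)
print(touching, apart)
```

A collider much larger than the finger would register touches before the finger visibly reaches the object, which is one reason to match the diameter to the finger's thickness.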
Interaction Models
- Direct interaction: using all 10 collidable fingertips can cause unexpected and unpredictable collisions.
- 3D object manipulation using a bounding box.
- The bounding box provides better depth perception through its proximity shader.
- Gaze and head interactions (eye and head tracking).
- Voice-based interaction
References
- Rokhsaritalemi, Somaiieh, Abolghasem Sadeghi-Niaraki, and Soo-Mi Choi. "A review on mixed reality: Current trends, challenges and prospects." Applied Sciences 10.2 (2020): 636.
- Speicher, Maximilian, Brian D. Hall, and Michael Nebeling. "What is mixed reality?" Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. 2019.
- Kruijff, Ernst, J. Edward Swan, and Steven Feiner. "Perceptual issues in augmented reality revisited." 2010 IEEE International Symposium on Mixed and Augmented Reality. IEEE, 2010.