Virtual Machines and Virtualization Technologies

Introduction to Virtual Machines and Abstraction Layers

  • Definition of Virtualization: Managing computing resources by providing a software translation layer, known as an abstraction layer, between the software and the physical hardware.
  • Resource Transformation: Virtualization turns physical resources into logical, or virtual, resources.
  • User Transparency: It allows the use of resources without requiring the user or software to be aware of the physical details of the underlying resources.
  • Fundamental Idea: Abstract the hardware of a single computer into several different execution environments.
  • Layered Approach: Similar to the layered approach to operating system design, except that the layer creates a virtual system (a virtual machine, or VM) on which operating systems or applications can run.

Principal Components of Virtual Machine Systems

  • Host System: The underlying physical hardware system.
  • Virtual Machine Manager (VMM) or Hypervisor:
      * The software that creates and runs virtual machines.
      * Provides an interface to the guest that is identical to the host.
      * Exception: In the case of paravirtualization, the interface is not identical.
  • Guest System: The process provided with a virtual copy of the host.
      * This is usually an operating system.
      * Example: A guest Windows OS running in a VM on top of a Linux host OS.
  • Virtual Machine Stack Structure:
      * Non-virtual machine: Hardware $\rightarrow$ Operating System $\rightarrow$ Applications.
      * Virtual machine: Hardware Platform $\rightarrow$ Virtualizing Software $\rightarrow$ Guest OS $\rightarrow$ Libraries $\rightarrow$ Applications.

Organizational Reasons for Using Virtualization

  • Legacy Hardware: Applications built for legacy hardware can still be run without maintaining the original hardware.
  • Rapid Deployment: A new VM may be deployed in a matter of minutes.
  • Versatility: Increases the variety of applications that a single computer can run.
  • Consolidation: Shares a single physical resource among multiple applications simultaneously.
  • Aggregating: Makes it easy to combine multiple resources into one virtual resource.
  • Dynamics: Hardware resources can be easily allocated in a dynamic fashion.
  • Ease of Management: Facilitates the deployment and testing of software.
  • Increased Availability: In case of a physical server failure, VMs on the failed host can be quickly and automatically restarted on another host.

Hypervisor Characteristics and Functions

  • Software Construct: A virtual machine is a software construct that mimics the characteristics of a physical server.
  • Configuration: VMs are configured with a specific number of processors, amount of RAM, storage resources, and network port connectivity.
  • Operation: Once created, a VM can be powered on like a physical server.
  • Resource Proxy: The hypervisor acts as a proxy for the guests (VMs) as they request and consume physical host resources.
  • Core Hypervisor Functions:
      * Execution management of VMs.
      * Device emulation and access control.
      * Execution of privileged operations by the hypervisor for guest VMs.
      * Management of VMs (also known as VM lifecycle management).
      * Administration of the hypervisor platform and hypervisor software.
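The configuration and lifecycle-management points above can be sketched as a small state machine. This is a minimal illustration only: the class name, the state names, and the set of allowed transitions are assumptions made for the example, not the model of any particular hypervisor.

```python
from dataclasses import dataclass
from enum import Enum


class VMState(Enum):
    DEFINED = "defined"      # configured but never started
    RUNNING = "running"
    SUSPENDED = "suspended"
    STOPPED = "stopped"


# Allowed lifecycle transitions (an illustrative choice, not a standard).
TRANSITIONS = {
    VMState.DEFINED: {VMState.RUNNING},
    VMState.RUNNING: {VMState.SUSPENDED, VMState.STOPPED},
    VMState.SUSPENDED: {VMState.RUNNING, VMState.STOPPED},
    VMState.STOPPED: {VMState.RUNNING},
}


@dataclass
class VirtualMachine:
    """Hypothetical VM record: processors, RAM, storage, network ports."""
    name: str
    vcpus: int
    ram_mb: int
    disk_gb: int
    nics: int = 1
    state: VMState = VMState.DEFINED

    def _transition(self, new_state: VMState) -> None:
        # Reject any move that the transition table does not list.
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(
                f"cannot go from {self.state.value} to {new_state.value}"
            )
        self.state = new_state

    def power_on(self) -> None:
        self._transition(VMState.RUNNING)

    def suspend(self) -> None:
        self._transition(VMState.SUSPENDED)

    def power_off(self) -> None:
        self._transition(VMState.STOPPED)


if __name__ == "__main__":
    vm = VirtualMachine("web01", vcpus=2, ram_mb=1024, disk_gb=20)
    vm.power_on()
    vm.suspend()
    print(vm.name, vm.state.value)
```

Keeping the transitions in one table makes it easy to see which operations a management layer would have to reject, such as suspending a VM that is already powered off.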

Comparison of Type 1 and Type 2 Hypervisors

  • Type 1 Hypervisor (Bare Metal):
      * Loaded as a software layer directly onto a physical server, similar to how an OS is loaded.
      * Directly controls the physical resources of the host.
      * Supports the movement of VMs to new hosts without service interruption.
      * Examples: VMware ESXi (vSphere), Microsoft Hyper-V, Oracle VM Server, and KVM.
  • Type 2 Hypervisor (Hosted):
      * Exploits the resources and functions of a host OS and runs as a software module on top of that OS.
      * Relies on the host OS to handle all hardware interactions on its behalf.
      * Examples: VMware Workstation, VMware Fusion (macOS), and Oracle VM VirtualBox.
  • Performance and Security Differences:
      * Performance: Type 1 hypervisors generally perform better because they do not compete for resources with a host OS.
      * Capacity: More VMs can typically be hosted on a Type 1 hypervisor.
      * Security: Type 1 is considered more secure because resource requests are handled external to the guest, preventing one VM from affecting others.
      * Flexibility (Type 2): Type 2 allows users to use virtualization without dedicating a server solely to that function. It is ideal for developers needing multiple environments alongside a personal PC workspace.
      * Risk (Type 2): A malicious guest on a Type 2 hypervisor could potentially affect more than just itself.

Benefits and Advanced Features of Virtualization

  • Protection: Host systems are protected from VMs, and VMs are protected from each other (e.g., viruses are less likely to spread).
  • Sharing: Provided through shared file system volumes and network communication.
  • State Management:
      * Freeze/Suspend: A running VM can be frozen, moved or copied, and resumed elsewhere.
      * Snapshot: Captures a given state, allowing the user to restore back to that exact state. Some VMMs allow multiple snapshots per VM.
      * Cloning: Creating a copy and running both the original and the copy.
  • Research and Development: Beneficial for OS research and system development efficiency by running multiple different OSes on one machine.
  • Templating: Creating an OS + application VM to provide to customers or use for creating multiple identical instances.
  • Live Migration: Moving a running VM from one host to another with no interruption of user access.
  • Cloud Computing: The combination of these features allows programs to use APIs to tell infrastructure to create new guests, VMs, and virtual desktops.
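The snapshot and cloning operations listed under State Management can be illustrated with a toy model in which a VM's entire state is a Python dictionary. The `ToyVM` class and its method names are hypothetical; a real VMM captures memory, device, and disk state, usually with copy-on-write rather than full copies.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class ToyVM:
    """Hypothetical VM whose whole state is one dictionary."""
    name: str
    state: dict = field(default_factory=dict)
    snapshots: dict = field(default_factory=dict)

    def snapshot(self, label: str) -> None:
        # Deep-copy so later changes to the VM do not alter the snapshot.
        self.snapshots[label] = copy.deepcopy(self.state)

    def restore(self, label: str) -> None:
        # Return to exactly the captured state; the snapshot stays reusable.
        self.state = copy.deepcopy(self.snapshots[label])

    def clone(self, new_name: str) -> "ToyVM":
        # The clone starts from the same state but evolves independently.
        return ToyVM(
            new_name,
            copy.deepcopy(self.state),
            copy.deepcopy(self.snapshots),
        )


if __name__ == "__main__":
    vm = ToyVM("dev", {"packages": ["base"]})
    vm.snapshot("clean")
    vm.state["packages"].append("experimental-driver")
    vm.restore("clean")
    print(vm.state)
```

Supporting several labels per VM mirrors the note that some VMMs allow multiple snapshots.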

Paravirtualization and Hardware-Assisted Virtualization

  • Paravirtualization:
      * A software-assisted technique using specialized APIs to link VMs with the hypervisor for optimized performance.
      * The guest OS (e.g., Linux or Windows) must have specialized paravirtualization support in the kernel.
      * Specific paravirtualization drivers allow the OS and hypervisor to work together, reducing the overhead of hypervisor translations.
  • Hardware-Assisted Virtualization:
      * AMD and Intel added functionality to processors to enhance hypervisor performance.
      * Extensions include AMD-V and Intel VT-x.
      * Intel processors offer Virtual Machine Extensions (VMX), an extra instruction set.
      * Benefits: Codebases can be smaller and more efficient, and operations occur much faster directly on the processor.
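On an x86 Linux system, one common way to see whether the processor advertises these extensions is to look at the `flags` lines of `/proc/cpuinfo`, where `vmx` indicates Intel VT-x and `svm` indicates AMD-V. The helper below is a small sketch of that check; the function name is made up for this example, and it only reports what the CPU advertises, not whether the feature is enabled in firmware.

```python
def virtualization_extensions(cpuinfo_text: str) -> set[str]:
    """Return the hardware virtualization extensions named in cpuinfo text."""
    found = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() != "flags":
            continue
        flags = value.split()
        if "vmx" in flags:   # Intel Virtual Machine Extensions
            found.add("Intel VT-x")
        if "svm" in flags:   # AMD Secure Virtual Machine
            found.add("AMD-V")
    return found


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(virtualization_extensions(f.read()) or "none reported")
    except FileNotFoundError:
        print("/proc/cpuinfo not available on this system")
```

Taking the file contents as a string rather than opening the file inside the function keeps the parsing logic usable on systems without `/proc`.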

Virtual Appliances

  • Definition: Standalone software distributed as a virtual machine image.
  • Composition: A packaged set of applications and a guest OS.
  • Independence: Independent of hypervisor or processor architecture; can run on Type 1 or Type 2.
  • Ease of Use: Deploying a pre-installed/pre-configured appliance is easier than manual installation and configuration.
  • Software Distribution: Considered a de facto means of software distribution.
  • Security Virtual Appliance (SVA): Specialized appliance for monitoring and protecting other VMs.

Container Virtualization

  • Approach: Software known as a virtualization container runs on top of the host OS kernel to provide isolated execution environments.
  • Differences from VMs: Containers do not emulate physical servers; all containerized applications on a host share a common OS kernel.
  • Overhead: Eliminates the need for a separate guest OS for every application, greatly reducing overhead.
  • Linux Container Phases:
      * Setup: Creating the environment to create and start containers.
      * Configuration: Configuring containers to run specific applications or commands.
      * Management: Seamlessly bootstrapping (starting) and shutting down the container.
  • Characteristics: No need for a guest OS; management software simplifies creation.

Kernel Control Groups (Cgroups) and Data Flow

  • Cgroups Provisions:
      * Resource Limiting: Groups cannot exceed configured memory limits.
      * Prioritization: Specific groups receive larger shares of CPU utilization or disk I/O throughput.
      * Accounting: Measures resource usage for purposes like billing.
      * Control: Freezing groups of processes, checkpointing, and restarting them.
  • I/O Data Flow Comparison:
      * Hypervisor: App $\rightarrow$ Guest OS device driver $\rightarrow$ Virtual I/O device $\rightarrow$ Hypervisor interception $\rightarrow$ Physical device driver $\rightarrow$ Physical I/O device.
      * Container: App $\rightarrow$ Indirection through kernel control groups $\rightarrow$ Physical device driver $\rightarrow$ Physical I/O device.
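The resource limiting, prioritization, and accounting provisions of cgroups can be illustrated with a small simulation. This is a sketch under simplifying assumptions: CPU time is divided strictly in proportion to per-group weights when every group is busy, and a memory request that would exceed the limit is simply refused (a real kernel would first try to reclaim memory and may kill a process instead). The function and class names are hypothetical.

```python
def cpu_shares(weights: dict[str, int]) -> dict[str, float]:
    """Fraction of CPU each group gets when all groups are busy."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one group needs a positive weight")
    return {group: weight / total for group, weight in weights.items()}


class MemoryLimitedGroup:
    """Toy group that tracks usage against a configured memory limit."""

    def __init__(self, name: str, limit_bytes: int) -> None:
        self.name = name
        self.limit_bytes = limit_bytes
        self.usage_bytes = 0  # accounting: how much the group has consumed

    def charge(self, nbytes: int) -> bool:
        # Refuse any allocation that would push usage past the limit.
        if self.usage_bytes + nbytes > self.limit_bytes:
            return False
        self.usage_bytes += nbytes
        return True


if __name__ == "__main__":
    print(cpu_shares({"database": 300, "web": 100}))
    group = MemoryLimitedGroup("web", limit_bytes=100)
    print(group.charge(60), group.charge(50), group.usage_bytes)
```

The `usage_bytes` counter doubles as the accounting figure that could feed a billing report.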

Disadvantages and File Systems of Containers

  • Disadvantages:
      * Portability: Applications are only portable across systems supporting the same OS kernel and virtualization features (typically Linux).
      * Kernel Limitations: If an application requires a kernel setup different from that of the host, a container cannot provide it; a VM can, because it runs its own guest OS.
      * Security: Containerization sits between the OS and applications; it has lower overhead but potentially introduces greater security vulnerabilities than hardware-level VM isolation, because all containers share the host kernel.
  • Container File System:
      * Each container maintains its own isolated file system.
      * At the disk level, a container is a file and can easily be scaled up or down.
      * Virus Checking: The file system is mounted under a special mount point on the hardware node so system tools can safely check every file.

Microservices and Docker

  • Microservices: Smaller deployable units that enable quicker updates and precise scalability.
  • Docker Overview: Provides a simple, standardized way to run containers. One reason it became more popular than competitors such as LXC is that it loads a container image onto a host OS simply and quickly.
  • Storage: Docker containers are stored as images, typically in a cloud-hosted registry, and are called up for execution when needed.
  • Principal Docker Components:
      * Docker Image: A read-only template from which containers are instantiated.
      * Docker Client: Requests an image to create a new container.
      * Docker Host: Platform with its own host OS executing containerized apps.
      * Docker Engine: Lightweight runtime package that builds and runs containers.
      * Docker Machine: Installs the engine on a host and configures the client to communicate with the engine.
      * Docker Registry: Stores Docker images.
      * Docker Hub: A collaboration platform and public repository for sharing and contributing custom images.
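The relationship between the registry, the engine, read-only images, and containers listed above can be sketched with a toy model. None of these classes correspond to Docker's real API; every name here is hypothetical, and the per-container writable layer is included only to show why many containers can share one unmodified image.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass(frozen=True)
class Image:
    """Read-only template; frozen so it cannot be modified after creation."""
    name: str
    layers: tuple


@dataclass
class Container:
    """Running instance: a shared image plus its own writable state."""
    id: int
    image: Image
    writable_layer: dict = field(default_factory=dict)


class Registry:
    """Stores images by name (stand-in for a registry such as Docker Hub)."""

    def __init__(self) -> None:
        self._images = {}

    def push(self, image: Image) -> None:
        self._images[image.name] = image

    def pull(self, name: str) -> Image:
        return self._images[name]


class Engine:
    """Runs on the host: fetches images on demand and creates containers."""

    def __init__(self, registry: Registry) -> None:
        self.registry = registry
        self.local_images = {}
        self.containers = []
        self._ids = count(1)

    def run(self, image_name: str) -> Container:
        # Pull from the registry only if the image is not cached locally.
        if image_name not in self.local_images:
            self.local_images[image_name] = self.registry.pull(image_name)
        container = Container(next(self._ids), self.local_images[image_name])
        self.containers.append(container)
        return container


if __name__ == "__main__":
    registry = Registry()
    registry.push(Image("web-app", ("base-os-libs", "runtime", "app-code")))
    engine = Engine(registry)   # the client role is played by this script
    a, b = engine.run("web-app"), engine.run("web-app")
    print(a.id, b.id, a.image is b.image)
```

In this model the script itself plays the role of the client: it asks the engine for a container by image name and never touches the registry directly.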

Resource Management Issues

  • Processor Strategies:
      1. Software Emulation: Emulating a chip in software. This is platform-independent and transportable, but resource-intensive and inefficient.
      2. Time-Segmenting: Providing segments of processing time on the physical processors (pCPUs) to the virtual processors of VMs.
  • Memory Management:
      * Focuses on managing the physical resource rather than creating a virtual entity.
      * VMs are usually configured with fewer resources than the host (e.g., a physical server with 8 GB of RAM might host a VM with 1 GB of RAM).
      * Techniques: Translation tables, page sharing, ballooning, and memory overcommit.
  • I/O Management:
      * The OS in the VM calls the device driver as if it were running on a physical server.
      * The driver connects with the emulated device managed by the hypervisor, which then communicates with the physical NIC driver and NIC.
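Two of the memory techniques named above, memory overcommit and page sharing, come down to simple arithmetic, sketched below. The function names are hypothetical, and the page-sharing model is deliberately simplified: it treats pages with equal content hashes as shareable, whereas real hypervisors typically confirm that the contents match and use copy-on-write when a shared page is modified.

```python
import hashlib


def overcommit_ratio(host_ram_gb: float, vm_ram_gb: list[float]) -> float:
    """Total RAM promised to VMs divided by physical RAM on the host."""
    return sum(vm_ram_gb) / host_ram_gb


def shared_page_count(vm_pages: dict[str, list[bytes]]) -> tuple[int, int]:
    """Return (pages requested by all VMs, physical pages needed)."""
    requested = 0
    unique_hashes = set()
    for pages in vm_pages.values():
        for page in pages:
            requested += 1
            # Identical page contents hash to the same value and are stored once.
            unique_hashes.add(hashlib.sha256(page).digest())
    return requested, len(unique_hashes)


if __name__ == "__main__":
    # Twelve 1 GB VMs on an 8 GB host: a ratio above 1 means overcommit.
    print(overcommit_ratio(8, [1] * 12))
    pages = {
        "vm-a": [b"kernel-code", b"app-a-data"],
        "vm-b": [b"kernel-code", b"app-b-data"],
    }
    print(shared_page_count(pages))
```

Overcommit works in practice only because VMs rarely use all of their configured memory at once; page sharing and ballooning are the mechanisms that recover the difference.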