Virtual Machines and Virtualization Technologies
Introduction to Virtual Machines and Abstraction Layers
- Definition of Virtualization: Managing computing resources by providing a software translation layer, known as an abstraction layer, between the software and the physical hardware.
- Resource Transformation: Virtualization turns physical resources into logical, or virtual, resources.
- User Transparency: It allows the use of resources without requiring the user or software to be aware of the physical details of the underlying resources.
- Fundamental Idea: Abstract the hardware of a single computer into several different execution environments.
- Layered Approach: Similar to the layered approach to operating system design, except that the virtualization layer creates a virtual system (virtual machine, or VM) on which operating systems or applications can run.
Principal Components of Virtual Machine Systems
- Host System: The underlying physical hardware system.
- Virtual Machine Manager (VMM) or Hypervisor:
* The software that creates and runs virtual machines.
* Provides an interface to the guest that is identical to the host.
* Exception: In the case of paravirtualization, the interface is not identical.
- Guest System: The process provided with a virtual copy of the host.
* This is usually an operating system.
* Example: A guest Windows OS running in a VM on top of a Linux host OS.
- Virtual Machine Stack Structure:
* Non-virtual machine: Hardware $\rightarrow$ Operating System $\rightarrow$ Applications.
* Virtual machine: Hardware Platform $\rightarrow$ Virtualizing Software $\rightarrow$ Guest OS $\rightarrow$ Libraries $\rightarrow$ Applications.
Organizational Reasons for Using Virtualization
- Legacy Hardware: Applications built for legacy hardware can still be run without maintaining the original hardware.
- Rapid Deployment: A new VM may be deployed in a matter of minutes.
- Versatility: Maximizing the number of kinds of applications that a single computer can handle.
- Consolidation: A large-capacity or high-speed resource is used more efficiently by sharing it among multiple applications simultaneously.
- Aggregating: Makes it easy to combine multiple resources into one virtual resource.
- Dynamics: Hardware resources can be easily allocated in a dynamic fashion.
- Ease of Management: Facilitates the deployment and testing of software.
- Increased Availability: In case of a physical server failure, VMs on the failed host can be quickly and automatically restarted on another host.
Hypervisor Characteristics and Functions
- Software Construct: A virtual machine is a software construct that mimics the characteristics of a physical server.
- Configuration: VMs are configured with a specific number of processors, amount of RAM, storage resources, and network port connectivity.
- Operation: Once created, a VM can be powered on like a physical server.
- Resource Proxy: The hypervisor acts as a proxy for the guests (VMs) as they request and consume physical host resources.
- Core Hypervisor Functions:
* Execution management of VMs.
* Device emulation and access control.
* Execution of privileged operations by the hypervisor for guest VMs.
* Management of VMs (also known as VM lifecycle management).
* Administration of the hypervisor platform and hypervisor software.
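The proxy role and lifecycle management described above can be sketched as a toy model. This is only an illustration under stated assumptions: `VMConfig` and `ToyHypervisor` are hypothetical names, only RAM is tracked, and the model refuses any request the host cannot fully back, whereas real hypervisors may overcommit memory.

```python
from dataclasses import dataclass, field


@dataclass
class VMConfig:
    """Hypothetical VM definition with two of the settings named in the notes."""
    name: str
    vcpus: int
    ram_gb: int


@dataclass
class ToyHypervisor:
    """Toy model: tracks host RAM and brokers guest requests for it."""
    host_ram_gb: int
    vms: dict = field(default_factory=dict)    # name -> VMConfig
    running: set = field(default_factory=set)  # names of powered-on VMs

    def ram_in_use(self) -> int:
        return sum(self.vms[n].ram_gb for n in self.running)

    def create(self, cfg: VMConfig) -> None:
        # Lifecycle management: defining a VM does not consume RAM yet.
        self.vms[cfg.name] = cfg

    def power_on(self, name: str) -> bool:
        # Access control: refuse if the host cannot back the request.
        cfg = self.vms[name]
        if self.ram_in_use() + cfg.ram_gb > self.host_ram_gb:
            return False
        self.running.add(name)
        return True

    def power_off(self, name: str) -> None:
        self.running.discard(name)


hv = ToyHypervisor(host_ram_gb=8)
hv.create(VMConfig("web", vcpus=2, ram_gb=4))
hv.create(VMConfig("db", vcpus=2, ram_gb=4))
hv.create(VMConfig("cache", vcpus=1, ram_gb=2))
results = [hv.power_on(n) for n in ("web", "db", "cache")]
print(results)  # [True, True, False]
```

The third VM is refused because the first two already consume all 8 GB; powering one of them off would free enough RAM for it to start.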
Comparison of Type 1 and Type 2 Hypervisors
- Type 1 Hypervisor (Bare Metal):
* Loaded as a software layer directly onto a physical server, similar to how an OS is loaded.
* Directly controls the physical resources of the host.
* Supports the movement of VMs to new hosts without service interruption.
* Examples: VMware ESXi (vSphere), Microsoft Hyper-V, Oracle VM Server, and KVM.
- Type 2 Hypervisor (Hosted):
* Exploits the resources and functions of a host OS and runs as a software module on top of that OS.
* Relies on the host OS to handle all hardware interactions on its behalf.
* Examples: VMware Workstation, VMware Fusion (macOS), and Oracle VM VirtualBox.
- Performance and Security Differences:
* Performance: Type 1 hypervisors generally perform better because they do not compete for resources with a host OS.
* Capacity: More VMs can typically be hosted on a Type 1 hypervisor.
* Security: Type 1 is considered more secure because resource requests are handled external to the guest, preventing one VM from affecting others.
* Flexibility (Type 2): Type 2 allows users to use virtualization without dedicating a server solely to that function. It is ideal for developers needing multiple environments alongside a personal PC workspace.
* Risk (Type 2): A malicious guest on a Type 2 hypervisor could potentially affect more than just itself.
Benefits and Advanced Features of Virtualization
- Protection: Host systems are protected from VMs, and VMs are protected from each other (e.g., viruses are less likely to spread).
- Sharing: Provided through shared file system volumes and network communication.
- State Management:
* Freeze/Suspend: A running VM can be frozen, moved or copied, and resumed elsewhere.
* Snapshot: Captures a given state, allowing the user to restore back to that exact state. Some VMMs allow multiple snapshots per VM.
* Cloning: Creating a copy and running both the original and the copy.
- Research and Development: Beneficial for OS research and system development efficiency by running multiple different OSes on one machine.
- Templating: Creating an OS + application VM to provide to customers or use for creating multiple identical instances.
- Live Migration: Moving a running VM from one host to another with no interruption of user access.
- Cloud Computing: The combination of these features allows programs to use APIs to tell infrastructure to create new guests, VMs, and virtual desktops.
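The difference between snapshot, restore, and clone can be shown with a minimal sketch. It assumes, purely for illustration, that a VM's entire state fits in a Python dictionary; `ToyVM` is a hypothetical class, not any VMM's API.

```python
import copy


class ToyVM:
    """Hypothetical VM whose entire state is a dictionary."""

    def __init__(self, name, state=None):
        self.name = name
        self.state = state if state is not None else {}
        self.snapshots = {}  # label -> saved copy of state

    def snapshot(self, label):
        # Deep copy so later changes to the VM do not leak into the snapshot.
        self.snapshots[label] = copy.deepcopy(self.state)

    def restore(self, label):
        # Copy again so the same snapshot can be restored more than once.
        self.state = copy.deepcopy(self.snapshots[label])

    def clone(self, new_name):
        # The clone starts from the current state, then runs independently.
        return ToyVM(new_name, copy.deepcopy(self.state))


vm = ToyVM("guest", {"packages": ["nginx"]})
vm.snapshot("before-upgrade")
vm.state["packages"].append("broken-lib")

twin = vm.clone("guest-copy")
vm.restore("before-upgrade")

print(vm.state)    # {'packages': ['nginx']}
print(twin.state)  # {'packages': ['nginx', 'broken-lib']}
```

Restoring rolls the original back to the captured state, while the clone keeps the state it was copied from and continues on its own.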
Paravirtualization and Hardware-Assisted Virtualization
- Paravirtualization:
* A software-assisted technique using specialized APIs to link VMs with the hypervisor for optimized performance.
* The guest OS (e.g., Linux or Windows) must have specialized paravirtualization support in the kernel.
* Specific paravirtualization drivers allow the OS and hypervisor to work together, reducing the overhead of hypervisor translations.
- Hardware-Assisted Virtualization:
* AMD and Intel added functionality to processors to enhance hypervisor performance.
* Extensions include AMD-V and Intel VT-x.
* Intel processors offer Virtual Machine Extensions (VMX), an extra instruction set.
* Benefits: Codebases can be smaller and more efficient, and operations occur much faster directly on the processor.
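On x86 Linux hosts, the presence of these extensions is reported in the `flags` lines of `/proc/cpuinfo`: `vmx` for Intel VT-x and `svm` for AMD-V. The sketch below parses text in that format; the function name and the sample string are illustrative assumptions.

```python
def virtualization_extension(cpuinfo_text):
    """Return 'Intel VT-x', 'AMD-V', or None from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags = set(value.split())
            if "vmx" in flags:
                return "Intel VT-x"
            if "svm" in flags:
                return "AMD-V"
    return None


# Made-up sample in the same layout as /proc/cpuinfo.
sample = "model name\t: Example CPU\nflags\t\t: fpu vme de pse vmx sse2\n"
print(virtualization_extension(sample))  # Intel VT-x
```

On a real Linux machine the input would be the contents of `/proc/cpuinfo`. A result of `None` does not prove the processor lacks the feature; the flag can also be absent when the extension is disabled in firmware or not exposed to a guest.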
Virtual Appliances
- Definition: Standalone software distributed as a virtual machine image.
- Composition: A packaged set of applications and a guest OS.
- Independence: Independent of hypervisor or processor architecture; can run on Type 1 or Type 2.
- Ease of Use: Deploying a pre-installed/pre-configured appliance is easier than manual installation and configuration.
- Software Distribution: Considered a de facto means of software distribution.
- Security Virtual Appliance (SVA): Specialized appliance for monitoring and protecting other VMs.
Container Virtualization
- Approach: Software known as a virtualization container runs on top of the host OS kernel to provide isolated execution environments.
- Differences from VMs: Containers do not emulate physical servers; all containerized applications on a host share a common OS kernel.
- Overhead: Eliminates the need for a separate guest OS for every application, greatly reducing overhead.
- Linux Container Phases:
* Setup: Creating the environment to create and start containers.
* Configuration: Configuring containers to run specific applications or commands.
* Management: Seamlessly bootstrapping (starting) and shutting down the container.
- Characteristics: No need for a guest OS; management software simplifies creation.
Kernel Control Groups (Cgroups) and Data Flow
- Cgroups Provisions:
* Resource Limiting: Groups cannot exceed configured memory limits.
* Prioritization: Specific groups receive larger shares of CPU utilization or disk I/O throughput.
* Accounting: Measures resource usage for purposes like billing.
* Control: Freezing groups of processes, checkpointing, and restarting them.
- I/O Data Flow Comparison:
* Hypervisor: App $\rightarrow$ Guest OS device driver $\rightarrow$ Virtual I/O device $\rightarrow$ Hypervisor interception $\rightarrow$ Physical device driver $\rightarrow$ Physical I/O device.
* Container: App $\rightarrow$ Indirection through kernel control groups $\rightarrow$ Physical device driver $\rightarrow$ Physical I/O device.
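Two of the cgroup provisions, prioritization and resource limiting, reduce to simple arithmetic. The following toy model is a sketch, not the kernel's mechanism: the function names, group names, and numbers are illustrative assumptions. In Linux cgroup v2 the corresponding settings are exposed as files such as `cpu.weight` and `memory.max`.

```python
def cpu_shares(weights):
    """Split CPU time among groups in proportion to their weights."""
    total = sum(weights.values())
    return {group: w / total for group, w in weights.items()}


def charge_memory(usage, limits, group, amount):
    """Account a memory request; refuse it if the group's limit would be exceeded."""
    if usage.get(group, 0) + amount > limits[group]:
        return False
    usage[group] = usage.get(group, 0) + amount
    return True


weights = {"batch": 100, "web": 300}
print(cpu_shares(weights))  # {'batch': 0.25, 'web': 0.75}

limits = {"batch": 512, "web": 1024}  # MiB, illustrative values
usage = {}
print(charge_memory(usage, limits, "batch", 400))  # True
print(charge_memory(usage, limits, "batch", 200))  # False
print(usage)  # {'batch': 400}
```

The `usage` dictionary doubles as the accounting record. Unlike this model, which simply refuses the request, the kernel responds to a group hitting its memory limit by reclaiming memory or, failing that, invoking the OOM killer within the group.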
Disadvantages and File Systems of Containers
- Disadvantages:
* Portability: Applications are only portable across systems supporting the same OS kernel and virtualization features (typically Linux).
* Kernel Limitations: If an application requires a unique kernel setup, containers cannot provide it because every container shares the host kernel (a VM can, because it runs its own guest OS).
* Security: Containerization sits between the OS and applications; it has lower overhead but potentially introduces greater security vulnerabilities than hypervisor-level VM isolation.
- Container File System:
* Each container maintains its own isolated file system.
* At the disk level, a container is a file and is easily scalable.
* Virus Checking: The file system is mounted under a special mount point on the hardware node so system tools can safely check every file.
Microservices and Docker
- Microservices: Smaller deployable units that enable quicker updates and precise scalability.
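- Docker Overview: Provides a simple, standardized way to run containers; compared with competing container technologies such as LXC, it is noted for loading containers from images in a matter of seconds.
- Storage: Docker containers are stored as images, typically in a cloud-hosted registry, and pulled for execution when needed.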
- Principal Docker Components:
* Docker Image: Read-only templates for instantiating containers.
* Docker Client: Requests an image to create a new container.
* Docker Host: Platform with its own host OS executing containerized apps.
* Docker Engine: Lightweight runtime package that builds and runs containers.
* Docker Machine: Installs the engine on a host and configures the client to communicate with the engine.
* Docker Registry: Stores Docker images.
* Docker Hub: A collaboration platform and public repository for sharing and contributing custom images.
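The relationship between registry, image, and engine can be illustrated with a toy model. None of the classes below are the Docker API; `Image`, `Registry`, and `Engine` are hypothetical stand-ins that only show that one read-only image can back many containers, each with its own writable state.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Image:
    """Read-only template; frozen so instances cannot be mutated."""
    name: str
    layers: tuple


class Registry:
    """Stores images by name."""

    def __init__(self):
        self._images = {}

    def push(self, image):
        self._images[image.name] = image

    def pull(self, name):
        return self._images[name]


class Engine:
    """Runs containers on a host; each gets its own writable state."""

    def __init__(self, registry):
        self.registry = registry
        self.containers = {}
        self._ids = count(1)

    def run(self, image_name):
        image = self.registry.pull(image_name)
        cid = f"c{next(self._ids)}"
        self.containers[cid] = {"image": image, "writable": {}}
        return cid


registry = Registry()
registry.push(Image("web:1.0", layers=("base-os-libs", "app")))

engine = Engine(registry)
a = engine.run("web:1.0")
b = engine.run("web:1.0")
engine.containers[a]["writable"]["/tmp/x"] = "only in a"

print(a, b)  # c1 c2
print(engine.containers[b]["writable"])  # {}
print(engine.containers[a]["image"] is engine.containers[b]["image"])  # True
```

In this model the code calling `engine.run` plays the part of the Docker client: it asks for an image by name and gets back a new container.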
Resource Management Issues
- Processor Strategies:
1. Software Emulation: Emulating a chip in software. It is platform-independent and easily portable, but resource-intensive and inefficient.
2. Time-Segmenting: Providing segments of processing time on physical processors (pCPUs) to the virtual processors of VMs.
- Memory Management:
* Focuses on managing the physical resource rather than creating a virtual entity.
* VMs are usually configured with fewer resources than the host (e.g., a physical server with 8 GB of RAM might host a VM configured with 1 GB of RAM).
* Techniques: Translation tables, Page sharing, Ballooning, and Memory overcommit.
- I/O Management:
* The OS in the VM calls its device driver just as it would on a physical server.
* The driver connects with the emulated device managed by the hypervisor, which then communicates with the physical NIC driver and NIC.
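Time-segmenting and memory overcommit can both be illustrated with a few lines of arithmetic. The sketch below assumes a plain round-robin policy, which the notes do not specify, and uses hypothetical vCPU names and RAM sizes.

```python
from collections import deque


def time_slice(vcpus, pcpus, ticks):
    """Round-robin vCPUs over pCPUs; return ticks received per vCPU."""
    queue = deque(vcpus)
    received = {v: 0 for v in vcpus}
    for _ in range(ticks):
        # Each tick, up to `pcpus` vCPUs run, then go to the back of the queue.
        running = [queue.popleft() for _ in range(min(pcpus, len(queue)))]
        for v in running:
            received[v] += 1
        queue.extend(running)
    return received


def overcommit_ratio(vm_ram_gb, host_ram_gb):
    """Configured guest RAM divided by physical RAM; above 1.0 is overcommit."""
    return sum(vm_ram_gb) / host_ram_gb


print(time_slice(["vm1-v0", "vm1-v1", "vm2-v0"], pcpus=2, ticks=6))
# {'vm1-v0': 4, 'vm1-v1': 4, 'vm2-v0': 4}
print(overcommit_ratio([4, 4, 2], host_ram_gb=8))  # 1.25
```

With three vCPUs sharing two pCPUs, each vCPU runs in two out of every three ticks. A ratio of 1.25 means the guests are configured with 25% more RAM than the host physically has, which works only while techniques such as page sharing and ballooning keep actual usage below the physical total.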