Chapter 1 Notes: Modern Infrastructure and Applications with Docker

Evolution of application architecture

  • Software architectures continually evolve as hardware and software capabilities improve; tech gaps drive new designs.

    • Example: faster network speeds enabled distributing components across servers and even across data centers in multiple countries.

  • Early computing models to frame modern shifts:

    • Baseline: the mainframe era (before the 1990s) used a unitary architecture: one big computer, accessed by users through terminals.

    • Client-server model became popular as users gained more client-side functionality, shifting some load away from servers.

    • Both models were considered monoliths when all important components ran on one server; even decoupling the database didn’t escape monolithic thinking.

  • Problems with monoliths: hard to upgrade, scale, or maintain; availability issues demand duplicated hardware and quorum resources.

  • Evolution toward distributed components and virtualization laid groundwork for later containerization and cloud-native approaches.

  • Key ideas driving modern infrastructure:

    • Move toward microservices: small, well-defined components with specific functionalities that can be independently developed, tested, and deployed.

    • Ability to run components in different languages best suited to each task, distributing load across the system.

    • Rise of kernel-based isolation features that enable containers to run as lightweight processes with strong isolation.

  • Central trends discussed in this chapter:

    • Evolution from monoliths to distributed microservice architectures

    • Developing microservice-based applications

    • How containers fit in the microservices model

    • Main concepts, features, and components of software containers

    • Virtualization vs. containers

    • Building, sharing, and running containers

    • Windows containers

    • Security improvements using software containers

  • Technical context: the book uses open source tools (some free only for non-professional use), with labs available on GitHub. Lab resources: https://github.com/PacktPublishing/Containers-for-Developers-Handbook/tree/main/Chapter1. The Code In Action video: https://packt.link/JdOIY.

Three-tier architecture

  • Three-tier software architecture decouples an application into three logical/physical layers: presentation tier, application tier, and data tier.

    • Presentation tier: user interface.

    • Application tier: backend processing of data.

    • Data tier: storage and management of data (e.g., databases).

  • Benefits of three-tier separation:

    • Facilitates distributed components across virtual servers rather than expanding a single server.

    • Different roles required for maintenance (e.g., database administrators, middleware admins, infrastructure admins).

    • Frontend vs backend focus for developers; languages tailored to layers (e.g., JavaScript for frontend).

  • Shared files and decoupling history:

    • Early use of network file sharing (NAS) and storage backends (SAN).

    • SOAP and other messaging/queue technologies helped decouple data distribution without filesystem reliance.

  • Relationship to virtualization: three-tier architecture pairs well with virtualization to distribute components across multiple virtual servers while maintaining clear boundaries.

Microservices architecture

  • Microservices take decoupling further by splitting into even smaller, standalone components with independent lifecycles.

  • Core characteristics:

    • Each microservice is a lightweight, independently deployable component that can run in its own process.

    • Freedom to choose the best programming language per component.

    • Statelessness is preferred to enable distribution and replication; application state should be abstracted from logic.

    • Fast restarts improve resilience and reduce outage windows; changes can be applied quickly.

    • Circuit breakers and fast recovery help maintain availability even when dependencies fail.
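
The circuit-breaker idea can be sketched in a few lines (a hypothetical, timer-free version; real implementations such as resilience4j also add half-open probes and cool-down periods):

```python
# Hypothetical, timer-free circuit breaker: after `threshold` consecutive
# failures the circuit "opens" and calls fail fast instead of waiting on
# a broken dependency.
class CircuitBreaker:
    def __init__(self, threshold=3):
        self.threshold = threshold
        self.failures = 0

    def call(self, fn):
        if self.failures >= self.threshold:
            raise RuntimeError("circuit open: failing fast")
        try:
            result = fn()
        except Exception:
            self.failures += 1
            raise
        self.failures = 0  # any success closes the circuit again
        return result

breaker = CircuitBreaker(threshold=2)

def flaky():
    raise ConnectionError("dependency down")

for _ in range(2):          # two real failures...
    try:
        breaker.call(flaky)
    except ConnectionError:
        pass
# ...after which further calls raise RuntimeError immediately,
# keeping the caller responsive while the dependency is down.
```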

  • Interaction patterns:

    • RESTful communication over HTTP for component interaction.

    • Standardized APIs describe methods, actions, and data provided by each microservice.
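
A RESTful microservice endpoint can be sketched with the standard library alone (the `/health` route is a common convention, not something the chapter prescribes):

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Minimal sketch of a microservice exposing a REST endpoint over HTTP.
class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/health":
            body = json.dumps({"status": "ok"}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):  # keep the demo quiet
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

url = f"http://127.0.0.1:{server.server_port}/health"
with urllib.request.urlopen(url) as r:
    reply = json.loads(r.read())
server.shutdown()
print(reply)  # {'status': 'ok'}
```

In a container deployment each such service would run as its own process, addressed by service name rather than a fixed IP.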

  • Distribution and placement:

    • Services are distributed across nodes for performance, proximity to data sources, and security concerns.

    • Nodes can be customized with features tailored to the microservices they host.

  • Impact on development:

    • Move from monoliths to distributed development; teams can own specific services.

    • Facilitates “run everywhere” goals: components should operate across cloud, on-premises, or mixed environments.

  • Consequences for developers:

    • Must design services to be decoupled, independently testable, and independently deployable.

    • Emphasizes resilience and quick recovery; health of a service should not depend on external infrastructure.

Developing distributed applications

  • Transition from monoliths to distributed architectures changes the development workflow:

    • Monoliths required significant hardware and node parity across environments; replication of environments was complex.

    • Automation and virtualization enabled faster provisioning of development nodes and alignment across environments.

  • Evolution of provisioning and lifecycle:

    • Earlier provisioning was slow (months) due to specs, budgets, and procurement; virtualization accelerated this process.

    • IaC (Infrastructure as Code) allowed programmatic provisioning of resources; platforms like OpenStack emerged to provide on-premises cloud infrastructure.

    • Cloud providers introduced APIs for automation; IaC became central to scalable, repeatable deployments.

  • Economic considerations:

    • Cloud cost management became a priority as usage grew; network bandwidth and unregulated resource consumption drove up costs.

    • Elasticity and easy provisioning were key goals of early open source cloud efforts.

  • From virtualization to containers:

    • Virtualization helped provisioning and three-tier architectures but required full OS instances per VM, consuming resources.

    • The application lifecycle expanded from few updates per year to dozens per day; developers began relying on cloud services and automation.

    • Containers emerged as the evolution of process isolation that minimizes OS overhead while preserving isolation.

What are containers?

  • Containers are processes with isolation built on top of a shared host kernel using kernel features like control groups (cgroups) and namespaces.

  • Evolution through key projects:

    • LXC (Linux Containers) introduced the concept of containers using kernel namespaces.

    • Docker Inc. popularized containers, making them easy to run and share.

  • Core idea:

    • A container is a process with its own isolated environment and resources, sharing the host kernel, and using cgroups and namespaces to provide isolation and resource control.

  • Basic concepts explained:

    • A container runs as a process on the host, with its own process hierarchy and a main process (PID 1) inside the container.

    • Processes inside a container are visible in their own namespace; outside, PIDs are separate.

    • The container dies if its main process dies or is stopped.

  • Context for modern development:

    • Containers align with microservices by running defined, isolated components with minimal OS overhead.

    • Containers avoid running a full OS per component, packaging only the libraries and binaries each process needs.

Understanding the main concepts of containers

  • Kernel process isolation essentials:

    • Kernel namespaces isolate processes along dimensions such as PIDs, network, users, IPC, mounts, and UTS (hostname and domain name).

    • Each container gets its own network namespace, users, IPC, mounts, and UTS, enabling isolated operation.

    • The container's main process is PID 1 inside its namespace; other processes inherit the hierarchy.

    • The host and container PID namespaces are separate, ensuring isolation of processes.

  • Diagrammatic intuition (PID namespace):

    • Inside the container: main process with PID 1 and its child processes.

    • Outside the container: host PID 1 with its own process tree.

  • Control groups (cgroups):

    • Linux kernel feature to limit, isolate, and account for resources (CPU, memory, disk I/O) used by processes.

    • Enables resource limits, prioritization, accounting, and control of process groups.

    • Prevents container-induced resource exhaustion from bringing down the host.

    • Useful even outside containers; available in two versions (cgroups v1 and v2).
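
On a Linux host you can read the limits cgroups enforce directly from the cgroup filesystem; the paths below assume a standard mount at /sys/fs/cgroup, so treat this as a sketch:

```python
from pathlib import Path

# Read the memory limit the kernel enforces for the current cgroup.
# cgroup v2 exposes it as memory.max ("max" means unlimited);
# v1 used memory/memory.limit_in_bytes.
def memory_limit():
    for p in ("/sys/fs/cgroup/memory.max",
              "/sys/fs/cgroup/memory/memory.limit_in_bytes"):
        f = Path(p)
        if f.is_file():
            return f.read_text().strip()
    return None  # cgroup filesystem not mounted where expected

print(memory_limit())
```

A runtime flag such as `docker run --memory 256m` ultimately writes a value into exactly this kind of file for the container's cgroup.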

  • Container runtime concept:

    • A container runtime (engine) runs containers on a host: downloading images, monitoring resource usage, and managing isolation layers.

    • Runtimes can be low-level (create and run containers following the OCI runtime specification) or high-level (manage images and expose APIs such as the Kubernetes CRI for orchestration).

    • Examples: runC (low-level, OCI standard), crun (faster, smaller footprint); Docker, CRI-O, Windows/Hyper-V containers (high-level).

    • Kubernetes integrates with runtimes through the CRI; runtimes like containerd provide a middle ground, managing images and delegating execution to a low-level runtime such as runC.

    • Advanced isolation approaches include sandboxes (gVisor) and virtualized runtimes (Kata Containers).

  • Docker and container ecosystem:

    • Docker popularized containers; Docker Engine historically served as Kubernetes’ runtime via the dockershim, which was deprecated in Kubernetes 1.20 and removed in 1.24 in favor of CRI runtimes such as containerd.

    • Docker Desktop and Rancher Desktop provide a client-server model with a local runtime; dockerd or containerd serves as the runtime.

    • Rootless modes (rootless Docker, Podman) enable non-privileged container execution for security.

  • Important note on scope:

    • Chapter primarily discusses containers in Linux; Windows containers are discussed briefly with Windows-specific considerations.

Kernel features that enable containers

  • Kernel namespaces provide isolation across several dimensions:

    • Processes namespace: unique PID space; inside container, PID 1 is the main process.

    • Network namespace: containers get their own network stack and IPs; host bridges connect container networks.

    • Users namespace: container users are mapped to host user IDs, providing isolation.

    • IPC namespace: separate shared memory, semaphores, and message queues.

    • Mounts namespace: containers mount their own root filesystem and can attach host/local or remote mounts.

    • UTS namespace: each container gets its own hostname (and NIS domain name), independent of the host.

  • The combination of namespaces and cgroups provides robust containment for container processes.
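
On Linux these namespaces are directly observable: every process lists its memberships under /proc (a quick sketch, assuming a Linux host):

```python
import os

# Each entry in /proc/<pid>/ns is a symlink naming a namespace type and
# its inode; two processes are in the same namespace exactly when the
# corresponding symlinks resolve to the same inode.
ns_types = sorted(os.listdir("/proc/self/ns"))
print(ns_types)  # e.g. ['cgroup', 'ipc', 'mnt', 'net', 'pid', ..., 'user', 'uts']
```

A container runtime creates fresh entries for most of these types when it starts a container's main process.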

Container images and the layering model

  • Container images are templates that include files (binaries, libraries, configurations) required by processes.

  • Immutability: base layers are read-only; modifications are stored in a read-write container layer on top.

  • Layering and sharing:

    • Images are built from multiple layers; layers are shared across images to reduce duplication.

    • If two images share a common base, that base layer is shared to save disk space.

  • Packaging and distribution:

    • Layers are packaged into .tar files for sharing; images contain the layers plus metadata describing usage (ENTRYPOINT, EXPOSE, USER, etc.).

  • Dockerfiles: recipes used to reproduce image builds; workflow to build artifacts consistently across environments.

  • Registries: centralized stores for image layers and metadata to enable sharing and distribution.

  • Overlay filesystems: underlying mechanism enabling layers to be merged; upper layers override lower layers.

  • Image composition implications:

    • Multiple containers can run from the same image; container layer (read-write) holds ephemeral changes that disappear when the container is removed.

    • Persistent data should be stored outside the container’s writable layer to avoid data loss on container removal.

  • Practical guidance:

    • Avoid placing frequently changing data (logs, runtime state) in the container write layer.

    • Use external persistence mechanisms and proper data management strategies (learn more in Chapter 10).
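
A minimal Dockerfile illustrates the layering described above (the base image tag and file names here are hypothetical):

```dockerfile
# Base layer; shared on disk by every image built FROM the same tag.
FROM alpine:3.19
# Adds one read-only layer containing curl and its dependencies.
RUN apk add --no-cache curl
# Adds one read-only layer containing just the copied file.
COPY app.sh /usr/local/bin/app.sh
# Metadata-only instructions: recorded in the image config, no filesystem layer.
USER nobody
EXPOSE 8080
ENTRYPOINT ["/usr/local/bin/app.sh"]
```

Every container started from this image adds only its own thin read-write layer on top of these shared read-only layers.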

Overlay filesystems

  • Overlay filesystem concept:

    • A union mount that presents multiple directories as a single coherent filesystem.

    • Upper (writable) layer overrides content from lower layers to reflect changes.

  • Benefits for containers:

    • Enables fast, space-efficient sharing of common base layers while allowing container-specific changes.

    • Supports efficient layering and reuse of image content across containers.
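
The override rule can be modeled in a few lines of Python (a toy model of the union view, not the kernel implementation):

```python
# Each layer is a dict of path -> content; lower layers come first.
def overlay_view(layers):
    merged = {}
    for layer in layers:       # apply lower layers first...
        merged.update(layer)   # ...so later (upper) layers override them
    return merged

base = {"/bin/sh": "busybox v1", "/etc/motd": "welcome"}
upper = {"/etc/motd": "patched", "/app/run.sh": "#!/bin/sh"}
view = overlay_view([base, upper])
print(view["/etc/motd"])   # "patched": the upper layer wins
print(view["/bin/sh"])     # "busybox v1": still visible from the base
```

In a real overlay mount the kernel performs this resolution per file at lookup time, which is what lets many containers share one set of read-only lower layers.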

Dynamism in container-based applications

  • Networking dynamics:

    • Each container typically gets its own network namespace; each new container receives a new IP address from IPAM (IP Address Management).

    • IPs are dynamic; avoid hard-coding container IP addresses in configurations.

    • Service discovery is achieved via DNS or service names rather than fixed IPs.
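
The dynamic addressing described above can be sketched with a toy IPAM allocator (very simplified; real IPAM drivers also track releases and reservations):

```python
import ipaddress

class Ipam:
    """Hand out the next free host address from a subnet, roughly the way
    a container runtime's IPAM driver might (toy sketch)."""
    def __init__(self, cidr):
        self._hosts = ipaddress.ip_network(cidr).hosts()

    def allocate(self):
        return next(self._hosts)

ipam = Ipam("172.17.0.0/16")   # Docker's default bridge subnet
gateway = ipam.allocate()      # 172.17.0.1, usually the bridge itself
first = ipam.allocate()        # 172.17.0.2: first container
second = ipam.allocate()       # a recreated container gets a fresh address
print(gateway, first, second)
```

Because addresses rotate like this, configurations should reference service names resolved by DNS, never the IPs themselves.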

  • Addressing and publishing:

    • Service names and DNS allow stable addressing in dynamic environments.

    • NAT and port mappings handle external access to containerized services.

  • Persistent data considerations:

    • Avoid relying on container write layers for persistent data; plan for persistent storage outside container layers.

  • Tooling context:

    • Tools for creating, running, and sharing containers include Docker, Kubernetes, and other orchestration technologies discussed later.

Tools for managing containers

  • Container runtimes:

    • Run containers on a host; manage their lifecycle, images, and metrics.

  • Docker as a comprehensive toolset:

    • Docker provides a client-server model; Docker Engine (daemon) and Docker CLI used together.

    • Docker Desktop integrates a container runtime and a minimal Kubernetes, suitable for development.

  • Alternatives and ecosystem:

    • Podman: rootless, daemon-less container engine; emphasizes security by running without a background daemon.

    • Nerdctl: CLI for containerd-based runtimes; used in conjunction with containerd.

  • Desktop environments:

    • Docker Desktop and Rancher Desktop enable container runtimes on desktops with a managed UI and CLI.

  • RBAC and security implications:

    • Runtimes themselves do not provide RBAC; orchestration platforms like Kubernetes provide RBAC for access control.

    • Desktop environments can operate with embedded orchestration (e.g., Kubernetes in Docker Desktop).

Windows containers

  • Windows container models:

    • Hyper-V Linux Containers (old model): runs Linux inside a Hyper-V VM on Windows.

    • Windows Server Containers (Windows Process Containers): native Windows containers on Windows hosts.

  • Management parity:

    • Management and execution are similar across models; isolation differs (process-level vs VM-level).

  • Image size considerations:

    • Windows images tend to be larger due to DLLs and Windows-specific resources.

  • Adoption and ecosystem:

    • Windows Server containers were popular during early Windows container days but declined as Linux container support matured.

    • Kubernetes now supports Windows Server hosts as workers, enabling Windows containers in orchestrated clusters; still less common than Linux-focused deployments.

Security improvements using software containers

  • Defense-in-depth approach:

    • Run containers on hosts with minimal attack surfaces; dedicated hosts for container workloads improve security.

    • Consider minimal OS images (CoreOS, RancherOS, PhotonOS, Talos, Flatcar) or build minimal custom OS images (LinuxKit).

  • Runtime security practices:

    • Restrict access to container runtimes; sockets should be protected, and TLS should be used for remote access.

    • RBAC is not provided by all runtimes; add RBAC via orchestration tools where appropriate (Kubernetes provides RBAC).

  • Host and container security integration:

    • Security modules like SELinux or AppArmor help constrain container access; non-default labels may be required for certain operations.

  • Practical security notes for developers:

    • Containers typically run as root inside the container; ensure appropriate security measures to limit host access.

    • Prepare security configurations for special requirements, such as host log access or kernel module loading.

Comparing virtualization and containers

  • Visual comparison: virtual guest nodes vs. containers on a host:

    • Virtualization: physical host runs a hypervisor, with each guest VM running its own OS and kernel; this is heavier in resources and slower to scale.

    • Containers: share host kernel; run as processes with isolated namespaces and cgroups; lighter and faster to start.

  • Pros/cons in microservice contexts:

    • Virtual machines provide strong isolation but higher overhead and slower scaling; containers offer elasticity and speed ideal for microservices.

    • Containers decouple from the underlying OS; VMs are tied to their own guest OS instances.

  • Elasticity and scale concerns:

    • Containers better fit elastic environments where components scale up/down on demand.

  • Modern orchestration needs:

    • Orchestrators manage resources, persistence, networking, and communication for containers across clusters.

Building, sharing, and running containers

  • Build-ship-run paradigm (Docker’s slogan):

    • Build container images, share via registries, run containers consistently across environments.

    • CI/CD pipelines build development images, test integrations, and push release images through registries.

  • Reproducibility and workflow:

    • Dockerfiles drive reproducible builds; the same recipe used locally and in CI/CD pipelines.

    • Layered images facilitate sharing and reduce transfer size since only new layers are downloaded when updating.

  • Artifact consistency and environments:

    • Images ensure consistent artifacts across development, testing, staging, and production.

  • Tooling and integration:

    • Docker CLI remains a common interface; orchestration platforms (Kubernetes, Swarm) manage multi-container deployments at scale.
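
The claim that only new layers travel over the network can be sketched with content addressing (a toy model, not the real registry protocol):

```python
import hashlib

def digest(layer_bytes):
    # Layers are identified by the SHA-256 of their content.
    return "sha256:" + hashlib.sha256(layer_bytes).hexdigest()

# Layers already present locally (e.g., from a previous pull).
local = {digest(b"base os layer"), digest(b"runtime layer v1")}

# Manifest of the updated image: base unchanged, app layer rebuilt.
manifest = [digest(b"base os layer"),
            digest(b"runtime layer v1"),
            digest(b"app layer v2")]

to_pull = [d for d in manifest if d not in local]
print(len(to_pull))  # 1: only the rebuilt app layer is downloaded
```

This is why rebuilding an image with an unchanged base costs only the transfer of the layers that actually changed.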

Explaining Windows containers (summary)

  • Windows container models and coexistence with Linux containers:

    • Windows Server containers vs Hyper-V Linux containers serve different isolation and compatibility needs.

    • Docker tooling supports both, with client-server models and remote runtimes.

  • Kubernetes support evolution:

    • Initial Windows container support in Kubernetes lagged behind Linux; now Windows Server containers can run on Kubernetes worker nodes, though Linux prevalence remains higher.

  • Practical takeaway:

    • Windows containers are useful for workloads tightly integrated with Windows ecosystems (e.g., .NET on Windows), but Linux containers dominate most container-native workflows.

Improving security using software containers (practical security practices)

  • Security-by-default posture:

    • Run on minimal hosts with limited services and reduced attack surfaces.

    • Use TLS to secure socket connections to container runtimes.

  • Access control and RBAC:

    • Runtime environments may lack RBAC; rely on orchestration platforms like Kubernetes for granular access control.

  • Runtime security configurations:

    • Runtimes apply a default Seccomp profile to constrain system calls inside containers.

    • Additional capabilities can be requested via container configurations; avoid privileged containers unless absolutely necessary.

  • Host-level security controls:

    • Use Linux Security Modules (LSMs) such as SELinux or AppArmor to restrict container access to host resources.

    • Ensure that container runtimes are integrated with LSMs and that containers adhere to restricted SELinux labels when required.

  • Operational guidance:

    • Keep container hosts secure and minimal; isolate workloads to separate hosts when possible.

    • Plan for specialized security configurations for persistent storage or host log access.

Labs, Windows, and practical installation notes (Chapter-specific labs and demos)

  • Labs overview:

    • Chapter 1 includes hands-on labs to install and run a development container environment (Docker Desktop with a minimal Kubernetes setup).

    • Labs focus on quick, practical setup to validate container runtimes and basic workflows.

  • Docker Desktop lab setup notes:

    • Supports Windows 10 (and other platforms) with Hyper-V or WSL 2 backends; WSL 2 is recommended for better performance.

    • Requires enabling WSL2 on Windows and installing a Linux distribution (Ubuntu via Microsoft Store is common).

    • After installation, you verify the runtime by running docker commands, e.g., docker info and docker run -ti alpine to inspect a container and its filesystem.

  • Windows-specific notes in labs:

    • Windows containers can be managed via the same client-server model; Windows Server containers and Hyper-V Linux containers offer different isolation strategies.

    • Kubernetes support for Windows clusters exists but is more commonly used with Linux-based workloads.

  • Practical steps highlighted in the labs:

    • Install Docker Desktop, enable WSL2 integration, and verify the Docker client-server integration.

    • Run a small container (e.g., Alpine) to inspect process trees and root filesystem.

    • Inspect images and containers via the Docker Desktop UI to understand layering and lifecycle.

Summary (chapter takeaway)

  • Containers provide resilience, high availability, scalability, and portability by isolating processes with kernel features rather than virtualizing hardware.

  • Containers fit naturally into modern microservices architectures by enabling independent, language-agnostic components with fast startup and scalable lifecycles.

  • Core concepts introduced include:

    • Kernel namespaces and cgroups for isolation and resource control; container main processes run as PID 1
      inside their own namespace.

    • Image layering, immutability, and overlay filesystems that enable efficient sharing and reproducibility.

    • Container runtimes (low-level and high-level) and their role in executing and orchestrating containers.

    • Container images, registries, and Dockerfiles as reproducible templates for environments.

    • Container orchestration (Kubernetes, Docker Swarm) to manage containers across clusters with scheduling, persistence, and networking.

    • Windows containers and their evolution, with a focus on how Linux containment concepts transfer to Windows.

    • Security by design: minimal hosts, Seccomp profiles, RBAC integration via orchestration, and LSMs like SELinux/AppArmor.

  • Roadmap implied by the chapter:

    • Next: deep dive into building container images (Chapter 2), followed by running containers and orchestration topics in later chapters.

Key figures and terms to remember

  • IPAM: IP Address Management; dynamically assigns IP addresses to containers.

  • REST: Representational State Transfer; common API design for microservice communication.

  • NAT: Network Address Translation; used for exposing services to external clients.

  • Overlay filesystem: union filesystem used to compose image layers.

  • cgroups: Linux kernel feature for resource control.

  • namespaces: Linux kernel feature for process isolation (PID, network, users, IPC, mounts, UTS).

  • Seccomp: Linux security facility for restricting system calls inside containers.

  • RBAC: Role-Based Access Control (provided by orchestration platforms).

  • Dockerfile: recipe to reproducibly build container images.

  • Kubernetes, Swarm: container orchestration platforms; Kubernetes is the most widely used in enterprise contexts.

  • OpenShift, Rancher, Tanzu: examples of enterprise Kubernetes platforms and distributions.

  • OpenStack: early open-source cloud infrastructure platform enabling on-premises cloud-like capabilities.