Chapter 1 Notes: Modern Infrastructure and Applications with Docker

Evolution of application architecture

  • Software architectures continually evolve as hardware and software capabilities improve; tech gaps drive new designs.

    • Example: faster network speeds enabled distributing components across servers and even across data centers in multiple countries.

  • Early computing models to frame modern shifts:

    • Baseline: the mainframe era (before the 1990s) used a unitary architecture: one big computer, accessed by users through terminals.

    • Client-server model became popular as users gained more client-side functionality, shifting some load away from servers.

    • Both models were considered monoliths when all important components ran on one server; even decoupling the database didn’t escape monolithic thinking.

  • Problems with monoliths: hard to upgrade, scale, or maintain; availability issues demand duplicated hardware and quorum resources.

  • Evolution toward distributed components and virtualization laid groundwork for later containerization and cloud-native approaches.

  • Key ideas driving modern infrastructure:

    • Move toward microservices: small, well-defined components with specific functionalities that can be independently developed, tested, and deployed.

    • Ability to run components in different languages best suited to each task, distributing load across the system.

    • Rise of kernel-based isolation features that enable containers to run as lightweight processes with strong isolation.

  • Central trends discussed in this chapter:

    • Evolution from monoliths to distributed microservice architectures

    • Developing microservice-based applications

    • How containers fit in the microservices model

    • Main concepts, features, and components of software containers

    • Virtualization vs. containers

    • Building, sharing, and running containers

    • Windows containers

    • Security improvements using software containers

  • Technical context: the book uses open source tools (some free only for non-professional use), with labs available on GitHub. Lab resources: https://github.com/PacktPublishing/Containers-for-Developers-Handbook/tree/main/Chapter1. The Code In Action video: https://packt.link/JdOIY.

Three-tier architecture

  • Three-tier software architecture decouples an application into three logical/physical layers: presentation tier, application tier, and data tier.

    • Presentation tier: user interface.

    • Application tier: backend processing of data.

    • Data tier: storage and management of data (e.g., databases).

  • Benefits of three-tier separation:

    • Facilitates distributed components across virtual servers rather than expanding a single server.

    • Different roles required for maintenance (e.g., database administrators, middleware admins, infrastructure admins).

    • Frontend vs backend focus for developers; languages tailored to layers (e.g., JavaScript for frontend).

  • Shared files and decoupling history:

    • Early use of network file sharing (NAS) and storage backends (SAN).

    • SOAP and other messaging/queue technologies helped decouple data distribution without filesystem reliance.

  • Relationship to virtualization: three-tier architecture pairs well with virtualization to distribute components across multiple virtual servers while maintaining clear boundaries.

Microservices architecture

  • Microservices take decoupling further by splitting into even smaller, standalone components with independent lifecycles.

  • Core characteristics:

    • Each microservice is a lightweight, independently deployable component that can run in its own process.

    • Freedom to choose the best programming language per component.

    • Statelessness is preferred to enable distribution and replication; application state should be abstracted from logic.

    • Fast restarts improve resilience and reduce outage windows; changes can be applied quickly.

    • Circuit breakers and fast recovery help maintain availability even when dependencies fail.
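
The circuit-breaker idea can be sketched in a few lines (a hypothetical, timer-free version; real implementations such as resilience4j also add half-open probes and cool-down periods):

```python
# Hypothetical, timer-free circuit breaker: after `threshold` consecutive
# failures the circuit "opens" and calls fail fast instead of waiting on
# a broken dependency.
class CircuitBreaker:
    def __init__(self, threshold=3):
        self.threshold = threshold
        self.failures = 0

    def call(self, fn):
        if self.failures >= self.threshold:
            raise RuntimeError("circuit open: failing fast")
        try:
            result = fn()
        except Exception:
            self.failures += 1
            raise
        self.failures = 0  # any success closes the circuit again
        return result

breaker = CircuitBreaker(threshold=2)

def flaky():
    raise ConnectionError("dependency down")

for _ in range(2):          # two real failures...
    try:
        breaker.call(flaky)
    except ConnectionError:
        pass
# ...after which further calls raise RuntimeError immediately,
# keeping the caller responsive while the dependency is down.
```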

  • Interaction patterns:

    • RESTful communication over HTTP for component interaction.

    • Standardized APIs describe methods, actions, and data provided by each microservice.
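
A RESTful microservice endpoint can be sketched with the standard library alone (the `/health` route is a common convention, not something the chapter prescribes):

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Minimal sketch of a microservice exposing a REST endpoint over HTTP.
class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/health":
            body = json.dumps({"status": "ok"}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):  # keep the demo quiet
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

url = f"http://127.0.0.1:{server.server_port}/health"
with urllib.request.urlopen(url) as r:
    reply = json.loads(r.read())
server.shutdown()
print(reply)  # {'status': 'ok'}
```

In a container deployment each such service would run as its own process, addressed by service name rather than a fixed IP.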

  • Distribution and placement:

    • Services are distributed across nodes for performance, proximity to data sources, and security concerns.

    • Nodes can be customized with features tailored to the microservices they host.

  • Impact on development:

    • Move from monoliths to distributed development; teams can own specific services.

    • Facilitates “run everywhere” goals: components should operate across cloud, on-premises, or mixed environments.

  • Consequences for developers:

    • Must design services to be decoupled, independently testable, and independently deployable.

    • Emphasizes resilience and quick recovery; health of a service should not depend on external infrastructure.

Developing distributed applications

  • Transition from monoliths to distributed architectures changes the development workflow:

    • Monoliths required significant hardware and node parity across environments; replication of environments was complex.

    • Automation and virtualization enabled faster provisioning of development nodes and alignment across environments.

  • Evolution of provisioning and lifecycle:

    • Earlier provisioning was slow (months) due to specs, budgets, and procurement; virtualization accelerated this process.

    • IaC (Infrastructure as Code) allowed programmatic provisioning of resources; platforms like OpenStack emerged to provide on-premises cloud infrastructure.

    • Cloud providers introduced APIs for automation; IaC became central to scalable, repeatable deployments.

  • Economic considerations:

    • Cloud cost management became a priority as usage grew; network bandwidth and unregulated resource consumption drove up costs.

    • Elasticity and easy provisioning were key goals of early open source cloud efforts.

  • From virtualization to containers:

    • Virtualization helped provisioning and three-tier architectures but required full OS instances per VM, consuming resources.

    • The application lifecycle expanded from few updates per year to dozens per day; developers began relying on cloud services and automation.

    • Containers emerged as the evolution of process isolation that minimizes OS overhead while preserving isolation.

What are containers?

  • Containers are processes with isolation built on top of a shared host kernel using kernel features like control groups (cgroups) and namespaces.

  • Evolution through key projects:

    • LXC (Linux Containers) introduced the concept of containers using kernel namespaces.

    • Docker Inc. popularized containers, making them easy to run and share.

  • Core idea:

    • A container is a process with its own isolated environment and resources, sharing the host kernel, and using cgroups and namespaces to provide isolation and resource control.

  • Basic concepts explained:

    • A container runs as a process on the host, with its own process hierarchy and a main process (PID 1) inside the container.

    • Processes inside a container are visible in their own namespace; outside, PIDs are separate.

    • The container dies if its main process dies or is stopped.

  • Context for modern development:

    • Containers align with microservices by running defined, isolated components with minimal OS overhead.

    • Containers avoid running a full OS per component, packaging only the libraries and binaries each process needs.

Understanding the main concepts of containers

  • Kernel process isolation essentials:

    • Kernel namespaces isolate processes along dimensions such as PIDs, network, users, IPC, mounts, and UTS (hostname and domain name).

    • Each container gets its own network namespace, users, IPC, mounts, and UTS, enabling isolated operation.

    • The container's main process is PID 1 inside its namespace; other processes inherit the hierarchy.

    • The host and container PID namespaces are separate, ensuring isolation of processes.

  • Diagrammatic intuition (PID namespace):

    • Inside the container: main process with PID 1 and its child processes.

    • Outside the container: host PID 1 with its own process tree.

  • Control groups (cgroups):

    • Linux kernel feature to limit, isolate, and account for resources (CPU, memory, disk I/O) used by processes.

    • Enables resource limits, prioritization, accounting, and control of process groups.

    • Prevents container-induced resource exhaustion from bringing down the host.

    • Useful even outside containers; available in two versions (cgroups v1 and v2).
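
On a Linux host you can read the limits cgroups enforce directly from the cgroup filesystem; the paths below assume a standard mount at /sys/fs/cgroup, so treat this as a sketch:

```python
from pathlib import Path

# Read the memory limit the kernel enforces for the current cgroup.
# cgroup v2 exposes it as memory.max ("max" means unlimited);
# v1 used memory/memory.limit_in_bytes.
def memory_limit():
    for p in ("/sys/fs/cgroup/memory.max",
              "/sys/fs/cgroup/memory/memory.limit_in_bytes"):
        f = Path(p)
        if f.is_file():
            return f.read_text().strip()
    return None  # cgroup filesystem not mounted where expected

print(memory_limit())
```

A runtime flag such as `docker run --memory 256m` ultimately writes a value into exactly this kind of file for the container's cgroup.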

  • Container runtime concept:

    • A container runtime (engine) runs containers on a host: downloading images, monitoring resource usage, and managing isolation layers.

    • Runtimes can be low-level (create and run containers following the OCI runtime specification) or high-level (manage images and expose APIs such as the Kubernetes CRI for orchestration).

    • Examples: runC (low-level, OCI standard), crun (faster, smaller footprint); Docker, CRI-O, Windows/Hyper-V containers (high-level).

    • Kubernetes integrates with runtimes through the CRI; runtimes like containerd provide a middle ground, managing images and delegating execution to a low-level runtime such as runC.

    • Advanced isolation approaches include sandboxes (gVisor) and virtualized runtimes (Kata Containers).

  • Docker and container ecosystem:

    • Docker popularized containers; Docker Engine historically served as Kubernetes’ runtime via the dockershim, which was deprecated in Kubernetes 1.20 and removed in 1.24 in favor of CRI runtimes such as containerd.

    • Docker Desktop and Rancher Desktop provide a client-server model with a local runtime; dockerd or containerd serves as the runtime.

    • Rootless modes (rootless Docker, Podman) enable non-privileged container execution for security.

  • Important note on scope:

    • Chapter primarily discusses containers in Linux; Windows containers are discussed briefly with Windows-specific considerations.

Kernel features that enable containers

  • Kernel namespaces provide isolation across several dimensions:

    • Processes namespace: unique PID space; inside container, PID 1 is the main process.

    • Network namespace: containers get their own network stack and IPs; host bridges connect container networks.

    • Users namespace: container users are mapped to host user IDs, providing isolation.

    • IPC namespace: separate shared memory, semaphores, and message queues.

    • Mounts namespace: containers mount their own root filesystem and can attach host/local or remote mounts.

    • UTS namespace: each container gets its own hostname (and NIS domain name), independent of the host.

  • The combination of namespaces and cgroups provides robust containment for container processes.
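
On Linux these namespaces are directly observable: every process lists its memberships under /proc (a quick sketch, assuming a Linux host):

```python
import os

# Each entry in /proc/<pid>/ns is a symlink naming a namespace type and
# its inode; two processes are in the same namespace exactly when the
# corresponding symlinks resolve to the same inode.
ns_types = sorted(os.listdir("/proc/self/ns"))
print(ns_types)  # e.g. ['cgroup', 'ipc', 'mnt', 'net', 'pid', ..., 'user', 'uts']
```

A container runtime creates fresh entries for most of these types when it starts a container's main process.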

Container images and the layering model

  • Container images are templates that include files (binaries, libraries, configurations) required by processes.

  • Immutability: base layers are read-only; modifications are stored in a read-write container layer on top.

  • Layering and sharing:

    • Images are built from multiple layers; layers are shared across images to reduce duplication.

    • If two images share a common base, that base layer is shared to save disk space.

  • Packaging and distribution:

    • Layers are packaged into .tar files for sharing; images contain the layers plus metadata describing usage (ENTRYPOINT, EXPOSE, USER, etc.).

  • Dockerfiles: recipes used to reproduce image builds; workflow to build artifacts consistently across environments.

  • Registries: centralized stores for image layers and metadata to enable sharing and distribution.

  • Overlay filesystems: underlying mechanism enabling layers to be merged; upper layers override lower layers.

  • Image composition implications:

    • Multiple containers can run from the same image; container layer (read-write) holds ephemeral changes that disappear when the container is removed.

    • Persistent data should be stored outside the container’s writable layer to avoid data loss on container removal.

  • Practical guidance:

    • Avoid placing frequently changing data (logs, runtime state) in the container write layer.

    • Use external persistence mechanisms and proper data management strategies (learn more in Chapter 10).
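
A minimal Dockerfile illustrates the layering described above (the base image tag and file names here are hypothetical):

```dockerfile
# Base layer; shared on disk by every image built FROM the same tag.
FROM alpine:3.19
# Adds one read-only layer containing curl and its dependencies.
RUN apk add --no-cache curl
# Adds one read-only layer containing just the copied file.
COPY app.sh /usr/local/bin/app.sh
# Metadata-only instructions: recorded in the image config, no filesystem layer.
USER nobody
EXPOSE 8080
ENTRYPOINT ["/usr/local/bin/app.sh"]
```

Every container started from this image adds only its own thin read-write layer on top of these shared read-only layers.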

Overlay filesystems

  • Overlay filesystem concept:

    • A union mount that presents multiple directories as a single coherent filesystem.

    • Upper (writable) layer overrides content from lower layers to reflect changes.

  • Benefits for containers:

    • Enables fast, space-efficient sharing of common base layers while allowing container-specific changes.

    • Supports efficient layering and reuse of image content across containers.
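
The override rule can be modeled in a few lines of Python (a toy model of the union view, not the kernel implementation):

```python
# Each layer is a dict of path -> content; lower layers come first.
def overlay_view(layers):
    merged = {}
    for layer in layers:       # apply lower layers first...
        merged.update(layer)   # ...so later (upper) layers override them
    return merged

base = {"/bin/sh": "busybox v1", "/etc/motd": "welcome"}
upper = {"/etc/motd": "patched", "/app/run.sh": "#!/bin/sh"}
view = overlay_view([base, upper])
print(view["/etc/motd"])   # "patched": the upper layer wins
print(view["/bin/sh"])     # "busybox v1": still visible from the base
```

In a real overlay mount the kernel performs this resolution per file at lookup time, which is what lets many containers share one set of read-only lower layers.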

Dynamism in container-based applications

  • Networking dynamics:

    • Each container typically gets its own network namespace; each new container receives a new IP address from IPAM (IP Address Management).

    • IPs are dynamic; avoid hard-coding container IP addresses in configurations.

    • Service discovery is achieved via DNS or service names rather than fixed IPs.
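
The dynamic addressing described above can be sketched with a toy IPAM allocator (very simplified; real IPAM drivers also track releases and reservations):

```python
import ipaddress

class Ipam:
    """Hand out the next free host address from a subnet, roughly the way
    a container runtime's IPAM driver might (toy sketch)."""
    def __init__(self, cidr):
        self._hosts = ipaddress.ip_network(cidr).hosts()

    def allocate(self):
        return next(self._hosts)

ipam = Ipam("172.17.0.0/16")   # Docker's default bridge subnet
gateway = ipam.allocate()      # 172.17.0.1, usually the bridge itself
first = ipam.allocate()        # 172.17.0.2: first container
second = ipam.allocate()       # a recreated container gets a fresh address
print(gateway, first, second)
```

Because addresses rotate like this, configurations should reference service names resolved by DNS, never the IPs themselves.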

  • Addressing and publishing:

    • Service names and DNS allow stable addressing in dynamic environments.

    • NAT and port mappings handle external access to containerized services.

  • Persistent data considerations:

    • Avoid relying on container write layers for persistent data; plan for persistent storage outside container layers.

  • Tooling context:

    • Tools for creating, running, and sharing containers include Docker, Kubernetes, and other orchestration technologies discussed later.

Tools for managing containers

  • Container runtimes:

    • Run containers on a host; manage their lifecycle, images, and metrics.

  • Docker as a comprehensive toolset:

    • Docker provides a client-server model; Docker Engine (daemon) and Docker CLI used together.

    • Docker Desktop integrates a container runtime and a minimal Kubernetes, suitable for development.

  • Alternatives and ecosystem:

    • Podman: rootless, daemon-less container engine; emphasizes security by running without a background daemon.

    • Nerdctl: CLI for containerd-based runtimes; used in conjunction with containerd.

  • Desktop environments:

    • Docker Desktop and Rancher Desktop enable container runtimes on desktops with a managed UI and CLI.

  • RBAC and security implications:

    • Runtimes themselves do not provide RBAC; orchestration platforms like Kubernetes provide RBAC for access control.

    • Desktop environments can operate with embedded orchestration (e.g., Kubernetes in Docker Desktop).

Windows containers

  • Windows container models:

    • Hyper-V Linux Containers (old model): runs Linux inside a Hyper-V VM on Windows.

    • Windows Server Containers (Windows Process Containers): native Windows containers on Windows hosts.

  • Management parity:

    • Management and execution are similar across models; isolation differs (process-level vs VM-level).

  • Image size considerations:

    • Windows images tend to be larger due to DLLs and Windows-specific resources.

  • Adoption and ecosystem:

    • Windows Server containers were popular during early Windows container days but declined as Linux container support matured.

    • Kubernetes now supports Windows Server hosts as workers, enabling Windows containers in orchestrated clusters; still less common than Linux-focused deployments.

Security improvements using software containers

  • Defense-in-depth approach:

    • Run containers on hosts with minimal attack surfaces; dedicated hosts for container workloads improve security.

    • Consider minimal OS images (CoreOS, RancherOS, PhotonOS, Talos, Flatcar) or build minimal custom OS images (LinuxKit).

  • Runtime security practices:

    • Restrict access to container runtimes; sockets should be protected, and TLS should be used for remote access.

    • RBAC is not provided by all runtimes; add RBAC via orchestration tools where appropriate (Kubernetes provides RBAC).

  • Host and container security integration:

    • Security modules like SELinux or AppArmor help constrain container access; non-default labels may be required for certain operations.

  • Practical security notes for developers:

    • Containers typically run as root inside the container; ensure appropriate security measures to limit host access.

    • Prepare security configurations for special requirements, such as host log access or kernel module loading.

Comparing virtualization and containers

  • Visual comparison: virtual guest nodes vs. containers on a host:

    • Virtualization: physical host runs a hypervisor, with each guest VM running its own OS and kernel; this is heavier in resources and slower to scale.

    • Containers: share host kernel; run as processes with isolated namespaces and cgroups; lighter and faster to start.

  • Pros/cons in microservice contexts:

    • Virtual machines provide strong isolation but higher overhead and slower scaling; containers offer elasticity and speed ideal for microservices.

    • Containers decouple from the underlying OS; VMs are tied to their own guest OS instances.

  • Elasticity and scale concerns:

    • Containers better fit elastic environments where components scale up/down on demand.

  • Modern orchestration needs:

    • Orchestrators manage resources, persistence, networking, and communication for containers across clusters.

Building, sharing, and running containers

  • Build-ship-run paradigm (Docker’s slogan):

    • Build container images, share via registries, run containers consistently across environments.

    • CI/CD pipelines build development images, test integrations, and push release images through registries.

  • Reproducibility and workflow:

    • Dockerfiles drive reproducible builds; the same recipe used locally and in CI/CD pipelines.

    • Layered images facilitate sharing and reduce transfer size since only new layers are downloaded when updating.

  • Artifact consistency and environments:

    • Images ensure consistent artifacts across development, testing, staging, and production.

  • Tooling and integration:

    • Docker CLI remains a common interface; orchestration platforms (Kubernetes, Swarm) manage multi-container deployments at scale.
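
The claim that only new layers travel over the network can be sketched with content addressing (a toy model, not the real registry protocol):

```python
import hashlib

def digest(layer_bytes):
    # Layers are identified by the SHA-256 of their content.
    return "sha256:" + hashlib.sha256(layer_bytes).hexdigest()

# Layers already present locally (e.g., from a previous pull).
local = {digest(b"base os layer"), digest(b"runtime layer v1")}

# Manifest of the updated image: base unchanged, app layer rebuilt.
manifest = [digest(b"base os layer"),
            digest(b"runtime layer v1"),
            digest(b"app layer v2")]

to_pull = [d for d in manifest if d not in local]
print(len(to_pull))  # 1: only the rebuilt app layer is downloaded
```

This is why rebuilding an image with an unchanged base costs only the transfer of the layers that actually changed.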

Explaining Windows containers (summary)

  • Windows container models and coexistence with Linux containers:

    • Windows Server containers vs Hyper-V Linux containers serve different isolation and compatibility needs.

    • Docker tooling supports both, with client-server models and remote runtimes.

  • Kubernetes support evolution:

    • Initial Windows container support in Kubernetes lagged behind Linux; now Windows Server containers can run on Kubernetes worker nodes, though Linux prevalence remains higher.

  • Practical takeaway:

    • Windows containers are useful for workloads tightly integrated with Windows ecosystems (e.g., .NET on Windows), but Linux containers dominate most container-native workflows.

Improving security using software containers (practical security practices)

  • Security-by-default posture:

    • Run on minimal hosts with limited services and reduced attack surfaces.

    • Use TLS to secure socket connections to container runtimes.

  • Access control and RBAC:

    • Runtime environments may lack RBAC; rely on orchestration platforms like Kubernetes for granular access control.

  • Runtime security configurations:

    • Runtimes apply a default Seccomp profile to constrain system calls inside containers.

    • Additional capabilities can be requested via container configurations; avoid privileged containers unless absolutely necessary.

  • Host-level security controls:

    • Use Linux Security Modules (LSMs) such as SELinux or AppArmor to restrict container access to host resources.

    • Ensure that container runtimes are integrated with LSMs and that containers adhere to restricted SELinux labels when required.

  • Operational guidance:

    • Keep container hosts secure and minimal; isolate workloads to separate hosts when possible.

    • Plan for specialized security configurations for persistent storage or host log access.

Labs, Windows, and practical installation notes (Chapter-specific labs and demos)

  • Labs overview:

    • Chapter 1 includes hands-on labs to install and run a development container environment (Docker Desktop with a minimal Kubernetes setup).

    • Labs focus on quick, practical setup to validate container runtimes and basic workflows.

  • Docker Desktop lab setup notes:

    • Supports Windows 10 (and other platforms) with Hyper-V or WSL 2 backends; WSL 2 is recommended for better performance.

    • Requires enabling WSL2 on Windows and installing a Linux distribution (Ubuntu via Microsoft Store is common).

    • After installation, you verify the runtime by running docker commands, e.g., docker info and docker run -ti alpine to inspect a container and its filesystem.

  • Windows-specific notes in labs:

    • Windows containers can be managed via the same client-server model; Windows Server containers and Hyper-V Linux containers offer different isolation strategies.

    • Kubernetes support for Windows clusters exists but is more commonly used with Linux-based workloads.

  • Practical steps highlighted in the labs:

    • Install Docker Desktop, enable WSL2 integration, and verify the Docker client-server integration.

    • Run a small container (e.g., Alpine) to inspect process trees and root filesystem.

    • Inspect images and containers via the Docker Desktop UI to understand layering and lifecycle.

Summary (chapter takeaway)

  • Containers provide resilience, high availability, scalability, and portability by isolating processes with kernel features rather than virtualizing hardware.

  • Containers fit naturally into modern microservices architectures by enabling independent, language-agnostic components with fast startup and scalable lifecycles.

  • Core concepts introduced include:

    • Kernel namespaces and cgroups for isolation and resource control; container main processes run as PID 1
      inside their own namespace.

    • Image layering, immutability, and overlay filesystems that enable efficient sharing and reproducibility.

    • Container runtimes (low-level and high-level) and their role in executing and orchestrating containers.

    • Container images, registries, and Dockerfiles as reproducible templates for environments.

    • Container orchestration (Kubernetes, Docker Swarm) to manage containers across clusters with scheduling, persistence, and networking.

    • Windows containers and their evolution, with a focus on how Linux containment concepts transfer to Windows.

    • Security by design: minimal hosts, Seccomp profiles, RBAC integration via orchestration, and LSMs like SELinux/AppArmor.

  • Roadmap implied by the chapter:

    • Next: deep dive into building container images (Chapter 2), followed by running containers and orchestration topics in later chapters.

Key figures and terms to remember

  • IPAM: IP Address Management; dynamically assigns IP addresses to containers.

  • REST: Representational State Transfer; common API design for microservice communication.

  • NAT: Network Address Translation; used for exposing services to external clients.

  • Overlay filesystem: union filesystem used to compose image layers.

  • cgroups: Linux kernel feature for resource control.

  • namespaces: Linux kernel feature for process isolation (PID, network, users, IPC, mounts, UTS).

  • Seccomp: Linux security facility for restricting system calls inside containers.

  • RBAC: Role-Based Access Control (provided by orchestration platforms).

  • Dockerfile: recipe to reproducibly build container images.

  • Kubernetes, Swarm: container orchestration platforms; Kubernetes is the most widely used in enterprise contexts.

  • OpenShift, Rancher, Tanzu: examples of enterprise Kubernetes platforms and distributions.

  • OpenStack: early open-source cloud infrastructure platform enabling on-premises cloud-like capabilities.