Cloud Computing Overview: Cloud computing provides on-demand access to IT resources via a web interface. Key traits include:
On-demand and self-service: Access without human intervention.
Internet access: Resources accessible from anywhere.
Resource pooling: Providers allocate from a large pool, passing savings to users.
Elasticity: Resources scale up or down based on needs.
Pay-as-you-go model: Pay only for resources used.
Evolution of Cloud Computing:
First Wave (Colocation): Renting physical space for data centers.
Second Wave (Virtualized Data Centers): Virtual allocation of resources.
Third Wave (Container-Based Architecture): Fully automated, elastic cloud model.
Future of Cloud Computing:
Every company will rely on technology, with software playing a key role.
Data will be central, making every company a "data company".
Virtualized Data Centers and New Offerings:
IaaS (Infrastructure as a Service): Provides raw compute, storage, and network capabilities. Example: Google Cloud’s Compute Engine.
PaaS (Platform as a Service): Binds code to libraries, focusing on application logic. Example: Google Cloud’s App Engine.
Payment Models: With IaaS you pay for the resources you allocate; with PaaS you pay for the resources you actually use.
Shift Toward Managed Infrastructure and Services: Enables companies to focus on business goals rather than technical infrastructure.
Serverless Computing: Eliminates infrastructure management. Examples: Cloud Run and Cloud Run functions.
SaaS (Software as a Service): Delivers entire cloud-based applications over the internet. Examples: Gmail, Docs, and Drive.
Summary:
PaaS: The platform and build tools are already set up; you supply the code.
IaaS: Rent the land and build everything yourself.
SaaS: Pre-built software, ready to use.
Key Differences:
| Feature | PaaS | IaaS | SaaS |
| --- | --- | --- | --- |
| Control | Limited | Full | No control |
| Target | Developers building apps | IT teams needing flexible infrastructure | End-users needing ready-to-use software |
| Management | Managed by the provider | Managed by user | Managed by the provider |
| Flexibility | Less flexibility in configuration | High customization | Limited flexibility |
| Example | Google App Engine, Azure App Service | AWS EC2, Google Compute Engine, Azure VMs | Google Workspace, Salesforce |
Google Cloud’s Global Network: Provides high throughput and low latencies using a large global network with content caching nodes.
Geographic Locations: Spans North America, South America, Europe, Asia, and Australia.
Regions: Independent geographic areas.
Zones: Where resources are deployed.
Multi-Region Configurations: Resources replicated across multiple regions. Example: Spanner multi-region configurations.
Current Infrastructure: 121 zones in 40 regions, with continuous expansion.
Energy Use in Data Centers: Data centers consume about 2% of the world’s electricity. Google aims for energy efficiency.
Google's Environmental Commitment:
Google's data centers were the first to achieve ISO 14001 certification.
Example: Hamina, Finland Data Center uses seawater cooling.
Carbon Neutrality and Renewable Energy:
First major company to become carbon neutral.
First major company to match 100% of its electricity use with renewable energy.
Goal: operate carbon-free by 2030.
Hardware Infrastructure Layer: Custom hardware, secure boot stack, physical security.
Service Deployment Layer: Encrypted inter-service communication.
User Identity Layer: Advanced authentication with risk-based challenges and secondary factors.
Storage Services Layer: Encryption at rest with centrally managed keys.
Internet Communication Layer: Google Front End (GFE) manages TLS connections and provides DoS protection.
Operational Security Layer: Intrusion detection, insider risk mitigation, secure development practices, vulnerability rewards program.
Key Features to Prevent Lock-In:
Open Source Ecosystems: Google shares technologies like TensorFlow.
Interoperability Tools: Kubernetes & Google Kubernetes Engine (GKE), and Google Cloud Observability.
Ensures customers can move workloads if Google no longer meets their needs.
Pricing Features:
Per-Second Billing: For Compute Engine, Google Kubernetes Engine, Dataproc, and App Engine flexible environment VMs.
Sustained-Use Discounts: Automatic discounts for VMs running over 25% of a billing month.
Custom VM Types: Tailor vCPU and memory.
Pricing Calculator: Online tool for cost estimation.
Cost Control:
Budgets and Alerts: Define budgets with alerts at thresholds.
Reports: Visual tools in the console.
Quotas:
Rate Quotas: Limit API calls within a time frame.
Allocation Quotas: Limit resources per project.
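As a quick illustration, current quota limits and usage can be inspected with the gcloud CLI; the project ID and region below are placeholders:
# Project-wide quotas and current usage
gcloud compute project-info describe --project my-project-id
# Per-region quotas, e.g. CPUs and in-use addresses
gcloud compute regions describe us-central1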
Google Cloud Resource Hierarchy:
Resources: Virtual machines, storage buckets, etc., organized into projects.
Projects: Basis for enabling services, billing, collaborators.
Folders: Group projects, allows policy assignment.
Organization Node: Top level, encompasses all folders, projects, and resources.
Policy Management: Policies applied at project, folder, or organization node levels and inherit downward.
Organization Node Creation: Automatically generated for Google Workspace users or via Cloud Identity.
Identity and Access Management (IAM): Manages access to folders, projects, and resources.
Principal: "Who" (Google account, group, service account).
Roles: Define what actions principals can perform.
Deny Rules: Prevent certain actions regardless of assigned roles.
Types of Roles:
Basic Roles: Broad permissions (Viewer, Editor, Owner, Billing Administrator).
Predefined Roles: Specific permissions for Google Cloud services.
Custom Roles: Granular control.
Service Accounts: Allow VMs to access cloud services automatically, identified by an email address and authenticate with cryptographic keys.
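A minimal sketch of creating a service account and granting it a role with gcloud; the account name, project ID, and role are illustrative:
# Create the service account
gcloud iam service-accounts create app-sa --display-name="App service account"
# Grant it read access to Cloud Storage objects in the project
gcloud projects add-iam-policy-binding my-project-id \
  --member="serviceAccount:app-sa@my-project-id.iam.gserviceaccount.com" \
  --role="roles/storage.objectViewer"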
Cloud Identity: Centralized user and group management via the Google Admin Console.
Google Cloud Console: Web-based GUI for management, deployment, scaling, and diagnostics.
Cloud SDK and Cloud Shell:
Cloud SDK: Command-line tools, including gcloud CLI.
Cloud Shell: Browser-based command-line access with pre-installed tools.
APIs: Programmatic control via Google APIs Explorer and Cloud Client Libraries.
Google Cloud App: Manages Compute Engine, Cloud SQL, App Engine, and monitors billing.
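A few basic gcloud CLI commands set up the defaults these tools rely on; the project ID, region, and zone are placeholders:
# Show the active account and current configuration
gcloud auth list
gcloud config list
# Set defaults used by later commands
gcloud config set project my-project-id
gcloud config set compute/region us-central1
gcloud config set compute/zone us-central1-a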
Virtual Private Cloud (VPC) Networking:
Private, secure cloud-computing model within a public cloud.
Connects resources to each other and the internet.
Global scope, with subnets in any region.
Compute Engine: Google Cloud's IaaS solution for creating and running VMs.
No upfront investment, pay-as-you-go.
Configurable CPU, memory, storage, and OS.
Offers a Cloud Marketplace for solutions from Google and third-party vendors.
Scaling Virtual Machines:
Choose machine properties (predefined or custom).
Autoscaling: Add or remove VMs based on load.
Load balancing: Distribute traffic among VMs.
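A minimal sketch of creating and connecting to a VM with gcloud; all names and values are illustrative:
gcloud compute instances create my-vm \
  --zone=us-central1-a \
  --machine-type=e2-medium \
  --image-family=debian-12 \
  --image-project=debian-cloud
# SSH into the new VM
gcloud compute ssh my-vm --zone=us-central1-a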
Important VPC Compatibilities:
Routing Tables: Built-in, forward traffic without external IPs.
Firewall: Global, control traffic using network tags.
VPC Peering: Allows traffic exchange between VPCs.
Cloud Load Balancing: Distributes traffic across multiple instances, fully managed, handles various traffic types, and provides cross-region load balancing.
Application Load Balancers: (Layer 7), for HTTP/HTTPS traffic.
Network Load Balancers: (Layer 4), for TCP/UDP.
Cloud DNS and Cloud CDN:
Cloud DNS: Managed DNS service.
Cloud CDN: Uses global edge caching to accelerate content delivery.
Connecting Networks to Google VPC:
Cloud VPN and Cloud Router: Secure tunnel connection with dynamic route exchange.
Peering Options: Direct or Carrier Peering.
Interconnect Options: Dedicated or Partner Interconnect.
Cross-Cloud Interconnect: High-bandwidth connections to other cloud providers.
Cloud Storage: Object storage for various types of data.
Objects: Store data, metadata, and a unique identifier in buckets.
Use Cases: Website content, backup, disaster recovery, large data distribution.
Versioning: Retains a history of changes.
Access Control: IAM roles and Access Control Lists (ACLs).
Lifecycle Management: Policies to manage costs.
Storage Classes:
Standard Storage: Frequently accessed data.
Nearline Storage: Infrequently accessed data.
Coldline Storage: Data accessed at most once every 90 days.
Archive Storage: Data accessed less than once a year.
Autoclass: Automatically transitions data to appropriate storage classes.
Data Transfer Options: Storage Transfer Service and Transfer Appliance.
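A minimal sketch of creating a bucket and uploading an object with gcloud; the bucket name must be globally unique and all values are illustrative:
gcloud storage buckets create gs://my-unique-bucket-name \
  --location=us-central1 --default-storage-class=STANDARD
# Upload a file and inspect its metadata, including storage class
gcloud storage cp ./backup.tar.gz gs://my-unique-bucket-name/
gcloud storage objects describe gs://my-unique-bucket-name/backup.tar.gz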
Cloud SQL: Fully managed relational database service (MySQL, PostgreSQL, SQL Server).
Automates routine tasks; provides scalability, replication, backups, and security; and integrates with other Google Cloud services.
Spanner: Fully managed relational database service, scales horizontally, strongly consistent, SQL-compatible, and used for mission-critical applications.
Firestore: Horizontally scalable NoSQL cloud database, with data stored in documents and collections.
Features data synchronization and offline capability.
Bigtable: NoSQL big data database service for massive workloads, low latency and high throughput.
Containers Overview: Provide independent scalability and abstraction of OS/hardware.
Lightweight compared to VMs.
Key benefits: flexibility, portability, efficiency.
Use cases: scaling, microservices, dynamic resource management.
Kubernetes: Open-source platform for managing containerized workloads.
Orchestrates containers, enables microservices scaling, and supports easy rollouts/rollbacks.
Key concepts: Pod, Deployment, Service.
Scaling and updates: manual, autoscaling, declarative configuration.
Google Kubernetes Engine (GKE): Managed Kubernetes service, simplifies setup and management.
Autopilot mode (Google-managed) or standard mode (user-managed).
Cluster creation: Google Cloud Console or gcloud CLI.
Uses Kubernetes commands for management.
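For example, an Autopilot cluster can be created and accessed as follows; the cluster name and region are illustrative:
gcloud container clusters create-auto demo-cluster --region=us-central1
# Fetch credentials so kubectl can talk to the cluster
gcloud container clusters get-credentials demo-cluster --region=us-central1
kubectl get nodes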
Cloud Run: Managed compute platform for stateless containers, serverless, scales fast, and cost-efficient.
Based on Knative, supports container and source-based workflows.
Charges only for used resources (100 ms granularity).
Cloud Run Functions: Event-driven, lightweight compute solution for single-purpose functions.
Execute in response to cloud events, asynchronous execution.
Cost-efficient, scales seamlessly, supports multiple languages.
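A minimal sketch of deploying a container to Cloud Run; the service name and region are illustrative, and the image is Google's public sample container:
gcloud run deploy hello-service \
  --image=us-docker.pkg.dev/cloudrun/container/hello \
  --region=us-central1 \
  --allow-unauthenticated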
Generative AI Overview: Creates content using prompts.
LLMs are trained on massive datasets.
Prompt engineering is structuring input text to maximize model response quality.
Prompt types: zero-shot, one-shot, few-shot, role prompts.
Components: preamble and input.
Minimizing errors: train on high-quality data, clear prompts.
Google Cloud and Gemini: Gemini is an integrated generative AI tool in Google Cloud.
Google Cloud Console: A web-based GUI for managing resources.
Google Cloud CLI (gcloud): A command-line interface for managing resources using commands.
Cloud Shell: A browser-based, interactive shell with a temporary VM, 5 GB of persistent storage, and pre-installed tools.
Client Libraries & APIs: Language-specific libraries and APIs for interacting with Google Cloud services.
Cloud Mobile App: An application for managing Google Cloud services on mobile devices.
Cloud Storage Buckets: Storage locations, globally unique, with public access prevention.
Cloud Shell:
Provides 5 GB of persistent storage in /home.
Command-line access to a temporary Compute Engine VM.
Pre-installed tools like gcloud, kubectl, and bq.
Built-in authorization for accessing resources.
Web preview functionality.
Toolbar includes minimize/restore, new window, and close terminal.
Cloud Shell Tasks: Creating buckets, uploading files, setting region environment variables, and configuring persistent variables.
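For example, a region variable can be set for the current session and persisted across Cloud Shell sessions; the value is illustrative:
export REGION=us-central1
# /home persists, so appending to ~/.profile keeps the variable in future sessions
echo 'export REGION=us-central1' >> ~/.profile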
Interface Exploration:
Console: Fast task execution, guided options, validation.
Cloud Shell: Detailed control, full features, scripting automation.
Virtual Private Cloud (VPC): Allows provisioning, connecting, and isolating resources within Google Cloud.
VPC Components:
Projects: Encompass all services, including networks.
Networks: Three types: Default, Auto Mode, and Custom Mode.
Subnetworks: Segments of a network.
Regions and Zones: Data centers ensuring data protection and high availability.
VPC: Provides internal and external IP addresses.
Focus: Configuring VMs, routes, and firewall rules.
Projects:
Organize resources and billing.
Contain networks; default quota is 15.
Networks can be shared or peered with other projects.
Networks:
Global scope.
Types:
Default: Preset subnets and firewall rules.
Auto Mode: One subnet per region using predefined IP ranges (10.128.0.0/9 CIDR).
Custom Mode: Full control over subnets and IP ranges.
Auto mode can convert to custom mode, but not vice versa.
Subnetworks:
Regional scope spanning multiple zones.
Reserve specific IP addresses.
Allow seamless communication between VMs across zones.
IP ranges can be expanded but must not overlap.
IP Addressing:
IPv6 supported in custom VPC networks.
Subnets start with /20 IP ranges, expandable to /16 in auto mode.
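A minimal sketch of a custom mode network, one subnet, and a later range expansion; names and ranges are illustrative:
gcloud compute networks create my-network --subnet-mode=custom
gcloud compute networks subnets create my-subnet \
  --network=my-network --region=us-central1 --range=10.0.0.0/20
# Ranges can be grown (never shrunk), e.g. from /20 to /16
gcloud compute networks subnets expand-ip-range my-subnet \
  --region=us-central1 --prefix-length=16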
Communication and Connectivity:
VMs in the same network use internal IPs; different networks use external IPs with Google Edge routers.
A single VPN connects on-premises to Google Cloud.
Subnet Design: Subnets must not overlap, use valid RFC ranges, and not span multiple RFC ranges.
Internal IP Address: Assigned via DHCP, used by services, and mapped to a symbolic name via internal DNS.
External IP Address (Optional):
Types: Ephemeral and Static.
Static IPs that are reserved but not assigned to a resource incur higher charges.
Supports Bring Your Own IP (BYOIP) using /24 block or larger.
External and Internal IP Address Handling: External IPs are mapped to the VM's internal IP by the VPC and are unknown to the VM's OS.
DNS Resolution:
Internal DNS: Two types: Zonal and Global. Hostnames resolve to internal IPs; uses instance name as hostname; instances have an internal FQDN.
Metadata server acts as DNS resolver.
External DNS: Not automatically published but can be manually configured.
Cloud DNS: Managed, scalable, and reliable service with Anycast name servers and a 100% uptime SLA.
Alias IP Ranges: Enable assigning multiple internal IP addresses to a VM interface.
Default Routing: Networks have default routes for communication, but custom routes can override.
Firewall Rules: Enforced at the instance level; default rules deny ingress and allow egress.
Firewall Rule Components: Direction, Source/Destination, Protocol/Port, Action, Priority, and Assignment.
Ingress vs. Egress Use Cases:
Egress Rules: Control outbound connections, using CIDR ranges.
Ingress Rules: Control incoming connections, with source restrictions.
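A minimal sketch of an ingress rule exercising each component above; the network, tag, and source range are illustrative:
gcloud compute firewall-rules create allow-ssh \
  --network=my-network \
  --direction=INGRESS \
  --action=ALLOW \
  --rules=tcp:22 \
  --source-ranges=203.0.113.0/24 \
  --target-tags=ssh-allowed \
  --priority=1000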
Routing Table: Tied to networks and instances, with a virtual router handling routing.
Ingress Traffic: Free unless processed by a resource like a load balancer.
Egress Traffic:
Free within the same zone via internal IP, or to Google services/GCP services in the same region.
Charged between zones, within a zone using external IP, or between regions.
External IP Addresses: Higher cost if static and unassigned.
Preemptible VMs: Lower charges.
Pricing Calculator: Tool for estimating costs.
Availability:
Multiple Zones: Deploy VMs in different zones within the same subnet.
Regional Managed Instance Groups: Increase availability across multiple zones.
Globalization:
Multi-Region Deployment: Spread resources across regions.
Global Load Balancer: Routes traffic to the closest region.
Security:
Internal IPs: Use internal IPs to reduce exposure.
Cloud NAT: Provides outbound NAT for private VMs; does not support inbound NAT.
Private Google Access: Enables VMs with only internal IPs to access Google APIs.
Compute Engine: IaaS model for running any language, managing VMs, and autoscaling.
Disk Options:
Persistent Disks:
Standard HDDs: Cost-effective.
SSD Persistent Disks: Higher performance.
Local SSDs: High throughput and low latency; data persists only during runtime.
Performance: Network throughput scales with CPU cores.
Cloud TPUs: Specialized for machine learning.
Networking: Supports regional HTTPS and network load balancing.
Root Privileges: VM creator has full root access.
Firewall Rules: Default rules allow SSH and RDP access.
VM Lifecycle: Provisioning, Staging, Running, Stopping, Reset, Suspending, and Repairing.
State Transitions: Methods include Google Cloud console, gcloud command, or OS-based commands.
Availability Policies: Live migration is default; can be configured to terminate or auto-restart.
Patch Management: Includes patch compliance reporting and automated deployment.
Billing: No charges for CPU/memory in the terminated state; charges apply for attached disks and reserved IPs.
VM Creation Options: Use Cloud Console, Cloud Shell CLI, or RESTful API.
Machine Families:
General-purpose: Flexible vCPU-to-memory ratios.
E2: Cost-effective.
N2/N2D: High performance and scalability.
Tau T2D/T2A: Optimized for cost-performance on scale-out workloads; supports Arm processors.
Compute-optimized: High performance per core.
C2: Ideal for simulations and gaming.
C2D: Largest VM sizes with high last-level cache.
Memory-optimized: High memory-to-vCPU ratios.
M1/M2: Up to 12 TB of memory for SAP HANA and data analytics.
M3: Suitable for genomic modeling.
Accelerator-optimized: Designed for workloads requiring GPUs.
A2: Great for machine learning.
G2: Optimized for ML training and video transcoding.
Custom Machine Types: Allow unique CPU/memory configurations.
Constraints: only 1 or even-numbered vCPUs, memory in multiples of 256MB.
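A minimal sketch of creating a custom machine type VM; values are illustrative and follow the constraints above:
gcloud compute instances create custom-vm \
  --zone=us-central1-a \
  --custom-cpu=4 \
  --custom-memory=8GB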
Cost: E2 VMs are the lowest cost; committed-use and sustained-use discounts are available.
Billing Granularity: Charged per vCPU, GPU, and GB of memory with a 1-minute minimum, then per-second.
Pricing Model: Resource-based pricing, charges vCPUs and memory separately.
Discount Types:
Sustained Use Discounts: Automatic discounts based on monthly usage.
Committed Use Discounts: Up to 57% for most machine types and 70% for memory-optimized.
Preemptible/Spot VMs: Lower cost but subject to termination.
Customization: Custom machine types offer tailored resource allocation.
Free Usage Limits: Limited free resources are available.
Optimization: Google Cloud Pricing Calculator estimates sustained use discounts.
Preemptible VMs: Significant discount, can be preempted with short notice.
Spot VMs: Similar to preemptible VMs with no max runtime but subject to preemption.
Sole-Tenant Nodes: Physical servers dedicated to a project for compliance or isolation.
Shielded VMs: Provide verifiable integrity to prevent boot- and kernel-level malware.
Confidential VMs: Use AMD SEV to encrypt data in use during processing.
VM Boot Disk Images: Includes OS, file system, and pre-configured software.
Public Images: Include Linux and Windows options.
Premium images charged per-second or per-minute.
Custom Images: Allows pre-installing organization-approved software; can be imported or shared.
Machine Images: Stores configurations, metadata, and data for VM creation.
VM Disk Options: Every VM has a root persistent disk that survives VM termination.
Persistent Disks: Attached via the network, survive termination; support snapshots, resizing and sharing.
Disk Types:
Zonal Persistent Disks: Reliable block storage.
Regional Persistent Disks: Active-active replication across two zones.
Standard Persistent Disks: Suitable for large data workloads.
Performance SSD Persistent Disks: Low-latency, high-IOPS.
Balanced Persistent Disks: Cost-performance balance.
Extreme Persistent Disks: High performance for high-end database workloads.
Encryption:
Data Encryption at Rest managed by Google Cloud by default.
CMEK and CSEK allow managing encryption keys.
Local SSDs: Physically attached; high IOPS but ephemeral.
RAM Disks: Fastest performance but volatile.
Disk Performance & Limits: Persistent disks provide redundancy, up to 128 can be attached to VMs; disk I/O shares throughput with network egress/ingress.
Disk Management: Cloud persistent disks offer no partitioning, automatic redundancy, snapshots, and encryption.
Metadata Server: Stores metadata used in startup and shutdown scripts; retrieves instance information.
Moving an Instance:
Same Region: Use gcloud compute instances move.
Different Region: Involves snapshots and data transfer.
Snapshots: Incremental backups stored in Cloud Storage; used for backup, migration, and transferring data between disk types.
Can be scheduled for regular automatic backups.
Resizing Persistent Disks: Can be resized while attached to a running VM; can only be grown.
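For example, snapshotting and growing a persistent disk with gcloud; the disk and snapshot names are illustrative:
gcloud compute disks snapshot my-disk \
  --zone=us-central1-a --snapshot-names=my-disk-snap1
# Resizing is one-way: disks can only be enlarged
gcloud compute disks resize my-disk --zone=us-central1-a --size=200GB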
IAM defines who can do what on which resources. It controls permissions for users, groups, or applications.
Who: Identity (person, group, application).
What: Actions or privileges (e.g., view, edit, delete).
Resource: Google Cloud service or asset (e.g., Compute Engine, Cloud Storage).
Cloud IAM Components:
IAM Policies: Define access based on roles.
Resource Hierarchy: Organization, Folders, Projects, Resources. Roles are inherited.
Organization Node: Root of the GCP resource hierarchy.
Organization Admin: Full control over all resources.
Project Creator: Creates projects within the organization.
The organization node is linked to a G Suite (Google Workspace) or Cloud Identity account.
G Suite/Cloud Identity Super Admin: Controls the organization resource lifecycle.
Organization Node Viewer: Read-only access to all organization resources.
Folders: Sub-organizations for isolating projects.
Folder Admin: Full control over folders.
Folder Creator: Creates folders.
Folder Viewer: View access to folders and projects.
Project Roles:
Project Creator: Creates new projects and becomes the owner.
Project Deleter: Deletes projects.
Roles: Named lists of permissions.
Basic Roles:
Owner: Full admin access.
Editor: Modify and delete resources.
Viewer: Read-only access.
Predefined Roles: Granular access to specific services (e.g., Compute Admin, Network Admin).
Custom Roles: Precise permissions following the least privilege model.
Members:
Google Accounts: Developers, administrators, users.
Service Accounts: For applications.
Google Groups: Collections of accounts.
Google Workspace/Cloud Identity Domains: Virtual groups of accounts within organizations.
Access Control:
Allow Policies: Grant access.
Deny Policies: Prevent access.
IAM Conditions: Conditional access based on attributes (e.g., location).
Organization Policies: Restrictions inherited by all descendants, only the organization policy admin role can create exceptions.
Synchronizing External Directories:
Google Cloud Directory Sync: Syncs from Active Directory or LDAP.
Single Sign-On (SSO): Authenticate with an existing system.
Service Accounts: Accounts for applications.
Built-in: Compute Engine Default, Google Cloud APIs.
Custom: User-created, more flexible.
IAM Roles: Define permissions (e.g., InstanceAdmin).
Service Account User: Allows users to act as the service account.
Service Account Keys: Google-managed (auto-rotated) or user-managed.
Organization Restriction: Limits access to authorized Google Cloud organizations.
Managed Devices: Governed by organizational policies.
Egress Proxy Configuration: Adds organization restrictions headers.
IAM Best Practices:
Use resource hierarchy and inheritance.
Apply the principle of least privilege.
Audit policies and group memberships.
Grant roles to groups.
Use clear naming conventions for service accounts and implement key rotation policies.
Use Cloud IAP for centralized authorization.
Cloud Storage: Scalable object storage service.
Buckets: Globally unique containers for objects.
Objects: Files stored in buckets.
Storage Classes:
Standard: For frequently accessed data.
Nearline: For infrequent access (30-day minimum).
Coldline: For less frequent access (90-day minimum).
Archive: For long-term storage (365-day minimum).
Location Types: Multi-region, dual-region, region.
Access Control: IAM Roles, ACLs, Signed URLs.
ACLs: Define scope (who) and permission (what).
Object Lifecycle Management: Automates transitioning objects between storage classes.
Customer-Supplied Encryption Keys (CSEK): Lets you use your own encryption keys.
Object Versioning: Keeps multiple object versions.
Soft Delete: Retains deleted objects for a configurable period.
Retention Policies: Sets minimum retention durations.
Directory Synchronization: Syncs VM directories with buckets.
Object Change Notifications: Configured using Pub/Sub.
Data Transfer Services: Transfer Appliance, Storage Transfer Service, Offline Media Import.
Strong Global Consistency: Ensures reliable results for object operations.
Autoclass: Automatically transitions objects between storage classes based on access patterns.
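A minimal lifecycle-management sketch: move objects to Nearline after 30 days and delete them after a year; the bucket name and thresholds are illustrative:
cat > lifecycle.json <<'EOF'
{
  "rule": [
    {"action": {"type": "SetStorageClass", "storageClass": "NEARLINE"}, "condition": {"age": 30}},
    {"action": {"type": "Delete"}, "condition": {"age": 365}}
  ]
}
EOF
gcloud storage buckets update gs://my-unique-bucket-name --lifecycle-file=lifecycle.json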
Filestore: Managed file storage service with a file system interface.
Compatible with NFSv3 clients.
Scalable performance and capacity.
Use cases include: Enterprise Application Migration, Media Rendering, EDA, Data Analytics, Genome Sequencing, Web Content Hosting.
Cloud SQL: Fully managed SQL database service.
Supports MySQL, PostgreSQL, and SQL Server.
High availability with primary and standby instances.
Automated and on-demand backups.
Scaling: Vertical and horizontal (with read replicas).
Connection Types: Private IP, Cloud SQL Auth Proxy, SSL, Unencrypted.
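A minimal sketch of creating a Cloud SQL instance and a read replica; instance names, version, and tier are illustrative:
gcloud sql instances create my-postgres \
  --database-version=POSTGRES_15 \
  --tier=db-custom-2-8192 \
  --region=us-central1
# Horizontal read scaling via a read replica
gcloud sql instances create my-postgres-replica \
  --master-instance-name=my-postgres \
  --region=us-central1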
Cloud Spanner: Combines relational database features with non-relational scalability.
Petabyte-scale capacity and global transactional consistency.
Automatic synchronous replication for high availability.
Ideal for mission-critical systems.
Uses atomic clocks for consistent updates.
AlloyDB: PostgreSQL-compatible database designed for hybrid transactional and analytical workloads.
Uses machine learning for optimization.
High transaction and analytical processing speeds.
High availability with a 99.99% uptime SLA.
Integrated with Vertex AI for machine learning.
Cloud Firestore: Serverless NoSQL document database.
Offers live synchronization and offline support.
ACID Transactions for consistency.
Multi-region replication for high availability.
Two modes: Datastore mode (backward compatibility) and Native mode (real-time updates).
Cloud Bigtable: Fully managed NoSQL database for petabyte-scale storage.
High throughput and low latency.
HBase API compatibility.
Uses sorted key/value maps.
Linear scalability with added nodes.
Memorystore: Fully managed in-memory data store, compatible with Redis.
High availability and automated management.
Low latency and high throughput.
Easy migration from open-source Redis.
Resource Manager: Hierarchical management of resources.
Resources organized by projects, folders, and organization.
IAM Policies with roles and members, inherited from parent resources.
Deny policies override allow policies.
Billing accumulates bottom-up.
Projects manage resources, billing, permissions, and APIs, and are identified by project name, number, and ID.
Resources are either global, regional, or zonal, and tied to a project.
Quotas: Limits on resource usage per project.
Types: Resource Creation, API Rate, Regional.
Can be adjusted as usage grows.
Prevent runaway consumption, billing surprises, and encourage right sizing.
Quotas don't guarantee resource availability.
Labels: Key-value pairs for organizing resources.
Up to 64 labels per resource.
Use cases: Environment, Team, Component, Owner, State.
Network tags are used for networking purposes.
Billing:
Budgets track spend with alerts for exceeding thresholds.
Pub/Sub notifications for spend updates.
Labels help optimize spend by identifying high-cost areas.
Billing data can be exported to BigQuery for analysis and visualized with Looker Studio.
Operations Suite: Integrated monitoring, logging, diagnostics.
Manages resources across platforms, discovers cloud resources, supports open-source agents, and provides advanced analytics.
Pay-as-you-go pricing and free usage allotments.
Cloud Monitoring: Provides monitoring, insights, and alerts for platform, system, and application metrics.
Metrics Scope: Centralized monitoring configuration with 1-375 projects.
Custom Dashboards: Display metrics with filters, groups, and aggregation.
Alerting Policies: Notifications via email, SMS, etc. when specific conditions occur.
Uptime Checks: Test service availability from global locations.
Ops Agent: Collects telemetry data from Compute Engine instances.
Custom Metrics: Track application-specific metrics.
Autoscaling: Adjusts VM instances based on metrics.
Cloud Logging: Stores, searches, analyzes, monitors, and alerts on logs.
Logs retained for 30 days; can be exported to Cloud Storage, BigQuery, or Pub/Sub.
Cloud Storage: For longer-term log storage.
BigQuery: For fast log analysis and visualization with Looker Studio.
Pub/Sub: Streams logs for real-time processing.
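For example, a log sink can route Compute Engine logs to BigQuery; the project, dataset, and sink name are illustrative:
gcloud logging sinks create compute-logs-sink \
  bigquery.googleapis.com/projects/my-project-id/datasets/compute_logs \
  --log-filter='resource.type="gce_instance"'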
Error Reporting: Counts, analyzes, and aggregates errors in cloud services.
Provides a centralized error management interface with real-time notifications.
Supports various languages and services.
Cloud Trace: Distributed tracing system for collecting latency data from applications.
Tracks how requests propagate through the application.
Provides automatic latency analysis and performance reports.
Cloud Profiler: Continuously analyzes CPU or memory-intensive functions to identify performance issues.
Uses low-impact instrumentation.
Supports multiple platforms and languages.
Partner Integrations:
BindPlane collects metrics and logs, pushing them into Cloud Logging.
Logs can be exported to Splunk via Pub/Sub and Dataflow.
Cloud VPN: Connects on-premises networks to Google Cloud VPC using encrypted IPsec VPN tunnels.
Classic VPN: Suitable for low-volume connections, provides 99.9% SLA, supports site-to-site VPN, static and dynamic routes, and IKEv1/IKEv2 ciphers, but doesn't support client-based "dial-in" VPN. Requires Cloud Router for dynamic routing.
Example: Two VPN gateways (Cloud and on-premises) with two VPN tunnels. MTU for on-premises gateway should not exceed 1460 bytes.
HA VPN: High availability with 99.99% SLA, using two or four tunnels to the peer gateway. Supports dynamic BGP routing, active/active or active/passive configurations, and automatic IP selection for high availability. Can connect to peer VPN devices, AWS virtual private gateways, or other HA VPN gateways.
Redundancy: Can connect to two peer VPN devices for redundancy.
AWS Integration: Uses a transit gateway or virtual private gateway, requiring four tunnels between Google Cloud and AWS.
Connecting Google Cloud VPCs: Two HA VPN gateways connect two Google Cloud VPC networks with two tunnels per gateway.
Cloud Router: Supports static and dynamic routing via BGP, automatically updating and exchanging routes. Uses link-local IPs (169.254.0.0/16) for BGP sessions.
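A minimal sketch of the gateway and router resources involved; names, network, and ASN are illustrative, and tunnels plus BGP peers are added afterwards with gcloud compute vpn-tunnels create and gcloud compute routers add-bgp-peer:
gcloud compute vpn-gateways create my-ha-vpn-gw \
  --network=my-network --region=us-central1
gcloud compute routers create my-router \
  --network=my-network --region=us-central1 --asn=65001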
Cloud Interconnect and Peering: Connects infrastructure to Google’s network.
Dedicated vs. Shared: Direct or through a partner.
Layer 2 vs. Layer 3: VLANs for internal IPs or public IPs for Google services.
Cloud Interconnect: Direct physical connections between on-premises networks and Google’s network.
Dedicated Interconnect: Direct, dedicated connections for large data transfers, requires a cross-connect, and offers 99.9% or 99.99% uptime SLA. Supports 10 Gbps or 100 Gbps links.
Partner Interconnect: Connects through a service provider for remote locations, with 99.9% or 99.99% uptime SLA, supporting 50 Mbps to 50 Gbps connections.
Cross-Cloud Interconnect: Connects Google Cloud to other providers (AWS, Azure, etc.), providing 10 Gbps or 100 Gbps connections.
Peering: Connects directly or through a carrier to Google's edge network.
Direct Peering: Direct connection for internet traffic exchange, no SLA, 10 Gbps capacity per link, requires a Google PoP.
Carrier Peering: Connects through a service provider, no SLA, capacity varies by provider.
Choosing a Connection:
Peering: Use for connecting to Workspace services, YouTube, or Google Cloud APIs using public IP addresses.
Choose Direct Peering if requirements are met, otherwise Carrier Peering.
Cross-Cloud Interconnect: Use for high-bandwidth connections to other cloud services.
Cloud VPN: Use for lower bandwidth, encrypted connections.
Interconnect: Use for extending the network to Google Cloud, starting with colocation facilities.
If colocation facilities are not available, choose Cloud VPN or Partner Interconnect.
For modest bandwidth, short-term, and encrypted connections, choose Cloud VPN.
For higher bandwidth needs, choose Partner Interconnect.
If colocation facilities are available, Dedicated Interconnect is ideal, but consider Cloud VPN or Partner Interconnect if encryption is needed or 10 Gbps is too large.
VPN over Interconnect: Google supports using VPN over Interconnect, allowing the choice of Google-managed encryption.
Shared VPC and VPC Peering:
Shared VPC: Shares a network across multiple projects, enabling communication using internal IPs.
Host Project: Contains the Shared VPC network.
Service Projects: Attach to the Shared VPC network.
Centralized control over network resources is maintained by organization admins.
VPC Network Peering: Allows private connectivity between VPC networks, with each VPC controlling its own firewall and routing tables.
Managed Instance Groups (MIGs): Collection of identical VM instances controlled as a single entity.
Automatic Scaling: Adjusts instances based on demand.
Rolling Updates: Updates all instances with a new instance template.
Load Balancing: Distributes traffic across instances.
Self-Healing: Recreates unhealthy instances.
Health Checks: Identifies and recreates unhealthy instances.
Regional MIGs: Recommended for redundancy across multiple zones, while Zonal MIGs are limited to a single zone.
Instance Template: Blueprint for instances in the group.
Autoscaling: Automatically scales instances based on load.
Scaling Policies: Based on CPU, load balancing capacity, monitoring metrics, queue-based workloads, or scheduled actions.
Monitoring: Cloud Monitoring displays metrics like CPU, disk, and network usage.
Health Checks: Monitor instance health using protocol, port, and health criteria.
Stateful IP Addresses: Ensures IP addresses remain unchanged during updates.
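A minimal sketch of a template, a regional MIG, and CPU-based autoscaling; all names and thresholds are illustrative:
gcloud compute instance-templates create web-template \
  --machine-type=e2-medium \
  --image-family=debian-12 --image-project=debian-cloud
gcloud compute instance-groups managed create web-mig \
  --region=us-central1 --size=3 --template=web-template
gcloud compute instance-groups managed set-autoscaling web-mig \
  --region=us-central1 --min-num-replicas=3 --max-num-replicas=10 \
  --target-cpu-utilization=0.6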
HTTP(S) Load Balancing: Operates at Layer 7, routing based on URL content.
Global Load Balancing: Balances traffic across multiple regions using a single anycast IP address.
URL Maps: Route URLs to different back-end services.
Back-end Services: Include health checks, session affinity, timeouts.
HTTPS Load Balancer: Uses a target HTTPS proxy, requires an SSL certificate, and supports QUIC.
Backend Buckets: Use Google Cloud Storage buckets for static content.
Network Endpoint Groups (NEGs): Configure backend endpoints for containerized applications.
Zonal NEG: Contains endpoints like Compute Engine VMs.
Internet NEG: Contains external endpoints.
Serverless NEG: Points to services like Cloud Run.
Hybrid Connectivity NEG: Points to external services using Traffic Director.
Cloud CDN: Caches content at Google's edge locations, reducing latency.
Cache Flow: Requests are forwarded to the backend if not in the cache, then cached for future requests.
Cache Modes:
USE_ORIGIN_HEADERS: Respects cache directives from the origin server.
CACHE_ALL_STATIC: Caches static content automatically.
FORCE_CACHE_ALL: Forces caching of all content.
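For example, Cloud CDN can be enabled on an existing global backend service with a chosen cache mode; the backend name is illustrative:
gcloud compute backend-services update web-backend \
  --global --enable-cdn --cache-mode=CACHE_ALL_STATIC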
SSL Proxy Load Balancing: Global load balancing for encrypted non-HTTP traffic, terminating SSL connections at the load balancer.
Supports IPv4 and IPv6, simplifies certificate management, and applies automatic security patches.
TCP Proxy Load Balancing: Global load balancing for unencrypted, non-HTTP traffic, terminating TCP sessions at the load balancer.
Supports IPv4 and IPv6, and automatically applies security patches.
Network Load Balancing: Regional, non-proxied load balancing that balances traffic across VM instances in the same region.
Supports UDP, TCP, and SSL traffic.
Back-End Service-Based Load Balancer: Supports non-legacy health checks, autoscaling, and failover policies.
Target-Pool-Based Load Balancer: Uses target pools, where the load balancer picks an instance based on a hash of the source and destination IPs/ports, with limitations on health checks and region.
Internal Load Balancing: Private load balancing service for TCP and UDP traffic.
Accessible through internal IPs within the same region, lowering latency.
Uses a software-defined solution (Andromeda).
Internal HTTPS Load Balancing: Regional, proxy-based Layer 7 load balancer using Envoy proxy.
Choosing a Load Balancer:
IPv6 Support: HTTP(S), SSL Proxy, and TCP Proxy load balancers.
Application Load Balancer: Best for HTTP(S) traffic.
Proxy Network Load Balancer: Best for TLS offload, TCP proxy, or load balancing across multiple regions.
Passthrough Network Load Balancer: Best for preserving client source IP addresses and supporting protocols like UDP.
External Applications: Use a load balancer that handles public traffic.
Internal Applications: Use internal TCP/UDP load balancing.
Global Applications: Use global load balancing services.
Regional Applications: Use regional load balancing services.
MANAGED load-balancing scheme: A fully managed service delivered via Google Front Ends or Envoy proxies.
Terraform: An Infrastructure as Code (IaC) tool for automating infrastructure provisioning.
Uses declarative configuration files written in HashiCorp Configuration Language (HCL).
Supports parallel resource deployment and consistent results.
Blocks: Represent objects, can contain arguments and nested blocks.
Arguments: Assign a value to a name within a block.
Expressions: Define values assigned to identifiers.
Terraform Workflow:
terraform init: Initializes the configuration and installs the provider plugin.
terraform plan: Refreshes the state and produces an execution plan showing the changes required to reach the desired state.
terraform apply: Applies the defined infrastructure changes.
Configuration Files:
provider.tf: Defines the cloud provider.
mynetwork.tf: Defines VPC network and firewall rules.
instance/main.tf: Defines VM instances module.
instance/variables.tf: Defines input variables for VM instances.
Commands:
terraform fmt: Formats Terraform configuration files.
terraform plan: Creates an execution plan.
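A minimal end-to-end sketch: one configuration file with a provider block and a VPC network, then the standard workflow; the project ID and resource names are illustrative:
cat > main.tf <<'EOF'
provider "google" {
  project = "my-project-id"
  region  = "us-central1"
}

resource "google_compute_network" "mynetwork" {
  name                    = "mynetwork"
  auto_create_subnetworks = true
}
EOF
terraform init   # installs the Google provider plugin
terraform plan   # shows the execution plan
terraform apply  # provisions the network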
Google Cloud Marketplace: Allows quick deployment of software packages with pre-configured deployments based on Terraform.
Supports Bring Your Own License (BYOL).
Google Cloud updates images for critical issues and vulnerabilities.
BigQuery: Serverless, scalable cloud data warehouse for petabyte-scale data with fast SQL queries.
Access via Cloud Console, command-line tool, or REST API.
Dataflow: Managed service for stream and batch data processing using SQL, Java, and Python APIs (Apache Beam SDK).
Supports windowing, session analysis, and various connectors.
Integrates with Stackdriver, Cloud Datastore, Pub/Sub, BigQuery, and AI Platform.
Dataprep: Intelligent data service for visually exploring, cleaning, and preparing data, using UI inputs and automated schema detection.
Operated by Trifacta, integrated with Google Cloud.
Dataproc: Fully managed service for running Apache Spark and Hadoop clusters, with per-second billing and preemptible instances.
Integrates with BigQuery, Cloud Storage, Cloud Bigtable, and Stackdriver.
Choose Dataproc if you have dependencies on the Apache Hadoop or Spark ecosystem, or prefer a hands-on approach; choose Dataflow for a serverless approach.
Containers isolate the user space (applications and dependencies) without virtualizing the entire OS, making them lightweight, efficient, and quick to start/stop.
Key benefits include: code-centric delivery, consistent performance, incremental updates, and support for microservices architecture.
Container images package an application and its dependencies, while a container is a running instance of this image.
Docker is a tool for creating and running containers, but lacks orchestration at scale.
Containers use Linux technologies such as Linux processes, namespaces, cgroups, and union file systems to isolate workloads.
Container images are built in layers defined by a Dockerfile.
Changes within a container are stored in a writable ephemeral layer that is lost when the container is deleted.
Container images are stored in container registries such as Artifact Registry, Docker Hub, and GitLab.
Cloud Build can build container images from source code and deploy them to environments like Google Kubernetes Engine.
Cloud Build uses Docker containers for each build step.
Cloud Build can be automated with a cloudbuild.yaml file.
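A minimal cloudbuild.yaml sketch that builds an image with Docker and pushes it to Artifact Registry; the project, repository, and image names are illustrative:
cat > cloudbuild.yaml <<'EOF'
steps:
- name: 'gcr.io/cloud-builders/docker'
  args: ['build', '-t', 'us-central1-docker.pkg.dev/my-project-id/my-repo/my-app:latest', '.']
images:
- 'us-central1-docker.pkg.dev/my-project-id/my-repo/my-app:latest'
EOF
gcloud builds submit --config=cloudbuild.yaml .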
Kubernetes is an open-source platform for managing containerized workloads and services, enabling orchestration and scaling across multiple hosts.
It supports microservices architecture, rollouts, and rollbacks.
Kubernetes uses declarative configurations, where the desired system state is described and maintained by Kubernetes.
Kubernetes supports various workload types including stateless and stateful applications, batch jobs, and daemon tasks.
It can automatically scale containerized applications based on resource utilization.
Kubernetes is extensible through plugins and add-ons, including Custom Resource Definitions.
It is also portable and can be deployed on-premises or across cloud providers.
Google Kubernetes Engine (GKE) is a managed Kubernetes service on Google's infrastructure.
GKE simplifies the deployment, management, and scaling of Kubernetes environments.
GKE Autopilot mode manages cluster configuration, scaling, and security.
GKE also provides auto-upgrades and node auto-repair.
GKE integrates with services like Cloud Build, IAM, Google Cloud Observability, VPC, and Google Cloud Console.
Kubernetes Object Model: Everything managed in Kubernetes is represented as an object with attributes and state that can be viewed and changed.
Declarative Management: Kubernetes maintains the desired state of objects through a "watch loop".
Objects have a spec (desired state) and a status (current state).
Pods are the smallest deployable units, and containers run inside them.
Containers in the same Pod share resources, network, and storage, communicating via localhost.
Kubernetes continuously monitors the cluster and adjusts the state as necessary.
Kubernetes Control Plane: Coordinates the cluster with key components such as:
kube-apiserver: Main interface for interacting with the cluster.
kubectl: Command-line tool to interact with the kube-apiserver.
etcd: Database that stores the cluster's state.
kube-scheduler: Schedules Pods to nodes based on resource requirements.
kube-controller-manager: Monitors the cluster’s state and makes changes to match the desired state.
Workload Controllers like Deployments manage Pods.
System Controllers manage node failures and interactions with cloud providers.
Nodes run Pods, with the control plane managing cluster operations.
kubelet: Agent on each node that interacts with the kube-apiserver to start Pods.
Container Runtime: Software to launch containers (e.g., containerd).
kube-proxy: Maintains network connectivity among Pods.
GKE vs Kubernetes: GKE is a managed version of Kubernetes that simplifies infrastructure management.
Autopilot Mode: GKE manages node configuration, autoscaling, upgrades, and security.
Standard Mode: User manages node configuration and underlying infrastructure.
Autopilot Mode: Hands-off experience with less configuration, paying only for Pods, with Google managing security and resource optimization.
Restrictions include limited access to node objects, no SSH or privilege escalation, and limitations on node affinity and host access.
All Pods in Autopilot have a Guaranteed class QoS.
Standard Mode: More control over Kubernetes management, paying for all provisioned infrastructure, and requires more manual management.
Recommendation: Autopilot mode is generally preferred for its ease of use, security, and cost-efficiency unless specific control is needed.
Kubernetes Objects: Identified by a unique name and UID, defined using manifest files in YAML or JSON.
Required fields: apiVersion, kind, and metadata.
Labels: Key-value pairs for tagging and organizing objects.
Object Names and UIDs: Object names must be unique within a namespace, and each object has a unique UID generated by Kubernetes.
Managing Desired State: Use controller objects (e.g., Deployments) to manage and maintain the state of Pods since Pods are ephemeral.
Deployments: Ideal for managing long-lived software components; ensures a specified number of replicas of a Pod are always running.
Deployments use ReplicaSets to maintain the desired number of Pods, replacing failed Pods automatically.
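For example, a Deployment can be created and scaled imperatively, and Kubernetes maintains the declared replica count through its ReplicaSet; the deployment name and image are illustrative:
kubectl create deployment web --image=nginx:1.25
kubectl scale deployment web --replicas=3
kubectl get deployment web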
kubectl: Command-line utility for administering Kubernetes clusters, communicating with the kube-apiserver.
It transforms command-line entries into API calls.
kubectl must be configured with cluster credentials, which GKE provides through the gcloud command.
kubectl is used to administer the internal state of a cluster but cannot create or modify clusters.
Command structure: kubectl <command> <type> <name> <flags>.
Introspection: Gathering information about resources in a Kubernetes cluster to debug issues.
Key commands include:
kubectl get pods: Checks the status of Pods.
kubectl describe pod <pod-name>: Provides detailed information about a Pod.
kubectl exec <pod-name> -- <command>: Executes a command inside a container.
kubectl logs <pod-name>: Shows logs of a Pod's containers.
kubectl exec -it <pod-name> -- <shell>: Launches an interactive shell inside a container.