• Single-machine DB apps combine three functions: a user interface, application logic in RAM, and permanent storage
• Distributed DBs keep the same 3 functions but run each on its own cluster of machines, called a tier
• n-tier (n > 2) model dominates; common 3-tier names: client / application / database
• Interfaces between tiers decouple components → easier scaling & heterogeneous hardware (tier sketch after the list below)
• Key gains:
– Scalability: add machines per tier for more users, data, or workloads
– Modularity: hardware & software can be upgraded independently
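A minimal sketch of that decoupling, with hypothetical class and method names: each tier knows only the interface of the tier below it, so any tier can be scaled or swapped independently.

```python
# Minimal 3-tier sketch: each tier talks only to the one below it through
# a narrow interface. All names here are illustrative, not a real API.

class DatabaseTier:
    """Permanent storage (stand-in for a real DB cluster)."""
    def __init__(self):
        self._rows = {}
    def get(self, key):
        return self._rows.get(key)
    def put(self, key, value):
        self._rows[key] = value

class ApplicationTier:
    """RAM app logic; knows the DB only through its get/put interface."""
    def __init__(self, db: DatabaseTier):
        self.db = db
    def register_user(self, user_id, name):
        self.db.put(user_id, {"name": name})
    def greeting(self, user_id):
        user = self.db.get(user_id)
        return f"Hello, {user['name']}!" if user else "Unknown user"

class ClientTier:
    """User interface; knows only the application tier's API."""
    def __init__(self, app: ApplicationTier):
        self.app = app
    def run(self):
        self.app.register_user(1, "Ada")
        print(self.app.greeting(1))

ClientTier(ApplicationTier(DatabaseTier())).run()  # -> Hello, Ada!
```

Because the client never touches storage directly, the database tier could be moved to different hardware, or replicated, without changing client code.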
• Oracle TimesTen: relational, n-tier; pairs disk-based Oracle Database with in-RAM TimesTen instances
• Typical cache goal: 90%+ of requests served from RAM
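– Back-of-envelope, assuming illustrative latencies of ~100 ns (RAM) and ~10 ms (disk): E[latency] = 0.9 × 100 ns + 0.1 × 10 ms ≈ 1 ms, roughly a 10× win over serving everything from disk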
• Ops flow (cache sketch below):
– Read ⇒ check RAM → disk only if miss
– Write ⇒ modify RAM, log, later flush to disk
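A minimal sketch of this read-through / write-behind flow, using plain dicts as stand-ins for TimesTen (RAM) and Oracle Database (disk); all names are hypothetical.

```python
# Reads check RAM first and fall through to disk on a miss; writes go to
# RAM plus a log, and are flushed to disk later.

class CachedStore:
    def __init__(self, disk):
        self.disk = disk      # dict standing in for the disk-based DB
        self.ram = {}         # in-memory cache (stand-in for TimesTen)
        self.log = []         # log of pending changes

    def read(self, key):
        if key in self.ram:               # hit: served entirely from RAM
            return self.ram[key]
        value = self.disk.get(key)        # miss: one disk access
        if value is not None:
            self.ram[key] = value         # populate cache for next time
        return value

    def write(self, key, value):
        self.ram[key] = value             # modify the RAM copy
        self.log.append((key, value))     # record the change in the log

    def flush(self):
        for key, value in self.log:       # later, apply the log to disk
            self.disk[key] = value
        self.log.clear()

store = CachedStore(disk={"a": 1})
store.write("b", 2)
assert store.read("a") == 1 and store.read("b") == 2
store.flush()                             # disk now holds both keys
```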
• MongoDB: leading NoSQL document DB (~45% market share) using BSON (binary JSON)
• Strengths: flexible schema, JavaScript-friendly, rich tooling/community (schema sketch after the trade-off list)
• Ideal for: web content, mixed media records, rapid dev
• Trade-offs:
– 16 MB max document size
– Eventually consistent reads by default (weaker than strict ACID)
– No native joins; higher RAM/disk usage
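A small pymongo sketch of the flexible-schema point, assuming a local mongod on the default port; the database and collection names are made up for illustration.

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
articles = client.demo_db.articles

# Two documents in the same collection with different shapes: no schema
# migration needed, which is the "rapid dev" appeal noted above.
articles.insert_one({"title": "Hello", "body": "text only"})
articles.insert_one({"title": "Clip", "media": {"type": "video", "secs": 12}})

# Query by a nested field that only some documents have.
print(articles.find_one({"media.type": "video"}))
```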
• Copies across nodes may diverge temporarily
• System ensures they converge over time; suits likes, posts, and cached pages where millisecond precision isn’t critical (toy convergence sketch below)
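A toy illustration of divergence and convergence using last-write-wins merging, one common (but not the only) convergence rule:

```python
# Two replicas accept writes independently, temporarily diverge, then
# converge by keeping the value with the newest timestamp.

class Replica:
    def __init__(self):
        self.data = {}                      # key -> (timestamp, value)

    def write(self, key, value, ts):
        self.data[key] = (ts, value)

    def read(self, key):
        return self.data.get(key, (None, None))[1]

    def merge(self, other):
        for key, (ts, value) in other.data.items():
            if key not in self.data or self.data[key][0] < ts:
                self.data[key] = (ts, value)

a, b = Replica(), Replica()
a.write("likes:post42", 10, ts=1)
b.write("likes:post42", 11, ts=2)           # concurrent update elsewhere
print(a.read("likes:post42"), b.read("likes:post42"))  # 10 11 (diverged)
a.merge(b); b.merge(a)                       # anti-entropy exchange
print(a.read("likes:post42"), b.read("likes:post42"))  # 11 11 (converged)
```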
• Cassandra: built for Facebook’s inbox load of high-volume small reads/writes on semi-structured data
• Wide-column store: each row key maps to a sorted set of columns, queryable without scanning the full record (data-model sketch below)
• Open-sourced by Facebook, now an Apache Software Foundation project; used by Netflix, Twitter, Instagram
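A toy sketch of the wide-column data model, not Cassandra’s actual storage engine: a row key maps to sorted columns, so a name-range slice avoids scanning the whole record.

```python
import bisect

class WideRow:
    """One row: a sorted set of columns under a single row key."""
    def __init__(self):
        self.names = []          # sorted column names
        self.values = {}         # column name -> value

    def insert(self, name, value):
        if name not in self.values:
            bisect.insort(self.names, name)
        self.values[name] = value

    def slice(self, start, end):
        """Columns with start <= name < end, found by binary search."""
        lo = bisect.bisect_left(self.names, start)
        hi = bisect.bisect_left(self.names, end)
        return {n: self.values[n] for n in self.names[lo:hi]}

inbox = WideRow()                # one row = one user's inbox
inbox.insert("msg:2023-01-05", "hi")
inbox.insert("msg:2023-02-11", "lunch?")
inbox.insert("msg:2023-02-14", "<3")
print(inbox.slice("msg:2023-02-01", "msg:2023-03-01"))  # February only
```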
• Facebook Haystack: photo-storage overhaul (2009)
• Two DBs:
– Disk-based key–value for images (append-only)
– RAM store for compact metadata (id embeds location & visibility)
• Removed 2 of 3 disk seeks per photo read → ~3× capacity, lower CDN spend, faster delivery (append-only sketch below)
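A minimal sketch of the append-only design, with an assumed file layout: photos are appended to one large file, and a compact in-RAM index maps id → (offset, size), leaving a single seek per read.

```python
import os

class PhotoStore:
    def __init__(self, path):
        self.f = open(path, "a+b")           # one big append-only file
        self.index = {}                      # photo_id -> (offset, size), in RAM

    def put(self, photo_id, data: bytes):
        self.f.seek(0, os.SEEK_END)
        offset = self.f.tell()
        self.f.write(data)                   # append-only: never overwrite
        self.f.flush()
        self.index[photo_id] = (offset, len(data))

    def get(self, photo_id) -> bytes:
        offset, size = self.index[photo_id]  # RAM lookup, no disk metadata
        self.f.seek(offset)                  # the one remaining disk seek
        return self.f.read(size)

store = PhotoStore("haystack.dat")
store.put(42, b"\x89PNG...")
assert store.get(42) == b"\x89PNG..."
```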
• Vitess (YouTube): layer atop MySQL to manage a massive, high-traffic workload
• Functions:
– Query parsing: detect overlapping queries and serve them from cache
– Caching of video rows & lookup lists
– Sharding + VReplication: horizontal/vertical splits; per-shard schema tweaks without global redesign (routing sketch below)
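A sketch of the horizontal-sharding step only (Vitess itself also parses, caches, and replicates); the shard names and hash scheme here are illustrative assumptions.

```python
import hashlib

SHARDS = ["mysql-shard-0", "mysql-shard-1", "mysql-shard-2"]  # hypothetical

def shard_for(user_id: int) -> str:
    """Hash the sharding key so each key lands on exactly one shard."""
    digest = hashlib.md5(str(user_id).encode()).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

def route(sql: str, user_id: int):
    """Send the query to the single backend that owns this key."""
    backend = shard_for(user_id)
    print(f"sending to {backend}: {sql}")

route("SELECT * FROM videos WHERE owner_id = 123", user_id=123)
```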
• Google Spanner: globally distributed SQL+NoSQL hybrid; one deployment holds multiple DBs
• Data partitioned into splits; each split keeps ≥ 3 replicas across zones for durability
• Paxos consensus: leader handles writes; a commit succeeds once a majority of replicas acknowledge (quorum sketch below)
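A simplified sketch of the quorum arithmetic only, not full Paxos: the leader counts acknowledgements and commits once a majority responds.

```python
import random

def replicate(write, replicas=3):
    """Report success once a majority (here 2 of 3) of replicas ack."""
    acks = sum(1 for _ in range(replicas)
               if random.random() > 0.2)     # each replica acks w.p. 0.8
    majority = replicas // 2 + 1
    committed = acks >= majority
    print(f"{write!r}: {acks}/{replicas} acks -> "
          f"{'committed' if committed else 'retry'}")
    return committed

replicate("UPDATE balance SET ...")
```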
• Serves hot data from Spanner nodes; persists everything to the Colossus filesystem (HDD+SSD mix)
• TrueTime API: hardware (atomic/GPS clocks) bounds clock uncertainty → ordered, consistent transactions (commit-wait sketch below)
• Modular: Spanner manages indexing & consistency; upper layers choose relational/NoSQL models, SQL variants, policies
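A sketch of TrueTime-style commit wait under an assumed uncertainty bound: the clock returns an interval [earliest, latest] guaranteed to contain true time, and a transaction waits until its chosen timestamp is provably in the past before acknowledging, which yields a global order.

```python
import time

EPSILON = 0.003                           # assumed clock uncertainty, ~3 ms

def tt_now():
    """Return an interval [earliest, latest] bounding true time."""
    t = time.time()
    return (t - EPSILON, t + EPSILON)

def commit(txn):
    _, latest = tt_now()
    commit_ts = latest                    # timestamp beyond current uncertainty
    while tt_now()[0] < commit_ts:        # commit wait: ts must be in the past
        time.sleep(EPSILON / 2)
    print(f"{txn} committed at {commit_ts:.6f}")

commit("T1")
```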