\ud83e\udd14 What Is a Distributed Database?
A distributed database spreads data across multiple servers (nodes) in different locations. The CAP theorem says you can only guarantee two of three properties: Consistency (every read sees the latest write), Availability (every request gets a response), and Partition Tolerance (the system works despite network splits).
Why does this matter? Every major internet service from Google to Netflix relies on distributed databases. Understanding the tradeoffs between consistency and availability is essential for building reliable, scalable systems that serve billions of users.
📖 Deep Dive
Analogy 1
Imagine a library system with branches in 5 cities, all sharing the same catalog. When someone checks out a book at Branch A, all other branches need to update their records. If the phone line to Branch C goes down (a network partition), Branch C might still show the book as available — that is a consistency problem. You could lock all lending until Branch C reconnects (choosing consistency), or let each branch keep lending independently and reconcile later (choosing availability).
Analogy 2
Think of a group chat with friends across different countries. When you send a message, everyone should see it in the same order. But if someone's internet drops, they miss messages and might reply to outdated information — creating conflicting conversation threads. Distributed databases face this exact problem at massive scale, and use techniques like vector clocks and CRDTs to sort out the chaos when everyone reconnects.
🎯 Simulator Tips
Beginner
Start with 5 nodes and Replication Factor 3 — this means each piece of data is stored on 3 of the 5 nodes
Intermediate
Compare Write Consistency ALL vs ONE — ALL is slower but guarantees every replica has the latest data
Expert
Disable Vector Clocks and trigger concurrent writes during a partition to observe unresolved conflicts
📚 Glossary
🏆 Key Figures
Eric Brewer (2000)
Formulated the CAP Theorem at UC Berkeley, fundamentally shaping distributed database design by proving the impossibility of simultaneously guaranteeing consistency, availability, and partition tolerance
Leslie Lamport (1998)
Created the Paxos consensus algorithm and pioneered the theory of distributed systems with concepts like logical clocks, Lamport timestamps, and the Byzantine Generals Problem
Diego Ongaro (2014)
Designed the Raft consensus algorithm at Stanford, making distributed consensus understandable and practical for real-world systems like etcd and CockroachDB
Werner Vogels (2007)
Led the design of Amazon's Dynamo, which introduced consistent hashing, vector clocks, and sloppy quorums — inspiring Cassandra, Riak, and the entire NoSQL movement
Jeff Dean & Wilson Hsieh (2012)
Co-designed Google Spanner, the first globally distributed database with externally consistent transactions using TrueTime API and GPS/atomic clock synchronization
Marc Shapiro (2011)
Pioneered Conflict-free Replicated Data Types (CRDTs) at INRIA, providing mathematical foundations for coordination-free eventual consistency
🎓 Learning Resources
- Brewer's Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services [paper]
The formal proof of the CAP theorem, establishing the fundamental impossibility result for distributed systems (ACM SIGACT News, 2002) - Dynamo: Amazon's Highly Available Key-value Store [paper]
Amazon's influential paper on eventually consistent distributed storage using consistent hashing, vector clocks, and sloppy quorum (SOSP 2007) - Spanner: Google's Globally Distributed Database [paper]
Google's TrueTime-based globally consistent database achieving external consistency through GPS and atomic clock synchronization (OSDI 2012) - In Search of an Understandable Consensus Algorithm (Raft) [paper]
The Raft consensus algorithm designed for understandability, now used in etcd, CockroachDB, and TiKV (USENIX ATC 2014) - Designing Data-Intensive Applications [article]
Martin Kleppmann's comprehensive guide to distributed systems fundamentals, replication, partitioning, and consistency models - Jepsen.io [article]
Kyle Kingsbury's rigorous testing and analysis of distributed database correctness claims under real failure conditions - The Raft Consensus Algorithm Visualization [article]
Interactive visualization of the Raft consensus algorithm with step-by-step leader election and log replication