Distributed Database Simulator

\ud83e\udd14 What Is a Distributed Database?

A distributed database spreads data across multiple servers (nodes) in different locations. The CAP theorem says you can only guarantee two of three properties: Consistency (every read sees the latest write), Availability (every request gets a response), and Partition Tolerance (the system works despite network splits).

Why does this matter? Every major internet service from Google to Netflix relies on distributed databases. Understanding the tradeoffs between consistency and availability is essential for building reliable, scalable systems that serve billions of users.

📖 Deep Dive

Analogy 1

Imagine a library system with branches in 5 cities, all sharing the same catalog. When someone checks out a book at Branch A, all other branches need to update their records. If the phone line to Branch C goes down (a network partition), Branch C might still show the book as available — that is a consistency problem. You could lock all lending until Branch C reconnects (choosing consistency), or let each branch keep lending independently and reconcile later (choosing availability).

Analogy 2

Think of a group chat with friends across different countries. When you send a message, everyone should see it in the same order. But if someone's internet drops, they miss messages and might reply to outdated information — creating conflicting conversation threads. Distributed databases face this exact problem at massive scale, and use techniques like vector clocks and CRDTs to sort out the chaos when everyone reconnects.

🎯 Simulator Tips

Beginner

Start with 5 nodes and Replication Factor 3 — this means each piece of data is stored on 3 of the 5 nodes

Intermediate

Compare Write Consistency ALL vs ONE — ALL is slower but guarantees every replica has the latest data

Expert

Disable Vector Clocks and trigger concurrent writes during a partition to observe unresolved conflicts

📚 Glossary

CAP Theorem

Brewer's theorem stating that a distributed system can provide at most two of three guarantees: Consistency, Availability, and Partition Tolerance. During a network partition, you must choose between C and A.

Replication Factor

The number of copies of each piece of data maintained across the cluster. A replication factor of 3 means every write is stored on 3 different nodes.

Quorum

A majority of replicas that must acknowledge a read or write for the operation to succeed. For RF=3, quorum is 2. Quorum reads combined with quorum writes guarantee consistency.

Consistency Level

Determines how many replicas must respond before a read or write is considered successful. ONE is fastest but may return stale data; ALL is slowest but always consistent.

Network Partition

A failure in network communication that splits the cluster into two or more groups of nodes that cannot communicate with each other, forcing a choice between consistency and availability.

Split-Brain

A dangerous condition where a network partition causes two groups of nodes to independently accept conflicting writes, believing themselves to be the authoritative cluster.

Vector Clock

A data structure that tracks causal ordering of events across distributed nodes. Each node maintains a logical counter, and comparing vector clocks reveals whether events are causally related or concurrent.

CRDT

Conflict-free Replicated Data Type — a data structure designed so that concurrent updates on different replicas always converge to the same state without coordination, using mathematical properties like commutativity.

Last-Write-Wins (LWW)

A simple conflict resolution strategy where the write with the latest timestamp wins. Easy to implement but can silently discard concurrent updates.

Gossip Protocol

A peer-to-peer communication pattern where each node periodically exchanges state information with a random peer, eventually propagating updates to all nodes — inspired by how rumors spread.

Anti-Entropy

A background repair process where nodes periodically compare their data with peers and synchronize differences, ensuring eventual consistency even after failures.

Merkle Tree

A hash tree data structure used to efficiently detect differences between replica data. Nodes compare root hashes first, then drill down to find and repair only the specific keys that differ.

Tombstone

A marker indicating that data has been deleted. Tombstones must be retained for a TTL period to prevent deleted data from reappearing when a partitioned node with the old data rejoins the cluster.

Eventual Consistency

A consistency model guaranteeing that if no new updates are made, all replicas will eventually converge to the same value. The propagation delay is the time until convergence.

Write-Ahead Log (WAL)

A durability technique where changes are written to a sequential log before being applied to the database, enabling crash recovery by replaying the log.

🏆 Key Figures

Eric Brewer (2000)

Formulated the CAP Theorem at UC Berkeley, fundamentally shaping distributed database design by proving the impossibility of simultaneously guaranteeing consistency, availability, and partition tolerance

Leslie Lamport (1998)

Created the Paxos consensus algorithm and pioneered the theory of distributed systems with concepts like logical clocks, Lamport timestamps, and the Byzantine Generals Problem

Diego Ongaro (2014)

Designed the Raft consensus algorithm at Stanford, making distributed consensus understandable and practical for real-world systems like etcd and CockroachDB

Werner Vogels (2007)

Led the design of Amazon's Dynamo, which introduced consistent hashing, vector clocks, and sloppy quorums — inspiring Cassandra, Riak, and the entire NoSQL movement

Jeff Dean & Wilson Hsieh (2012)

Co-designed Google Spanner, the first globally distributed database with externally consistent transactions using TrueTime API and GPS/atomic clock synchronization

Marc Shapiro (2011)

Pioneered Conflict-free Replicated Data Types (CRDTs) at INRIA, providing mathematical foundations for coordination-free eventual consistency

🎓 Learning Resources

Brewer's Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services [paper]
The formal proof of the CAP theorem, establishing the fundamental impossibility result for distributed systems (ACM SIGACT News, 2002)
Dynamo: Amazon's Highly Available Key-value Store [paper]
Amazon's influential paper on eventually consistent distributed storage using consistent hashing, vector clocks, and sloppy quorum (SOSP 2007)
Spanner: Google's Globally Distributed Database [paper]
Google's TrueTime-based globally consistent database achieving external consistency through GPS and atomic clock synchronization (OSDI 2012)
In Search of an Understandable Consensus Algorithm (Raft) [paper]
The Raft consensus algorithm designed for understandability, now used in etcd, CockroachDB, and TiKV (USENIX ATC 2014)
Designing Data-Intensive Applications [article]
Martin Kleppmann's comprehensive guide to distributed systems fundamentals, replication, partitioning, and consistency models
Jepsen.io [article]
Kyle Kingsbury's rigorous testing and analysis of distributed database correctness claims under real failure conditions
The Raft Consensus Algorithm Visualization [article]
Interactive visualization of the Raft consensus algorithm with step-by-step leader election and log replication

💬 Message to Learners

Distributed databases are the backbone of the modern internet. Every time you post on social media, stream a video, or make an online purchase, dozens of distributed database nodes are coordinating behind the scenes. The CAP theorem teaches us that perfection is impossible in a world where networks fail — and that engineering is about making intelligent tradeoffs. The concepts you explore here — replication, consensus, conflict resolution — are the same challenges that engineers at Google, Amazon, and Netflix solve every day. Understanding these fundamentals will give you a deep appreciation for how the digital world actually works, and perhaps inspire you to build the next generation of resilient data systems.

Get Started

Free, no signup required

Get Started →