System Design

Data Consistency

Understanding consistency models and techniques for managing data in distributed systems.

Consistency in Distributed Systems

In distributed systems, data is replicated across multiple nodes for availability and fault tolerance, with replica placement commonly determined by _consistent hashing_. This replication introduces the challenge of keeping all replicas consistent with one another.

Quorum Consensus

Quorum consensus is a technique used to guarantee consistency for both read and write operations in distributed systems with replicated data.

  • N = the number of replicas.
  • W = the write quorum size. For a write operation to be considered successful, it must be acknowledged by at least W replicas.
  • R = the read quorum size. For a read operation to be considered successful, it must wait for responses from at least R replicas.

Consistency vs. Latency Tradeoff

The configuration of W, R, and N represents a tradeoff between latency and consistency:

  • Low values (W=1 or R=1): Operations return quickly because the coordinator only needs to wait for a response from a single replica.
  • Higher values (W>1 or R>1): The system offers better consistency, but operations are slower because the coordinator must wait for responses from multiple replicas, so latency is dominated by the slowest replica in the quorum.

If W + R > N, strong consistency is guaranteed: every read quorum must overlap every write quorum in at least one replica, so at least one of the replicas contacted by a read holds the latest write.
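To see why the overlap argument holds, here is a minimal brute-force sketch (the function name and example configurations are purely illustrative) that checks every possible read quorum against every possible write quorum:

```python
import itertools

def overlap_always_exists(n: int, w: int, r: int) -> bool:
    """Check, by brute force over all quorum choices, whether every
    read quorum of size r intersects every write quorum of size w
    among n replicas (the condition behind W + R > N)."""
    replicas = range(n)
    for write_quorum in itertools.combinations(replicas, w):
        for read_quorum in itertools.combinations(replicas, r):
            # The read only sees the latest value if it contacts at
            # least one replica that acknowledged the write.
            if not set(write_quorum) & set(read_quorum):
                return False
    return True

if __name__ == "__main__":
    for n, w, r in [(3, 2, 2), (3, 1, 1), (3, 3, 1), (3, 1, 3), (5, 2, 2)]:
        print(f"N={n}, W={w}, R={r}: W+R>N is {w + r > n}, "
              f"overlap always exists: {overlap_always_exists(n, w, r)}")
```

For these configurations, the brute-force check confirms that an overlap is guaranteed exactly when W + R > N.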

Common Quorum Configurations

Configuration                     Optimized For        Consistency
R = 1, W = N                      Fast reads           Strong (W + R > N)
W = 1, R = N                      Fast writes          Strong (W + R > N)
W + R > N (e.g., N=3, W=R=2)      Balanced             Strong
W + R ≤ N                         High availability    Weak (eventual)

Depending on system requirements, you can tune these values to achieve the desired balance between consistency and performance.

Quorum Configuration Demo

Experiment with different values of N, W, and R to understand how they affect consistency and performance. Adjust the values in the sketch below to see the impact on the system's behavior.
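The interactive widget is not reproduced here, so the following sketch approximates its behavior using the rules from the table above; the function name and labels are illustrative assumptions rather than the demo's actual implementation:

```python
def classify_quorum(n: int, w: int, r: int) -> dict:
    """Classify an (N, W, R) configuration the way the table above does.
    The labels and thresholds are illustrative, not a standard API."""
    if not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("W and R must be between 1 and N")

    if r == 1 and w == n:
        optimized_for = "Fast reads"
    elif w == 1 and r == n:
        optimized_for = "Fast writes"
    elif w + r > n:
        optimized_for = "Balanced"
    else:
        optimized_for = "High availability"

    consistency = "Strong" if w + r > n else "Weak (eventual)"
    return {"optimized_for": optimized_for, "consistency": consistency}

if __name__ == "__main__":
    print(classify_quorum(3, 2, 2))  # {'optimized_for': 'Balanced', 'consistency': 'Strong'}
```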


Strong consistency guaranteed: with N=3, W=2, and R=2, there is at least one overlapping node with the most recent write.

Remember: When W + R > N, the system guarantees strong consistency because there must be at least one node that participates in both write and read operations.

Consistency Models

A consistency model defines the degree of data consistency in a distributed system. Different models offer varying guarantees about when and how updates become visible.

Strong Consistency

Any read operation returns the result of the most recent write operation. A client never sees out-of-date data.

High consistency, lower availability

Weak Consistency

Subsequent read operations may not see the most updated value. No guarantees on when updates will be visible.

Higher availability, lower consistency

Eventual Consistency

A specific form of weak consistency. Given enough time, all updates are propagated, and all replicas eventually become consistent.

Balance of availability and consistency

Real-world Example

Many distributed systems like Amazon's Dynamo and Apache Cassandra adopt eventual consistency, which allows the system to remain highly available even in the presence of network partitions, at the cost of potentially serving stale data for a period of time.

Inconsistency Resolution: Versioning

In distributed systems with eventual consistency, conflicts can arise when multiple clients update the same data concurrently. Versioning and vector clocks are techniques used to detect and resolve these conflicts.

Vector Clocks

A vector clock is a list of [server, version] pairs associated with a data item. It can be used to track the history of updates and determine whether one version precedes, succeeds, or conflicts with another.

A vector clock is represented as D([S1, v1], [S2, v2], …, [Sn, vn]), where:

  • D is a data item
  • S1, S2, ..., Sn are server identifiers
  • v1, v2, ..., vn are version counters

When data item D is written to server Si, the system updates the clock as follows (see the sketch after this list):

  • Increments vi if [Si, vi] exists in the vector clock
  • Otherwise, creates a new entry [Si, 1]
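A minimal sketch of this update rule, assuming the clock is stored as a plain dictionary from server ID to version counter (the function name and representation are illustrative, not any specific system's API):

```python
def record_write(clock: dict[str, int], server_id: str) -> dict[str, int]:
    """Return a new vector clock reflecting a write handled by server_id:
    increment the server's counter if present, otherwise start it at 1."""
    updated = dict(clock)
    updated[server_id] = updated.get(server_id, 0) + 1
    return updated

# Example: D starts empty, then is written via s0, s0 again, and s1.
clock = {}
for server in ["s0", "s0", "s1"]:
    clock = record_write(clock, server)
print(clock)  # {'s0': 2, 's1': 1}, i.e. D([s0, 2], [s1, 1])
```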

Conflict Detection

Using vector clocks, we can determine relationships between versions:

No Conflict (Ancestor)

Version X is an ancestor of version Y if all counters in X are less than or equal to their corresponding counters in Y.

D([s0, 1], [s1, 1]) is an ancestor of D([s0, 1], [s1, 2])

Conflict (Sibling)

Versions X and Y conflict if each has at least one counter greater than the corresponding counter in the other.

D([s0, 1], [s1, 2]) conflicts with D([s0, 2], [s1, 1])
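Both checks can be expressed directly over the dictionary representation used above. Here is a minimal comparison sketch (function names are illustrative) that reproduces the two examples:

```python
def is_ancestor(x: dict[str, int], y: dict[str, int]) -> bool:
    """True if every counter in x is <= the corresponding counter in y
    (missing entries are treated as 0), i.e. x precedes y."""
    return all(v <= y.get(server, 0) for server, v in x.items())

def relationship(x: dict[str, int], y: dict[str, int]) -> str:
    if is_ancestor(x, y):
        return "x is an ancestor of y (no conflict)"
    if is_ancestor(y, x):
        return "y is an ancestor of x (no conflict)"
    return "x and y conflict (siblings)"

# D([s0, 1], [s1, 1]) vs D([s0, 1], [s1, 2]) -> ancestor, no conflict
print(relationship({"s0": 1, "s1": 1}, {"s0": 1, "s1": 2}))
# D([s0, 1], [s1, 2]) vs D([s0, 2], [s1, 1]) -> conflict
print(relationship({"s0": 1, "s1": 2}, {"s0": 2, "s1": 1}))
```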

Limitations of Vector Clocks

Despite their usefulness, vector clocks have some drawbacks:

  • Client Complexity: The client needs to implement conflict resolution logic, making the client code more complex.
  • Size Growth: The [server, version] pairs in the vector clock could grow unbounded as more servers participate. To mitigate this, systems often set a threshold for the length and remove the oldest pairs if it exceeds the limit.

While the truncation approach can lead to some inefficiencies in reconciliation (as the complete history is lost), in practice, as noted in the Amazon Dynamo paper, this has not been a significant issue in production systems.
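As an illustration only, here is one way such a threshold could be applied, assuming each entry also remembers when it was last updated (similar to the per-node timestamps described in the Dynamo paper) so the oldest pairs can be dropped:

```python
import time

CLOCK_SIZE_THRESHOLD = 10  # illustrative limit on the number of [server, version] pairs

def record_write_with_truncation(clock: dict, server_id: str) -> dict:
    """Vector-clock update that also remembers when each entry last changed,
    so the oldest entries can be dropped once the clock grows too large.
    Entries are stored as server_id -> (version, last_updated_timestamp)."""
    updated = dict(clock)
    version, _ = updated.get(server_id, (0, 0.0))
    updated[server_id] = (version + 1, time.time())

    if len(updated) > CLOCK_SIZE_THRESHOLD:
        # Drop the least recently updated pairs; reconciliation may become
        # less precise, but the clock stays bounded.
        oldest_first = sorted(updated.items(), key=lambda item: item[1][1])
        for server, _ in oldest_first[: len(updated) - CLOCK_SIZE_THRESHOLD]:
            del updated[server]
    return updated
```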

Conflict Scenario with Vector Clocks

Interactive demo (not reproduced here): Server 1 and Server 2 apply concurrent writes to the same data item, the status panel reports whether the resulting vector clocks conflict, and changed values are highlighted in red.

Key Takeaways

  • Quorum Consensus: Use W and R values to balance between consistency and latency. When W + R > N, strong consistency is guaranteed.
  • Consistency Models: Strong consistency offers the most accurate data but can impact availability. Eventual consistency favors availability but may temporarily serve stale data.
  • Vector Clocks: A mechanism to track the history of updates in a distributed system and detect conflicts between concurrent updates.
  • Conflict Resolution: In eventual consistency systems, conflicts need to be detected and resolved, often requiring application-specific logic.
  • CAP Theorem Trade-offs: Distributed systems must balance Consistency, Availability, and Partition tolerance. Data consistency mechanisms represent one aspect of these fundamental trade-offs.