1. Overview

Apache Cassandra is an open-source, highly available, and scalable NoSQL distributed database. To achieve high availability, Cassandra replicates data across the nodes of a cluster.

In this tutorial, we'll learn how Cassandra lets us control the consistency of data while it replicates data for high availability.

2. Data Replication

Data replication refers to storing copies of each row on multiple nodes. The purpose of data replication is to ensure reliability and fault tolerance. Consequently, if any node fails for any reason, the replication strategy makes sure that the same data is available on other nodes.

The replication factor (RF) specifies how many nodes in the cluster store replicas of each row.

There are two available replication strategies:

The SimpleStrategy is used for a single data center and one rack topology. First, Cassandra uses partitioner logic to determine the node to place the row. Then it puts additional replicas on the next nodes clockwise in the ring.

The NetworkTopologyStrategy is generally used for multiple datacenters and multiple racks. Additionally, it allows you to specify a different replication factor for each data center. Within a data center, it allocates replicas to different racks to maximize availability.
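As a rough illustration of the clockwise placement that SimpleStrategy performs, here is a minimal Python sketch. The ring and node names are made up for illustration; a real cluster also deals with tokens, virtual nodes, and rack awareness, which this model ignores:

```python
# Simplified model of SimpleStrategy replica placement (illustration only).
# The partitioner picks the first replica node for a row; additional
# replicas go to the next nodes clockwise around the ring.
def simple_strategy_replicas(ring, first_index, rf):
    """ring: nodes ordered by token; assumes rf <= len(ring)."""
    return [ring[(first_index + i) % len(ring)] for i in range(rf)]

ring = ["node-A", "node-B", "node-C", "node-D"]
print(simple_strategy_replicas(ring, 2, 3))  # ['node-C', 'node-D', 'node-A']
```

Note how the placement wraps around the end of the ring: with RF = 3 and the first replica on node-C, the remaining replicas land on node-D and node-A.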

3. Consistency Level

Consistency indicates how recent and in-sync all replicas of a row of data are. With the replication of data across the distributed system, achieving data consistency is a very complicated task.

Cassandra prefers availability over consistency. It doesn’t optimize for consistency. Instead, it gives you the flexibility to tune the consistency depending on your use case. In most use cases, Cassandra relies on eventual consistency.

Let’s look at consistency level impact during the write and read of data.

4. Consistency Level (CL) on Write

For write operations, the consistency level specifies how many replica nodes must acknowledge back before the coordinator successfully reports back to the client. More importantly, the number of nodes that acknowledge (for a given consistency level) and the number of nodes storing replicas (for a given RF) are often different.

For example, with the consistency level ONE and RF = 3, even though only one replica node needs to acknowledge for a successful write operation, Cassandra asynchronously replicates the data to the two other nodes in the background.

Let’s look at some of the consistency level options available for the write operation to be successful.

The consistency level ONE means it needs acknowledgment from only one replica node. Since only one replica needs to acknowledge, the write operation is fastest in this case.

The consistency level QUORUM means it needs acknowledgment from a majority (more than half) of the replica nodes across all datacenters.

The consistency level LOCAL_QUORUM means it needs acknowledgment from a majority of the replica nodes within the same datacenter as the coordinator. Thus, it avoids the latency of inter-datacenter communication.
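A majority here works out to floor(RF / 2) + 1 replicas. A quick sketch of the arithmetic (plain Python, not a Cassandra API):

```python
# Quorum is a simple majority of the replicas: floor(RF / 2) + 1.
# For QUORUM across multiple datacenters, RF is the sum of the
# per-datacenter replication factors.
def quorum(replication_factor):
    return replication_factor // 2 + 1

print(quorum(3))  # 2 acknowledgments needed
print(quorum(5))  # 3
print(quorum(6))  # 4
```

So with RF = 3, a quorum write succeeds once two replicas acknowledge, and it still succeeds if one replica is down.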

The consistency level ALL means it needs acknowledgment from all the replica nodes. Since all replica nodes need to acknowledge, the write operation is the slowest in this case. Moreover, if one of the replica nodes is down during the write operation, the write fails, and availability suffers. Therefore, the best practice is not to use this option in production deployments.

We can configure the consistency level for each write query or at the global query level.

The diagram below shows a couple of examples of CL on write:

[Diagram: CassandraConsistency1 – examples of consistency levels on write]

5. Consistency Level (CL) on Read

For read operations, the consistency level specifies how many replica nodes must respond with the latest consistent data before the coordinator successfully sends the data back to the client.

Let’s look at some of the consistency level options available for the read operation where Cassandra successfully returns data.

The consistency level ONE means only one replica node needs to return the data. Data retrieval is fastest in this case.

The consistency level QUORUM means a majority of the replica nodes across all datacenters must respond before the coordinator returns the data to the client. With multiple data centers, the latency of inter-datacenter communication makes reads slower.

The consistency level LOCAL_QUORUM means a majority of the replica nodes within the same datacenter as the coordinator must respond before the coordinator returns the data to the client. Thus, it avoids the latency of inter-datacenter communication.

The consistency level ALL means all the replica nodes must respond before the coordinator returns the data to the client. Since all replica nodes need to respond, the read operation is the slowest in this case. Moreover, if one of the replica nodes is down during the read operation, the read fails, and availability suffers. The best practice is not to use this option in production deployments.

We can configure the consistency level for each read query or at the global query level.

The diagram below shows a couple of examples of CL on read:

[Diagram: CassandraConsistency2 – examples of consistency levels on read]

6. Strong Consistency

Strong consistency means every read returns the most recently written data, no matter how much time passes between the write and the subsequent read.

We saw in the earlier sections how we could specify desired consistency level (CL) for writes and reads.

Strong consistency can be achieved if W + R > RF, where W is the write CL replica count, R is the read CL replica count, and RF is the replication factor.

In this scenario, we get strong consistency because the set of replicas that acknowledged the write and the set of replicas consulted by the read must overlap in at least one node, so every client read fetches the most recently written data.
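The condition above is easy to check mechanically; here is a minimal Python sketch of the inequality (not a Cassandra API, just the arithmetic):

```python
# Strong consistency holds when the write and read replica sets must
# overlap: W + R > RF guarantees at least one common, up-to-date replica.
def is_strongly_consistent(w, r, rf):
    return w + r > rf

print(is_strongly_consistent(2, 2, 3))  # True:  QUORUM write + QUORUM read, RF = 3
print(is_strongly_consistent(1, 1, 3))  # False: ONE + ONE is only eventually consistent
print(is_strongly_consistent(3, 1, 3))  # True:  ALL write + ONE read, RF = 3
```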

Let’s look at a couple of examples of strong consistency levels:

6.1. Write CL = QUORUM and Read CL = QUORUM

If RF = 3, W = QUORUM or LOCAL_QUORUM, R = QUORUM or LOCAL_QUORUM, then W (2) + R (2) > RF (3)

In this case, the write operation makes sure at least two replicas have the latest data. The read operation then succeeds only if at least two replicas respond, so at least one of them is guaranteed to hold the latest write.

6.2. Write CL = ALL and Read CL = ONE

If RF = 3, W = ALL, R = ONE, then W (3) + R (1) > RF (3)

In this case, once the coordinator writes data to all the replicas, the write operation is successful. Then it’s enough to read the data from one of those replicas to make sure we read the latest written data.

But as we learned earlier, write CL of ALL is not fault-tolerant, and the availability suffers.

7. Conclusion

In this article, we looked at data replication in Cassandra. We also learned about the different consistency level options available on data write and read. Additionally, we looked at a couple of examples to achieve strong consistency.