1. Introduction

In the complex world of distributed systems, ensuring efficient data management is crucial. Distributed reliable key-value stores play a pivotal role in maintaining data consistency and scalability across distributed environments.

In this comprehensive tutorial, we’ll delve into etcd, an open-source distributed key-value store. We’ll explore its fundamental concepts, features, and use cases, and provide a hands-on quickstart guide. Finally, we’ll compare etcd with a couple of other distributed key-value stores to understand its strengths and unique offerings.

2. What Are Distributed Key-Value Stores?

Distributed key-value stores are a type of NoSQL database that stores data as key-value pairs distributed across multiple physical or virtual machines.

This distribution enhances scalability, fault tolerance, and performance. Each piece of data (value) is associated with a unique identifier (key). This model is highly efficient for certain use cases, such as caching, configuration management, and fast data retrieval.

Apache ZooKeeper, Consul, and Redis are some examples of systems that provide a reliable key-value store.

Distributed key-value stores serve as the backbone of many distributed systems, providing a simple yet powerful mechanism for storing and retrieving data.

Below are some key aspects of distributed key-value stores:

  • Simplicity: Basic data structure comprising key-value pairs, making it easy to understand and use for specific types of applications.
  • Scalability: These systems can efficiently handle growing amounts of data and increased load by distributing the workload across multiple nodes.
  • Reliability: Replication across nodes provides fault tolerance and keeps data consistent, so the system remains available even when individual nodes fail.
  • Performance: The key-value mechanism provides fast and efficient access to data. Moreover, distributing data across multiple nodes reduces the load on individual machines.
  • Distribution: Spreading data across multiple nodes enhances both resilience and throughput.

Distributed key-value stores find applications in various scenarios, such as configuration management, caching, session storage, service discovery, leader election, etc.

3. What Is etcd?

etcd is a distributed, reliable key-value store for the most critical data of a distributed system. It’s simple, secure, and fast, and it’s designed for configuration management, service discovery, and coordination of distributed systems.

Developed by the CoreOS team and now a CNCF (Cloud Native Computing Foundation) project, etcd provides a reliable and distributed data store that enables the coordination of configurations and the discovery of services in dynamic and scalable environments.

etcd is developed in Go and internally uses the Raft consensus algorithm to manage a highly available replicated log.

Many companies worldwide, such as Baidu, Huawei, Salesforce, and Ticketmaster, use etcd in production. It’s frequently integrated with applications such as Kubernetes, Locksmith, Vulcand, and Doorman.

etcd’s rich feature set makes it a versatile and reliable choice for distributed systems, providing the essential building blocks for configuration management, service discovery, and coordination in cloud-native environments. Its distributed consistency, high availability, and strong data integrity make it a foundational component of modern, scalable, and resilient applications.

4. Features of etcd

Beyond reliability and consistency, etcd also performs well: in certain situations, it can achieve around 10,000 writes per second.

Let’s understand some of its key features:

  • HTTP/gRPC API: etcd provides both HTTP and gRPC APIs, making it accessible and interoperable with various programming languages and easily integrated into different types of applications and frameworks.
  • Distributed Consistency: It maintains strong consistency in distributed setups, ensuring that all nodes in the cluster have a consistent view of the data.
  • High Availability: etcd is designed to be highly available, with automatic leader election and failover mechanisms. Thus, an etcd cluster remains operational even in the face of node failures, contributing to system resilience.
  • Watch Support: etcd supports strongly consistent watches, allowing applications to monitor changes to specific keys or key ranges in real time.
  • Atomic Transactions: It supports atomic transactions, allowing us to group multiple key-value operations and execute them as a single atomic unit, thus maintaining data consistency (see the etcdctl sketch after this list).
  • Lease Management: etcd introduces the concept of leases, allowing keys to have associated time-to-live (TTL) values so that they’re deleted automatically after the specified period (also shown after this list).
  • Role-Based Access Control (RBAC): It supports RBAC, allowing administrators to define roles and permissions for users and applications interacting with the cluster.
  • Snapshot and Backup: It provides mechanisms for creating snapshots of the cluster’s state and supports backup and restoration processes. Thus, it ensures disaster recovery and data durability.
  • Storage Backend: etcd persists its keyspace in an embedded key-value store (bbolt) and exposes tunables such as storage quotas and compaction, allowing optimization based on specific use cases and performance considerations.
  • Integration with Kubernetes: etcd is a critical component in Kubernetes, serving as the primary datastore for configuration and state information. This makes etcd a core part of container orchestration, ensuring that the distributed systems can manage configurations and scale effectively.
  • etcdctl: It’s a command-line client tool designed for interacting with and managing an etcd cluster.
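
To make the transaction and lease features concrete, here’s a minimal etcdctl sketch, assuming a running cluster; the key names are illustrative, and the lease ID will differ on each run:

$ etcdctl txn <<'EOF'
value("mykey") = "Hello, etcd!"

put mykey "updated atomically"

put mykey "fallback value"

EOF

Here, etcd evaluates the compare on the first line: if mykey currently holds “Hello, etcd!”, it executes the first put; otherwise, it executes the second, all in a single atomic step. Similarly, we can grant a lease and attach it to a key:

$ etcdctl lease grant 60
lease 694d77aa9e0abf44 granted with TTL(60s)
$ etcdctl put ephemeral-key "expires soon" --lease=694d77aa9e0abf44

Once the 60-second TTL elapses, etcd deletes ephemeral-key automatically, unless we refresh the lease with etcdctl lease keep-alive.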

5. Installation

Let’s understand how to configure and set up etcd to get it running. etcd runs on Linux distributions such as Ubuntu and CentOS, as well as on Windows.

We can start by updating the package list on Ubuntu:

$ sudo apt update

Subsequently, we can install etcd:

$ sudo apt install etcd

Similarly, on CentOS, we first need to enable the EPEL repository and then install etcd:

$ sudo yum install epel-release
$ sudo yum install etcd

Alternatively, we can visit the official etcd GitHub releases page to download the latest prebuilt release. Otherwise, we can clone the repo and build from source:

$ git clone -b v3.5.11 https://github.com/etcd-io/etcd.git

For cloning the latest version, we can omit the -b v3.5.11 flag.

Next, we can navigate to the cloned etcd directory and run the build script:

$ cd etcd
$ ./build.sh

This places the binaries under the bin directory. If we downloaded a release archive instead, we can extract it to get prebuilt binaries directly:

$ tar xvf etcd-v3.5.11-linux-amd64.tar.gz

In either case, we then need to add the full path to the directory containing the binaries to our PATH (for a source build, that’s the bin directory):

$ export PATH="$PATH:`pwd`/bin"

Here, pwd is the UNIX command that gets us the full path name of the current directory. Finally, we can ensure that our PATH contains etcd by checking the version:

$ etcd --version

6. Configuration Using the Config File

We have multiple options to configure etcd. However, in this tutorial, we’ll create a configuration file with basic settings.

The etcd configuration file is a YAML file that contains settings and parameters used to configure the behavior of an etcd node. This file is essential for customizing various aspects of etcd, such as network settings, cluster information, authentication, and storage options. Let’s see an example:

# Example etcd-config.yml
# Node name, a unique identifier, in the etcd cluster
name: node-1

# Data directory where etcd will store its data
data-dir: /var/lib/etcd/default.etcd

# Listen addresses for client communication
listen-client-urls: http://127.0.0.1:2379,http://<NODE-IP>:2379

# Advertise addresses for client communication
advertise-client-urls: http://<NODE-IP>:2379

# Listen addresses for peer communication
listen-peer-urls: http://<NODE-IP>:2380

# Advertise addresses for peer communication
initial-advertise-peer-urls: http://<NODE-IP>:2380

# Initial cluster configuration
initial-cluster: node-1=http://<NODE-IP>:2380,node-2=http://<NODE-2-IP>:2380

# Unique token for the etcd cluster
initial-cluster-token: etcd-cluster-1

# Initial cluster state (new or existing)
initial-cluster-state: new

# Token implementation used for authentication (simple or jwt)
auth-token: simple

# Enable automatic compaction of the etcd key-value store
auto-compaction-mode: periodic
auto-compaction-retention: "1h"

# Secure communication settings (TLS)
client-transport-security:
  cert-file: /etc/etcd/server.crt
  key-file: /etc/etcd/server.key
  client-cert-auth: true
  trusted-ca-file: /etc/etcd/ca.crt

peer-transport-security:
  cert-file: /etc/etcd/peer.crt
  key-file: /etc/etcd/peer.key
  client-cert-auth: true
  trusted-ca-file: /etc/etcd/ca.crt

Let’s understand a few important notes about this configuration:

Adding TLS Certificates: Secure communication settings (client-transport-security and peer-transport-security) are optional but recommended for production deployments, providing encrypted communication.

Adding RBAC: Role-Based Access Control adds a layer of security by controlling access to etcd operations based on user roles and permissions. Note that the config file alone doesn’t switch RBAC on; we enable it at runtime with etcdctl.
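
Here’s a minimal sketch, assuming a reachable cluster and using the standard etcdctl auth commands; the role and user names are illustrative:

$ etcdctl user add root
$ etcdctl auth enable
$ etcdctl role add read-config
$ etcdctl role grant-permission read-config read /config/ --prefix
$ etcdctl user add app-user
$ etcdctl user grant-role app-user read-config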

Enabling auto-compaction: Helps manage the size of the etcd data store; with periodic mode and a 1h retention, etcd periodically compacts away key revisions older than one hour.

Finally, we should ensure that we customize the configuration file based on our specific requirements and security considerations. After editing the file, we can restart the etcd service for the changes to take effect.
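
For instance, if etcd was installed as a system package (as in the Ubuntu and CentOS steps above) and runs under systemd, restarting it typically looks like this:

$ sudo systemctl restart etcd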

7. Starting and Interacting With etcd

We can start etcd with the specified configuration using the following command:

$ ./etcd --config-file=etcd-config.yml

Further, we can interact with etcd using etcdctl, the command-line client designed for managing an etcd cluster. It lets administrators and developers execute various operations on the cluster directly from the command line.

Let’s understand with a few examples:

We can set a key-value pair as:

$ etcdctl put mykey "Hello, etcd!"

Here, mykey is the key, and “Hello, etcd!” is the corresponding value. Subsequently, we can retrieve the value of mykey as:

$ etcdctl get mykey
mykey
Hello, etcd!

To watch changes to mykey, we can simply do:

$ etcdctl watch mykey

Watching a key in etcd allows us to receive real-time notifications about changes to the key, whether the value is modified or the key is deleted. Watch events provide details about the nature of the change, enabling applications to react dynamically to the updates in the etcd key-value store.

It’s important to note that watching a key doesn’t prevent it from being deleted. Watches are mechanisms for observing changes, not for controlling or restricting them.
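
For instance, if we update the key from a second terminal:

$ etcdctl put mykey "Hello again!"

then the watching terminal prints the event type, the key, and the new value; the output below sketches the etcdctl v3 format:

PUT
mykey
Hello again!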

Finally, we can use the following command to check the health of the etcd cluster:

$ etcdctl endpoint health

If we’re working with a secured etcd cluster, we may need to provide additional authentication and security options, such as the --cacert, --cert, and --key flags, to point to the certificate and key files while checking the health.
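
For example, here’s a minimal sketch that assumes the CA from our earlier configuration; the client certificate file names are illustrative:

$ etcdctl --endpoints=https://<NODE-IP>:2379 \
  --cacert=/etc/etcd/ca.crt \
  --cert=/etc/etcd/client.crt \
  --key=/etc/etcd/client.key \
  endpoint health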

8. Code Example

To interact with etcd using Java, we can use a Java client library like jetcd or etcd4j. In our example, we’ll use jetcd since it’s the official Java client for etcd v3.

jetcd is built on Java 11. It supports all key-based etcd requests, offers SSL/TLS security, and allows us to define multiple connection URLs. Moreover, it provides both synchronous and asynchronous APIs, giving us the flexibility to choose the programming model that best fits our application.

We can add the jetcd-core dependency to our project as:

<dependency>
    <groupId>io.etcd</groupId>
    <artifactId>jetcd-core</artifactId>
    <version>0.7.7</version>
</dependency>

Now, let’s see a basic example demonstrating the put, retrieve, and delete operations using jetcd:

import io.etcd.jetcd.ByteSequence;
import io.etcd.jetcd.Client;
import io.etcd.jetcd.KV;
import io.etcd.jetcd.kv.GetResponse;

import java.nio.charset.StandardCharsets;
import java.util.concurrent.CompletableFuture;

public class JetcdExample {
    public static void main(String[] args) {
        String etcdEndpoint = "http://localhost:2379";
        ByteSequence key = ByteSequence.from("/mykey", StandardCharsets.UTF_8);
        ByteSequence value = ByteSequence.from("Hello, etcd!", StandardCharsets.UTF_8);

        try (Client client = Client.builder().endpoints(etcdEndpoint).build()) {
            KV kvClient = client.getKVClient();

            // Put a key-value pair; get() blocks until the operation completes
            kvClient.put(key, value).get();

            // Retrieve the value using the asynchronous API (CompletableFuture)
            CompletableFuture<GetResponse> getFuture = kvClient.get(key);
            GetResponse response = getFuture.get();
            response.getKvs().forEach(kv ->
                System.out.println(kv.getValue().toString(StandardCharsets.UTF_8)));

            // Delete the key
            kvClient.delete(key).get();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

9. Comparison With Apache ZooKeeper and Consul

As distributed system tools, etcd, Apache ZooKeeper, and Consul are designed to manage configurations, coordinate services, and provide a reliable foundation for building distributed applications. However, they differ significantly in their design philosophies, architecture, and use cases:

| Feature/Aspect | etcd | Apache ZooKeeper | Consul |
|----------------|------|------------------|--------|
| Consensus Algorithm | Raft | Zab (ZooKeeper Atomic Broadcast) | Raft |
| Data Model | Key-value store | Hierarchy of ZNodes | Key-value store |
| Use Cases | Cloud-native, Kubernetes | Various distributed systems | Service discovery, networking |
| Consistency Model | Strong consistency | Strong consistency | Consistent and eventually consistent modes |
| Security Features | TLS support, AuthN, and AuthZ | Limited built-in security | ACLs, TLS, token-based access |
| Leadership Election | Inherent in Raft; nodes participate in elections for leader selection | Centralized election through the Zab protocol; nodes elect a leader that coordinates operations | Raft-based; each Consul server participates in the consensus algorithm for leader election |
| Leader Characteristics | The leader holds authority for making decisions and coordinating the cluster | The leader manages the distributed system’s state and coordinates actions | The leader is responsible for cluster coordination and decision-making |
| Performance | Generally good | Good, used in large deployments | High-performance, scalable |
| Integration with Ecosystem | Integrates with CNCF projects | Integrated with Apache projects | Integrates with HashiCorp stack |
| Monitoring & Observability | etcd metrics, Prometheus support | Limited built-in monitoring | Integrated metrics, Prometheus |
| Configuration Management | Configuration API | Used for configuration in Hadoop, Kafka, etc. | Dynamic configuration management |
| Service Discovery | Limited | Used as part of distributed systems | Core feature, DNS-based discovery |
| Commercial Support | Limited | Commercial support available | Enterprise and open-source offerings |
| Ease of Use | Known for simplicity | Can be more complex | Easy to use and configure |
| License | Apache License 2.0 | Apache License 2.0 | MPL 2.0 |

Choosing between etcd, Apache ZooKeeper, and Consul depends on specific project needs.

etcd, with its simplicity and Cloud Native Computing Foundation (CNCF) support, suits cloud-native environments like Kubernetes. Apache ZooKeeper, a robust choice for large-scale deployments, offers strong consistency but comes with added complexity. On the other hand, Consul, known for simplicity and effective service discovery, integrates seamlessly with the HashiCorp stack.

Security, ease of use, and integration requirements play pivotal roles in the decision-making process. Each tool has its strengths, so we should make an informed selection based on the desired features and use cases.

10. Conclusion

In this article, we’ve explored etcd comprehensively, discussing its foundational concepts, critical features, and practical applications. The quickstart guide helps us set up etcd and interact with it programmatically. Additionally, the comparison with other distributed key-value stores highlights etcd’s unique strengths, making it a reliable choice for various distributed system scenarios.

Understanding distributed reliable key-value stores, the criticality of data in distributed systems, and the capabilities of etcd will help us make informed decisions when designing and implementing distributed applications. Finally, as the backbone of many distributed systems, etcd’s simplicity, consistency, and high availability make it a valuable tool for developers navigating the complexities of distributed environments.

As always, the source code accompanying the article is available over on GitHub.