Understanding etcd: The Distributed Key-Value Store

What is etcd?

etcd is a distributed, reliable key-value store originally developed by CoreOS (and now a graduated CNCF project) that is designed for shared configuration and service discovery. It is a central building block in distributed systems, best known as the backing store for Kubernetes, and it lets you store and retrieve data consistently across a cluster of machines.

Some key characteristics of etcd:

  1. Simple: a well-defined, user-facing API (gRPC)
  2. Secure: automatic TLS with optional client certificate authentication
  3. Fast: benchmarked at 10,000 writes/sec
  4. Reliable: properly distributed using the Raft consensus protocol

Internal Design

etcd is a distributed key-value store that uses the Raft consensus algorithm to coordinate a cluster of machines. This lets etcd maintain strong consistency even in the face of network partitions, as long as a majority of nodes remain reachable. Here’s a brief overview of its internal design:

  1. Key-value store: The primary data model is a key-value store. Keys are strings and the values can be arbitrary blobs of data.
  2. Watch and lease abstraction: The API supports primitives like watches (subscribe to changes to a key or range of keys) and leases (automatically expire keys after a certain period). These can be used to build more complex distributed-systems primitives; see the sketch after this list.
  3. Raft consensus algorithm: At its core, etcd uses the Raft consensus algorithm for managing a replicated log of commands across multiple servers. Raft ensures strong consistency and high availability.
  4. gRPC API: etcd uses gRPC for its API. gRPC uses HTTP/2 for transport, and Protocol Buffers as the interface definition language, which makes the API accessible from many languages.
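
To make the watch and lease primitives concrete, here is a minimal sketch using the jetcd library (the same library as the full example below). It assumes an etcd instance on localhost:2379; the key name config/feature-flag and the class name are illustrative only. The listener fires asynchronously whenever the key changes, and the lease causes etcd to delete the key on its own once the TTL expires.

import java.nio.charset.StandardCharsets;
import java.util.concurrent.CountDownLatch;

import io.etcd.jetcd.ByteSequence;
import io.etcd.jetcd.Client;
import io.etcd.jetcd.Watch;
import io.etcd.jetcd.options.PutOption;

public class WatchAndLeaseExample {
    public static void main(String[] args) throws Exception {
        ByteSequence key = ByteSequence.from("config/feature-flag", StandardCharsets.UTF_8);
        CountDownLatch seenEvent = new CountDownLatch(1);

        Client client = Client.builder().endpoints("http://127.0.0.1:2379").build();
        try {
            // Watch: subscribe to changes on the key; the listener is invoked
            // asynchronously for every PUT or DELETE event on it.
            Watch.Watcher watcher = client.getWatchClient().watch(key,
                Watch.listener(response -> {
                    response.getEvents().forEach(event ->
                        System.out.println(event.getEventType() + " on key "
                            + event.getKeyValue().getKey().toStringUtf8()));
                    seenEvent.countDown();
                }));

            // Lease: grant a 5-second lease and attach it to the key.
            // etcd deletes the key automatically when the lease expires.
            long leaseId = client.getLeaseClient().grant(5).get().getID();
            client.getKVClient().put(key,
                ByteSequence.from("on", StandardCharsets.UTF_8),
                PutOption.newBuilder().withLeaseId(leaseId).build()).get();

            seenEvent.await(); // block until the watcher observes the PUT
            watcher.close();
        } finally {
            client.close();
        }
    }
}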

Where can etcd be used in building software applications?

  1. Service Discovery: One of the common uses for etcd is storing the locations of service instances in a microservices architecture. Clients can use etcd to find the location of a service.
  2. Configuration Management: etcd is useful for storing a system’s configuration data. This configuration can be updated dynamically, and thanks to the watch API, changes propagate to clients in near real-time.
  3. Leader Election: etcd can be used to implement leader election among a group of nodes in a cluster.
  4. Distributed Locks: etcd can be used to implement distributed locks, which prevent concurrent access to shared resources; a minimal sketch follows this list.
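
As an illustration of the distributed-lock pattern, here is a minimal jetcd sketch (one way to do it, not the only one). It assumes an etcd instance on localhost:2379; the lock name locks/my-resource and the class name are arbitrary examples. The lock is tied to a lease so that it is released automatically if the holder crashes before calling unlock.

import java.nio.charset.StandardCharsets;

import io.etcd.jetcd.ByteSequence;
import io.etcd.jetcd.Client;
import io.etcd.jetcd.Lock;

public class DistributedLockExample {
    public static void main(String[] args) throws Exception {
        Client client = Client.builder().endpoints("http://127.0.0.1:2379").build();
        Lock lockClient = client.getLockClient();

        // Tie the lock to a lease so it is released even if this process
        // crashes; here the lease expires after 30 seconds. (For longer
        // critical sections you would keep the lease alive.)
        long leaseId = client.getLeaseClient().grant(30).get().getID();

        ByteSequence lockName = ByteSequence.from("locks/my-resource", StandardCharsets.UTF_8);

        // lock() resolves once the lock is acquired; the response carries
        // a unique key identifying this holder.
        ByteSequence lockKey = lockClient.lock(lockName, leaseId).get().getKey();
        System.out.println("Lock acquired");

        try {
            // ... operate on the shared resource ...
        } finally {
            lockClient.unlock(lockKey).get(); // release for the next waiter
            client.close();
        }
    }
}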

Code example in Java

Java clients can use the jetcd library (published on Maven Central as io.etcd:jetcd-core) to interact with etcd:

import io.etcd.jetcd.ByteSequence;
import io.etcd.jetcd.Client;
import io.etcd.jetcd.KV;
import io.etcd.jetcd.options.PutOption;

public class EtcdExample {
    public static void main(String[] args) {
        ByteSequence key = ByteSequence.from("test_key".getBytes());
        ByteSequence value = ByteSequence.from("test_value".getBytes());
        Client client = null;
        KV kvClient = null;

        try {
            // Connect to an etcd instance on localhost at the default client port 2379.
            client = Client.builder().endpoints("http://127.0.0.1:2379").build();
            kvClient = client.getKVClient();

            // Store the key-value pair; get() blocks until the write completes.
            kvClient.put(key, value, PutOption.DEFAULT).get();
            System.out.println("Put successfully!");

            // Read the value back and print it as a UTF-8 string.
            ByteSequence returnedValue = kvClient.get(key).get().getKvs().get(0).getValue();
            System.out.println("Get Value: " + returnedValue.toStringUtf8());

        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            // Always close the KV client and the underlying connection.
            if (kvClient != null) {
                kvClient.close();
            }
            if (client != null) {
                client.close();
            }
        }
    }
}

  • The client connects to an etcd instance running on localhost (127.0.0.1) at the default client port, 2379.
  • It then puts a key-value pair into etcd, retrieves the value, and prints it out.
  • If there’s an error (for example, if etcd isn’t running or the key doesn’t exist), the code prints a stack trace.
  • The finally block ensures the Client and KV instances are properly closed, even if an error occurs.

Alternative Tools

Other tools that can fill a similar role include:

  1. ZooKeeper: It’s also a centralized service for maintaining configuration information, naming, and providing distributed synchronization. However, ZooKeeper uses a consensus protocol called Zab for replication, which differs from the Raft protocol used by etcd. ZooKeeper also gained built-in security features such as TLS only later (in the 3.5 release line), whereas etcd supports them out of the box.
  2. Consul: It’s a distributed key-value store and has features for service discovery and health checking. It also uses the Raft consensus algorithm. Consul provides more features such as a DNS interface for service discovery and an extensive health checking system.
  3. Redis: It’s primarily used as an in-memory data structure store and supports various data structures such as strings, hashes, lists, and sets. Redis Cluster adds sharding and replication, but Redis does not provide the strong consistency guarantees that etcd does, so it is better suited to caching than to cluster coordination.

The choice comes down to specific features, ease of use, performance, and your particular use case. For example, etcd’s design, API simplicity, and watch primitive make it a good fit for many dynamic-configuration use cases. On the other hand, if you need a DNS interface or an extensive health-checking system, you might prefer Consul.