When HashiCorp introduced Consul in 2014, it asked a provocative question: why should service discovery, health checking, and coordination be separate tools?
ZooKeeper and etcd provide coordination primitives, but service discovery often requires additional layers — DNS servers, health check services, load balancers. Consul takes a different approach by providing an integrated platform that handles service discovery, health checking, key-value storage, and (with Consul Connect) a complete service mesh with mutual TLS and authorization.
This integration isn't just convenience — it's a fundamentally different architectural philosophy that trades simplicity for comprehensiveness.
By the end of this page, you will understand Consul's multi-datacenter architecture, its hybrid use of gossip (Serf) for membership and Raft for consensus, its first-class service discovery and health checking, the key-value store, and Consul Connect for service mesh. You'll also understand when Consul's integrated approach is the right choice.
Consul is a multi-cloud service networking platform that provides:

- Service discovery with built-in health checking
- A distributed key-value store for configuration
- A service mesh with mutual TLS and authorization (Consul Connect)
- Native multi-datacenter federation
Unlike ZooKeeper and etcd, which are focused specifically on coordination, Consul aims to be a complete service networking solution.
| Feature | Consul | etcd | ZooKeeper |
|---|---|---|---|
| Service Discovery | Native, first-class | DIY with leases/watches | DIY with ephemeral znodes |
| Health Checking | Built-in (HTTP, TCP, Script, gRPC) | DIY with leases | DIY with sessions |
| Key-Value Store | Yes (coordination focus) | Yes (coordination focus) | Yes (znodes) |
| Service Mesh | Yes (Consul Connect) | No | No |
| Multi-Datacenter | Native WAN federation | Not designed for | Limited |
| DNS Interface | Built-in | No | No |
| UI | Yes | No (third-party) | No (third-party) |
| Consensus Protocol | Raft | Raft | ZAB |
| Membership Protocol | Gossip (Serf) | Raft only | Session-based |
The Consul Architecture Philosophy
Consul's design centers on a key insight: service networking is a graph problem. Services need to discover each other, verify each other's health, authenticate each other, and authorize communication. By owning the entire graph, Consul can provide features that would otherwise require integrating multiple tools: health-aware DNS responses, discovery queries that fail over across datacenters, and service-to-service authorization backed by automatically managed certificates.
Consul's integrated approach shines when you need the full stack: discovery, health checking, configuration, and secure communication. If you only need a key-value store for configuration, etcd is simpler. If you're already running Kubernetes, Kubernetes-native solutions may be preferred. Consul excels in multi-cloud and VM-based environments.
Consul uses a hybrid architecture with two distinct protocols working together: gossip (via Serf) for membership and failure detection, and Raft for consensus over cluster state.
This separation allows Consul to scale to large cluster sizes while maintaining strong consistency where needed.
Server Nodes
Server nodes form the brain of a Consul datacenter:

- They participate in Raft consensus and elect a leader among themselves
- They store the authoritative cluster state: the service catalog, KV data, sessions, and ACLs
- They answer queries forwarded from client agents
- Production deployments typically run 3 or 5 servers per datacenter for fault tolerance
Client Nodes (Agents)
Client agents run on every node that runs services:

- They register local services and execute their health checks
- They forward reads and writes to the servers over RPC
- They participate in LAN gossip for membership and failure detection
- They serve the local DNS and HTTP interfaces
Two Gossip Pools
Consul maintains two separate gossip pools:
LAN Gossip: All agents in a datacenter (clients and servers) gossip with each other. Used for membership and local event broadcast. Fast convergence, optimized for low-latency networks.
WAN Gossip: Only servers participate, across datacenters. Used for cross-datacenter health checking and routing. Tuned for higher latency, lower bandwidth.
Raft requires all writes to go through a single leader. With thousands of nodes joining, leaving, and failing, Raft would become a bottleneck. Gossip distributes the membership load: each node only talks to a few neighbors, and information propagates exponentially. This is why Consul can scale to 10,000+ nodes per datacenter.
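The exponential-propagation claim is easy to sanity-check with a toy simulation. The Python sketch below assumes a simplified push-only epidemic; `gossip_rounds` and its `fanout` parameter are illustrative inventions, and Serf's actual SWIM-based protocol, which piggybacks membership updates on failure-detection probes, is considerably more refined:

```python
import math
import random

def gossip_rounds(n_nodes, fanout=3, seed=1):
    """Count rounds until every node has heard an update when each
    informed node pushes to `fanout` random peers per round."""
    rng = random.Random(seed)
    informed = {0}                      # node 0 originates the update
    rounds = 0
    while len(informed) < n_nodes:
        for _ in range(len(informed) * fanout):
            informed.add(rng.randrange(n_nodes))
        rounds += 1
    return rounds

# Rounds grow roughly with log(N), not linearly with N
for n in (100, 1_000, 10_000):
    print(n, "nodes ->", gossip_rounds(n), "rounds")
```

Because each round multiplies the informed set, going from 1,000 to 10,000 nodes adds only a couple of rounds, which is the property that lets gossip-based membership scale where leader-funneled Raft writes would not.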
Service discovery is Consul's core strength. Unlike etcd and ZooKeeper where you build service discovery on top of primitives, Consul provides it as a first-class feature with built-in health checking, DNS interface, and prepared queries.
Service Registration
Services register with their local Consul agent (not directly with servers). The agent handles:

- Forwarding the registration to the server catalog
- Executing the service's health checks locally
- Deregistering the service when the node leaves or a check stays critical too long
```json
{
  "service": {
    "name": "web-api",
    "id": "web-api-10.0.0.1:8080",
    "port": 8080,
    "address": "10.0.0.1",
    "tags": ["production", "v2"],
    "meta": {
      "version": "2.0.0",
      "team": "backend"
    },
    "check": {
      "http": "http://10.0.0.1:8080/health",
      "interval": "10s",
      "timeout": "5s",
      "deregister_critical_service_after": "1m"
    },
    "weights": {
      "passing": 10,
      "warning": 1
    }
  }
}
```

Health Check Types
Consul supports multiple health check types:

- HTTP: periodic GET against an endpoint; a 2xx response is passing, 429 is warning, anything else is critical
- TCP: periodic connection attempt against a host and port
- Script: an external command whose exit code determines the status
- gRPC: uses the standard gRPC health checking protocol
- TTL: the service itself must check in before the TTL expires
Health checks run locally on the agent, minimizing network overhead and enabling sub-second detection.
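To make the agent-side check lifecycle concrete, here is a small Python simulation of status transitions and the `deregister_critical_service_after` behavior. `run_checks` and its list of simulated probe outcomes are hypothetical stand-ins; a real agent performs actual HTTP/TCP probes on the configured interval:

```python
def run_checks(probe_results, deregister_after=6):
    """Simulate agent-side health checking over successive intervals.
    `probe_results` holds one simulated probe outcome per interval.
    With a 10s interval, 6 consecutive critical intervals corresponds
    to deregister_critical_service_after = "1m" from the example.
    Returns a list of (status, still_registered) per interval."""
    critical_streak, registered = 0, True
    history = []
    for ok in probe_results:
        status = "passing" if ok else "critical"
        critical_streak = 0 if ok else critical_streak + 1
        if critical_streak >= deregister_after:
            registered = False   # agent removes the service instance
        history.append((status, registered))
    return history

# A service that passes twice, then fails and never recovers
h = run_checks([True, True] + [False] * 7, deregister_after=6)
```

Note that the status flips to critical on the very first failed probe, so discovery queries with `?passing=true` stop returning the instance long before deregistration removes it from the catalog entirely.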
```shell
# DNS-based discovery (simple, works with any application)
$ dig @127.0.0.1 -p 8600 web-api.service.consul
;; ANSWER SECTION:
web-api.service.consul. 0 IN A 10.0.0.1
web-api.service.consul. 0 IN A 10.0.0.2
web-api.service.consul. 0 IN A 10.0.0.3

# SRV records include port information
$ dig @127.0.0.1 -p 8600 web-api.service.consul SRV
;; ANSWER SECTION:
web-api.service.consul. 0 IN SRV 1 1 8080 10.0.0.1.node.dc1.consul.
web-api.service.consul. 0 IN SRV 1 1 8080 10.0.0.2.node.dc1.consul.

# Filter by tag
$ dig @127.0.0.1 -p 8600 v2.web-api.service.consul

# Cross-datacenter query
$ dig @127.0.0.1 -p 8600 web-api.service.dc2.consul

# HTTP API - more details including health
$ curl http://localhost:8500/v1/health/service/web-api?passing=true
[
  {
    "Node": {"Node": "node1", "Address": "10.0.0.1"},
    "Service": {"Service": "web-api", "Port": 8080},
    "Checks": [
      {"Status": "passing", "Output": "HTTP GET http://...: 200 OK"}
    ]
  }
]

# Catalog query (no health filtering, just registered services)
$ curl http://localhost:8500/v1/catalog/service/web-api
```

Consul's DNS interface is uniquely powerful: any application that can resolve DNS can use Consul for service discovery without code changes. Point your DNS resolver at Consul's port 8600 (or use dnsmasq to forward `.consul` queries), and `web-api.service.consul` just works.
Consul's key-value store provides the same coordination capabilities as etcd and ZooKeeper: configuration storage, distributed locks, leader election, and semaphores. While not as feature-rich as dedicated coordination services, it's sufficient for many use cases and benefits from Consul's operational simplicity.
```shell
# Put a key
$ consul kv put config/database/connection-string "postgres://..."
Success! Data written to: config/database/connection-string

# Get a key
$ consul kv get config/database/connection-string
postgres://...

# Get with metadata
$ consul kv get -detailed config/database/connection-string
CreateIndex      42
Flags            0
Key              config/database/connection-string
LockIndex        0
ModifyIndex      42
Session          (none)
Value            postgres://...

# List keys with prefix
$ consul kv get -recurse config/
config/database/connection-string:postgres://...
config/database/pool-size:10
config/features/dark-mode:true

# Delete
$ consul kv delete config/old-key

# Delete prefix (recursive)
$ consul kv delete -recurse config/deprecated/

# Export/Import for backup
$ consul kv export config/ > config-backup.json
$ consul kv import @config-backup.json
```

Check-And-Set (CAS)
Consul supports optimistic locking via Check-And-Set operations. The ModifyIndex field serves as a version number:
```shell
# Get current value and ModifyIndex
$ curl -s http://localhost:8500/v1/kv/config/counter | jq
[{
  "Key": "config/counter",
  "Value": "MTAw",        # base64 of "100"
  "ModifyIndex": 42
}]

# CAS: only update if ModifyIndex still matches
$ curl -X PUT -d "101" "http://localhost:8500/v1/kv/config/counter?cas=42"
true  # Success - value was updated

# If someone else modified it first:
$ curl -X PUT -d "101" "http://localhost:8500/v1/kv/config/counter?cas=42"
false # Failed - ModifyIndex changed, retry needed
```

Sessions and Locks
Consul provides a session abstraction for implementing ephemeral keys and distributed locks. Sessions have configurable behaviors on failure:

- release (the default): locks held by the session are released, but the keys remain
- delete: keys held by the session are deleted when the session is invalidated
Sessions are tied to a node and have TTL-based expiration.
```shell
# Create a session with 30s TTL
$ curl -X PUT -d '{"Name": "my-session", "TTL": "30s", "Behavior": "release"}' http://localhost:8500/v1/session/create
{"ID": "adf4238a-882b-9ddc-4a9d-5b6758e4159e"}

# Acquire a lock using the session
$ curl -X PUT -d "my-lock-value" "http://localhost:8500/v1/kv/locks/myresource?acquire=adf4238a-882b-9ddc-4a9d-5b6758e4159e"
true  # Lock acquired

# Check who holds the lock
$ consul kv get -detailed locks/myresource
...
Session          adf4238a-882b-9ddc-4a9d-5b6758e4159e
...

# Another session trying to acquire will fail
$ curl -X PUT -d "other-value" "http://localhost:8500/v1/kv/locks/myresource?acquire=other-session-id"
false # Already locked

# Release the lock
$ curl -X PUT "http://localhost:8500/v1/kv/locks/myresource?release=adf4238a-882b-9ddc-4a9d-5b6758e4159e"
true

# Renew session before TTL expires (keep-alive)
$ curl -X PUT "http://localhost:8500/v1/session/renew/adf4238a-882b-9ddc-4a9d-5b6758e4159e"
```

Consul's KV is simpler than etcd's transaction system and ZooKeeper's watch model. It lacks multi-key transactions and has less sophisticated watch semantics. For complex coordination, dedicated stores may be better. Consul KV excels when you need simple configuration storage alongside Consul's service discovery.
Consul Connect is Consul's service mesh feature, providing:

- Automatic mutual TLS between services, with certificate issuance and rotation
- Intention-based authorization controlling which services may communicate
- Sidecar proxy integration (typically Envoy) so applications need no changes
This is where Consul differentiates most from etcd and ZooKeeper — they're coordination services, while Consul is a complete service networking solution.
How Connect Works
Connect can operate in two modes:
Sidecar Proxy: Each service gets a proxy (typically Envoy) that handles mTLS and traffic. This is transparent to the application.
Native Integration: Applications use Consul's Connect SDK to handle mTLS directly. Lower overhead but requires code changes.
```hcl
# Service mesh intentions - who can talk to whom

# Allow web-frontend to access web-api
Kind = "service-intentions"
Name = "web-api"
Sources = [
  {
    Name   = "web-frontend"
    Action = "allow"
  },
  {
    Name   = "monitoring"
    Action = "allow"
  },
  # Deny all other services (default deny)
  {
    Name   = "*"
    Action = "deny"
  }
]

# More complex: L7 routing based on HTTP paths
Kind = "service-intentions"
Name = "payment-api"
Sources = [
  {
    Name = "order-service"
    Permissions = [
      {
        Action = "allow"
        HTTP {
          PathPrefix = "/v2/"
          Methods    = ["GET", "POST"]
        }
      },
      {
        Action = "deny"
        HTTP {
          PathPrefix = "/admin/"
        }
      }
    ]
  }
]
```

Consul Connect, Istio, and Linkerd are all service mesh solutions, but Consul is uniquely designed for multi-cloud and VM environments. Istio and Linkerd are Kubernetes-focused. Consul's hybrid approach (VMs + Kubernetes + cloud) and its integration with HashiCorp tools (Vault, Terraform, Nomad) make it attractive for heterogeneous environments.
One of Consul's most distinctive features is first-class multi-datacenter support. While etcd and ZooKeeper clusters are typically confined to a single datacenter, Consul is designed from the ground up for global deployments.
How Federation Works
```shell
# Query service in a specific datacenter
$ dig @127.0.0.1 -p 8600 web-api.service.us-west-2.consul
$ dig @127.0.0.1 -p 8600 web-api.service.eu-central-1.consul

# Prepared query with failover
$ curl -X POST -d '{
  "Name": "web-api-failover",
  "Service": {
    "Service": "web-api",
    "OnlyPassing": true,
    "Failover": {
      "NearestN": 2,
      "Datacenters": ["us-west-2", "us-east-1", "eu-central-1"]
    }
  }
}' http://localhost:8500/v1/query

# Query by prepared query name
$ dig @127.0.0.1 -p 8600 web-api-failover.query.consul

# Lists all datacenters
$ consul catalog datacenters
dc1
dc2
us-west-2
eu-central-1

# KV data is NOT automatically replicated across datacenters
# For cross-DC KV, manually replicate or use consul-replicate tool
```

Unlike services, Consul's KV store is NOT automatically replicated across datacenters. Each datacenter has independent KV data. For cross-DC configuration, use tools like consul-replicate or store configuration in each datacenter separately. This is intentional: cross-DC Raft would introduce unacceptable latency.
| Feature | Cross-DC Behavior |
|---|---|
| Service Discovery | Yes - can query services in any federated DC |
| Health Checks | Yes - WAN gossip propagates health status |
| Prepared Queries | Yes - failover and nearest routing work cross-DC |
| KV Store | No automatic replication - per-DC data |
| ACL Tokens | Replicated from primary DC (ACL replication) |
| Intentions | Replicated across DCs for mesh connectivity |
| Raft Consensus | Independent per-DC - no cross-DC Raft |
Mesh Gateways
For Connect traffic across datacenters, Consul uses mesh gateways:

- Gateways sit at the edge of each datacenter and route traffic between them, so datacenters don't need flat network connectivity to every service
- Traffic stays encrypted end to end: gateways route by TLS SNI without decrypting the payload
Consul's comprehensive feature set comes with operational considerations. Here's what you need to know for production deployments.
```hcl
# Production server configuration example
datacenter = "us-west-2"
data_dir   = "/opt/consul/data"
log_level  = "INFO"

server           = true
bootstrap_expect = 3

# Bind addresses
bind_addr   = "10.0.0.1"
client_addr = "0.0.0.0"

# TLS configuration
tls {
  defaults {
    verify_incoming = true
    verify_outgoing = true
    ca_file         = "/etc/consul.d/certs/consul-ca.pem"
    cert_file       = "/etc/consul.d/certs/consul-server.pem"
    key_file        = "/etc/consul.d/certs/consul-server-key.pem"
  }
}

# Gossip encryption
encrypt = "base64-encoded-32-byte-key"

# ACL configuration
acl {
  enabled                  = true
  default_policy           = "deny"
  enable_token_persistence = true
}

# Performance tuning
performance {
  raft_multiplier = 1  # Lower = faster, but more sensitive to slowness
}

# Autopilot for automatic server management
autopilot {
  cleanup_dead_servers      = true
  last_contact_threshold    = "200ms"
  max_trailing_logs         = 250
  server_stabilization_time = "10s"
}
```

Consul's Autopilot feature automatically manages server lifecycle: dead server cleanup, server health monitoring, and stable leader election. It's especially useful for auto-scaling environments where servers come and go.
We've covered Consul's integrated approach to service networking. Let's consolidate the essential knowledge:

- Consul combines gossip (Serf) for scalable membership with Raft for strongly consistent state
- Service discovery is first-class, with built-in health checks and a DNS interface that works without code changes
- The KV store supports CAS, sessions, and locks, but is simpler than etcd's transactions and ZooKeeper's watches
- Consul Connect adds mutual TLS and intention-based authorization as a full service mesh
- Multi-datacenter federation is native for services and health, but KV data stays per-datacenter
What's Next:
In the next page, we'll systematically compare ZooKeeper, etcd, and Consul to help you choose the right coordination service for your specific needs. You'll learn the key criteria for making this architectural decision.
You now understand Consul's integrated approach to service networking, its hybrid architecture with gossip and Raft, and when its comprehensive feature set is the right choice. This foundation prepares you to make informed comparisons between coordination services.