How Peer Registry and Consul Service Discovery Ensure Node Availability in mpcium

The mpcium node relies on a centralized Consul KV store for service discovery and on an in-process Peer Registry that aggregates peer readiness and ECDH key-exchange completion, so that multi-party computation proceeds only when all participants are available.

mpcium is an open-source multi-party computation (MPC) framework that requires strict coordination across distributed nodes. To manage node availability and discovery, the codebase implements a two-tier architecture combining Consul service discovery for persistent state storage and a Peer Registry for real-time local state management. This design ensures that cryptographic protocols begin only after all peers have registered their readiness and completed secure handshakes.

Core Architecture Components

Consul KV Store as the Source of Truth

All peer identity and readiness information persists in HashiCorp Consul under specific key prefixes. The mpc_peers/ prefix stores the static list of participating nodes, while the mpc_ready/ prefix tracks dynamic readiness flags. This centralized approach allows nodes to discover each other without hardcoded configuration and provides a consistent view of cluster state across process restarts.

According to the mpcium source code in pkg/config/peers.go, the function LoadPeersFromConsul performs the initial discovery by listing all KV pairs under the mpc_peers/ prefix and constructing a slice of config.Peer structs containing node IDs and names.
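The discovery step can be sketched without a running Consul agent. The block below is a minimal illustration, not the project's code: kvPair and Peer are local stand-ins for api.KVPair and config.Peer, and the layout "key = prefix + node name, value = node ID" is an assumption that the article does not confirm.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// kvPair is a local stand-in for a Consul KV entry (key plus raw value).
type kvPair struct {
	Key   string
	Value []byte
}

// Peer mirrors the idea of config.Peer: a node ID and a human-readable name.
type Peer struct {
	ID   string
	Name string
}

const peersPrefix = "mpc_peers/"

// peersFromPairs converts listed KV entries into peers.
// Assumed layout (hypothetical): key = prefix + name, value = node ID.
func peersFromPairs(pairs []kvPair) []Peer {
	peers := make([]Peer, 0, len(pairs))
	for _, p := range pairs {
		name := strings.TrimPrefix(p.Key, peersPrefix)
		if name == "" || name == p.Key {
			continue // skip the bare prefix entry and keys outside the prefix
		}
		peers = append(peers, Peer{ID: string(p.Value), Name: name})
	}
	// Sort by name so every node derives the same ordering.
	sort.Slice(peers, func(i, j int) bool { return peers[i].Name < peers[j].Name })
	return peers
}

func main() {
	pairs := []kvPair{
		{Key: "mpc_peers/node1", Value: []byte("id-1")},
		{Key: "mpc_peers/node0", Value: []byte("id-0")},
	}
	for _, p := range peersFromPairs(pairs) {
		fmt.Println(p.Name, p.ID)
	}
}
```

Sorting is a design choice of this sketch: it gives every node the same peer ordering regardless of the order in which the KV store returns entries.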

Peer Registry for Local State Aggregation

While Consul provides persistence, the PeerRegistry struct in pkg/mpc/registry.go maintains the runtime state necessary for protocol execution. It tracks which peers have reported ready status, monitors ECDH key-exchange completion, and exposes thread-safe APIs for the rest of the system to query availability.

The registry is instantiated via NewRegistry in pkg/mpc/registry.go (lines 61-88), which accepts the local node ID, the list of peer IDs, a Consul KV interface, messaging channels, and an identity store. This constructor also initializes an ECDHSession to track cryptographic handshakes.

Node Availability Workflow

1. Loading Peers from Consul at Startup

When a node starts, the CLI entry point in cmd/mpcium/main.go (lines 475-483) creates a Consul client and loads the peer list before initializing the registry.

// cmd/mpcium/main.go
consulClient := infra.GetConsulClient(environment)
peers := config.LoadPeersFromConsul(consulClient)

The LoadPeersFromConsul function queries the KV store and returns the complete set of participants that the node should expect to join the computation.

2. Registering Node Readiness

After completing pre-parameter generation, a node marks itself as ready by calling registry.Ready(). This method writes a readiness record to Consul using the key pattern defined by ComposeReadyKey(nodeID), storing the node ID and ECDH exchange status.

// After cryptographic setup completes
if err := registry.Ready(); err != nil {
    logger.Fatal("cannot mark node ready", err)
}

The implementation in pkg/mpc/registry.go (lines 169-187) uses the ConsulKV.Put method to persist this state, making it visible to all other participants.
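As a rough sketch of this write path, the block below composes a readiness key and stores a small record through a Put-only interface. The names kvPutter, memKV, readyRecord, and markReady are hypothetical, the JSON payload shape is an assumption (the article does not show the stored value format), and the in-memory map stands in for Consul.

```go
package main

import (
	"encoding/json"
	"fmt"
)

const readyPrefix = "mpc_ready/"

// readyRecord is a hypothetical payload; the real value format is not shown here.
type readyRecord struct {
	NodeID    string `json:"node_id"`
	ECDHReady bool   `json:"ecdh_ready"`
}

// kvPutter is a reduced stand-in for the Put side of a Consul KV client.
type kvPutter interface {
	Put(key string, value []byte) error
}

// memKV keeps entries in a map so the sketch runs without a Consul agent.
type memKV map[string][]byte

func (m memKV) Put(key string, value []byte) error {
	m[key] = value
	return nil
}

// composeReadyKey builds the per-node readiness key under the assumed prefix.
func composeReadyKey(nodeID string) string {
	return readyPrefix + nodeID
}

// markReady serializes the record and writes it under the node's readiness key.
func markReady(kv kvPutter, nodeID string, ecdhReady bool) error {
	payload, err := json.Marshal(readyRecord{NodeID: nodeID, ECDHReady: ecdhReady})
	if err != nil {
		return fmt.Errorf("encode ready record: %w", err)
	}
	if err := kv.Put(composeReadyKey(nodeID), payload); err != nil {
		return fmt.Errorf("put ready key: %w", err)
	}
	return nil
}

func main() {
	kv := memKV{}
	if err := markReady(kv, "node0", false); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(kv[composeReadyKey("node0")]))
}
```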

3. Watching for Peer Readiness

The registry spawns a long-running goroutine via WatchPeersReady() that performs blocking kv.List operations on the mpc_ready/ prefix. As shown in pkg/mpc/registry.go (lines 207-230), this watch loop picks up changes as soon as the blocking query returns and extracts ready peer IDs using getReadyPeersFromKVStore.

When the watch detects updates, it calls registerReadyPairs (lines 112-124) to update the internal readyMap (a map[string]bool tracking peerID → ready status) and increments readyCount.
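The shape of such a watch loop can be sketched with a scripted stand-in for the blocking list call. Everything here is illustrative: listFunc, watchReady, scriptedList, and runScript are hypothetical names, and the index handling follows the general pattern of Consul blocking queries (pass the last seen index, act only when it changes) rather than the project's exact code.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"strings"
)

const readyPrefix = "mpc_ready/"

// listFunc stands in for a blocking KV list: it receives the last index the
// caller saw and returns the current keys plus the index they were read at.
type listFunc func(ctx context.Context, waitIndex uint64) ([]string, uint64, error)

// watchReady calls onChange with the ready peer IDs every time the index moves.
func watchReady(ctx context.Context, list listFunc, onChange func([]string)) error {
	var lastIndex uint64
	for {
		if err := ctx.Err(); err != nil {
			return err
		}
		keys, index, err := list(ctx, lastIndex)
		if err != nil {
			return err
		}
		if index == lastIndex {
			continue // the blocking call timed out without a change
		}
		if index < lastIndex {
			index = 0 // index went backwards, so restart from scratch
		}
		lastIndex = index
		ids := make([]string, 0, len(keys))
		for _, k := range keys {
			if id := strings.TrimPrefix(k, readyPrefix); id != "" && id != k {
				ids = append(ids, id)
			}
		}
		onChange(ids)
	}
}

var errScriptDone = errors.New("script finished")

// scriptedList replays fixed snapshots, one per call, then fails with errScriptDone.
func scriptedList(snapshots [][]string) listFunc {
	calls := 0
	return func(_ context.Context, _ uint64) ([]string, uint64, error) {
		if calls >= len(snapshots) {
			return nil, 0, errScriptDone
		}
		keys := snapshots[calls]
		calls++
		return keys, uint64(calls), nil
	}
}

// runScript drives watchReady over the snapshots and records each callback.
func runScript(snapshots [][]string) [][]string {
	var seen [][]string
	err := watchReady(context.Background(), scriptedList(snapshots), func(ids []string) {
		seen = append(seen, ids)
	})
	if !errors.Is(err, errScriptDone) {
		panic(fmt.Sprint("unexpected error: ", err))
	}
	return seen
}

func main() {
	for _, ids := range runScript([][]string{
		{"mpc_ready/node0"},
		{"mpc_ready/node0", "mpc_ready/node1"},
	}) {
		fmt.Println(ids)
	}
}
```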

4. Determining Global Readiness

The method checkAndUpdateReadyState (lines 140-156) implements the readiness check. It is stricter than a quorum, because it requires both of the following conditions to hold:

  • All peers have written a ready flag: readyPeersCount == len(r.peerNodeIDs)
  • ECDH key-exchange with every peer is complete: r.isECDHReady()

Only when both conditions are satisfied does the registry set r.ready = true, signaling that the node can safely participate in MPC protocols.
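The two-condition rule can be expressed in a few lines. The following is a simplified sketch under stated assumptions: the registry type is a pared-down stand-in for PeerRegistry, ecdhDone and setPeer are hypothetical, and locking is done with a single sync.RWMutex for brevity.

```go
package main

import (
	"fmt"
	"sync"
)

// registry is a pared-down stand-in for PeerRegistry; names beyond
// peerNodeIDs, readyMap, and ready are illustrative.
type registry struct {
	mu          sync.RWMutex
	peerNodeIDs []string
	readyMap    map[string]bool // peerID -> ready flag seen in the KV store
	ecdhDone    map[string]bool // peerID -> key exchange finished (hypothetical)
	ready       bool
}

func newRegistry(peerNodeIDs []string) *registry {
	return &registry{
		peerNodeIDs: peerNodeIDs,
		readyMap:    make(map[string]bool),
		ecdhDone:    make(map[string]bool),
	}
}

// isECDHReady reports whether every peer finished the exchange. Caller holds mu.
func (r *registry) isECDHReady() bool {
	for _, id := range r.peerNodeIDs {
		if !r.ecdhDone[id] {
			return false
		}
	}
	return true
}

// checkAndUpdateReadyState recomputes ready from both conditions. Caller holds mu.
func (r *registry) checkAndUpdateReadyState() {
	readyPeersCount := 0
	for _, id := range r.peerNodeIDs {
		if r.readyMap[id] {
			readyPeersCount++
		}
	}
	r.ready = readyPeersCount == len(r.peerNodeIDs) && r.isECDHReady()
}

// setPeer records both flags for one peer and re-evaluates global readiness.
func (r *registry) setPeer(id string, flagged, ecdh bool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.readyMap[id] = flagged
	r.ecdhDone[id] = ecdh
	r.checkAndUpdateReadyState()
}

// ArePeersReady is the read-side query guarded by the same mutex.
func (r *registry) ArePeersReady() bool {
	r.mu.RLock()
	defer r.mu.RUnlock()
	return r.ready
}

func main() {
	r := newRegistry([]string{"node0", "node1"})
	r.setPeer("node0", true, true)
	r.setPeer("node1", true, false)
	fmt.Println(r.ArePeersReady()) // flags are set, but ECDH with node1 is pending
	r.setPeer("node1", true, true)
	fmt.Println(r.ArePeersReady())
}
```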

5. Exposing Availability to Consumers

The registry provides several query methods used by protocol workers and health checks:

  • ArePeersReady(): Returns true only when all peers report ready and ECDH is complete
  • AreMajorityReady(): Returns true when a majority of peers report ready
  • GetReadyPeersCount(): Returns the current count of ready participants

The health-check server in pkg/healthcheck/server.go (lines 107-123) consumes these APIs to expose node availability via HTTP endpoints.

Consul Client Implementation

Client Creation and Authentication

All Consul interactions route through infra.GetConsulClient, which configures TLS, authentication tokens, and basic auth credentials according to the deployment environment it is given. In production, it reads consul.token, consul.username, and consul.password through Viper configuration management.

// pkg/infra/consul.go
func GetConsulClient(environment string) *api.Client {
    config := api.DefaultConfig()
    if environment == constant.EnvProduction {
        config.Token = viper.GetString("consul.token")
        config.HttpAuth = &api.HttpBasicAuth{
            Username: viper.GetString("consul.username"),
            Password: viper.GetString("consul.password"),
        }
    }
    config.Address = viper.GetString("consul.address")
    client, err := api.NewClient(config)
    // validation and error handling...
    return client
}

KV Interface Abstraction

The ConsulKV interface defined in pkg/infra/consul.go abstracts the underlying Consul client to enable testing with mock implementations:

type ConsulKV interface {
    Put(kv *api.KVPair, options *api.WriteOptions) (*api.WriteMeta, error)
    Get(key string, options *api.QueryOptions) (*api.KVPair, *api.QueryMeta, error)
    Delete(key string, options *api.WriteOptions) (*api.WriteMeta, error)
    List(prefix string, options *api.QueryOptions) (api.KVPairs, *api.QueryMeta, error)
}

This abstraction allows the PeerRegistry to remain agnostic of the specific Consul client implementation while maintaining type safety.
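To illustrate how such an abstraction supports testing, here is a minimal in-memory mock. It targets a reduced, string-based interface (simpleKV, a hypothetical name) rather than the real ConsulKV signatures, which take api.KVPair values and query or write options; adapting it to the real interface would follow the same pattern.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// simpleKV is a reduced, string-based analogue of the ConsulKV interface.
type simpleKV interface {
	Put(key string, value []byte) error
	Get(key string) ([]byte, bool, error)
	Delete(key string) error
	List(prefix string) ([]string, error)
}

// mockKV is an in-memory implementation intended for tests.
type mockKV struct {
	mu   sync.Mutex
	data map[string][]byte
}

func newMockKV() *mockKV { return &mockKV{data: make(map[string][]byte)} }

func (m *mockKV) Put(key string, value []byte) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.data[key] = append([]byte(nil), value...) // copy so callers cannot mutate stored data
	return nil
}

func (m *mockKV) Get(key string) ([]byte, bool, error) {
	m.mu.Lock()
	defer m.mu.Unlock()
	v, ok := m.data[key]
	return v, ok, nil
}

func (m *mockKV) Delete(key string) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	delete(m.data, key)
	return nil
}

// List returns the matching keys in sorted order for deterministic tests.
func (m *mockKV) List(prefix string) ([]string, error) {
	m.mu.Lock()
	defer m.mu.Unlock()
	keys := make([]string, 0, len(m.data))
	for k := range m.data {
		if strings.HasPrefix(k, prefix) {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys)
	return keys, nil
}

// Compile-time check that mockKV satisfies the interface.
var _ simpleKV = (*mockKV)(nil)

func main() {
	kv := newMockKV()
	_ = kv.Put("mpc_ready/node0", []byte("true"))
	_ = kv.Put("mpc_ready/node1", []byte("true"))
	_ = kv.Delete("mpc_ready/node1")
	keys, _ := kv.List("mpc_ready/")
	fmt.Println(keys)
}
```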

Practical Usage Examples

Initializing the Registry in a Node

When constructing an MPC node, the registry requires the Consul KV implementation and communication channels:

// pkg/mpc/node.go
registry := mpc.NewRegistry(
    nodeID,
    peerIDs,
    consulClient.KV(),
    directMessaging,
    pubSub,
    identityStore,
)

Waiting for Peers in Key Generation

Protocol consumers poll the registry before commencing sensitive operations:

func (sc *keygenConsumer) waitForAllPeersReadyToGenKey(ctx context.Context) error {
    for {
        select {
        case <-ctx.Done():
            return ctx.Err()
        default:
            if sc.peerRegistry.ArePeersReady() {
                return nil // Proceed with key generation
            }
            time.Sleep(time.Second)
        }
    }
}
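One possible variation on this polling loop, shown only as a sketch and not as the project's code, uses a time.Ticker inside the select so that context cancellation is observed while waiting rather than after a sleep finishes. The readiness interface, readyAfter double, waitForPeers, and demoWait are hypothetical names introduced for illustration.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// readiness is the single query the wait loop needs from the registry.
type readiness interface {
	ArePeersReady() bool
}

// readyAfter is a test double that reports ready from the n-th call onward.
type readyAfter struct {
	n     int
	calls int
}

func (r *readyAfter) ArePeersReady() bool {
	r.calls++
	return r.calls >= r.n
}

// waitForPeers checks once immediately, then on every tick, until the
// registry reports ready or the context ends.
func waitForPeers(ctx context.Context, reg readiness, interval time.Duration) error {
	if reg.ArePeersReady() {
		return nil
	}
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
			if reg.ArePeersReady() {
				return nil
			}
		}
	}
}

// demoWait polls every millisecond with the given timeout and returns the result.
func demoWait(readyOnCall, timeoutMs int) error {
	ctx, cancel := context.WithTimeout(context.Background(), time.Duration(timeoutMs)*time.Millisecond)
	defer cancel()
	return waitForPeers(ctx, &readyAfter{n: readyOnCall}, time.Millisecond)
}

func main() {
	fmt.Println("peers ready quickly:", demoWait(3, 1000))
	fmt.Println("peers never ready:", demoWait(1_000_000, 20))
}
```

The trade-off is small: the ticker variant returns promptly on cancellation, at the cost of a few more lines than the sleep-based loop.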

Health Check Endpoint Implementation

The HTTP health server exposes granular readiness information:

// pkg/healthcheck/server.go
func (s *Server) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    peersReady := s.peerRegistry.ArePeersReady()
    majorityReady := s.peerRegistry.AreMajorityReady()
    readyCount := s.peerRegistry.GetReadyPeersCount()
    
    // Returns JSON with readiness state
}
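A fuller handler along these lines might look like the sketch below. The JSON field names, the 200/503 status mapping, the /health path, and the int return type of GetReadyPeersCount are all assumptions made for illustration; the article does not show the actual response format.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// peerStatus lists the registry queries the handler needs.
// The int return type of GetReadyPeersCount is an assumption.
type peerStatus interface {
	ArePeersReady() bool
	AreMajorityReady() bool
	GetReadyPeersCount() int
}

// healthResponse is a hypothetical response shape.
type healthResponse struct {
	PeersReady    bool `json:"peers_ready"`
	MajorityReady bool `json:"majority_ready"`
	ReadyCount    int  `json:"ready_count"`
}

// healthHandler reports readiness as JSON and answers 503 until all peers are ready.
func healthHandler(reg peerStatus) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		resp := healthResponse{
			PeersReady:    reg.ArePeersReady(),
			MajorityReady: reg.AreMajorityReady(),
			ReadyCount:    reg.GetReadyPeersCount(),
		}
		status := http.StatusOK
		if !resp.PeersReady {
			status = http.StatusServiceUnavailable
		}
		w.Header().Set("Content-Type", "application/json")
		w.WriteHeader(status)
		_ = json.NewEncoder(w).Encode(resp)
	})
}

// fixedStatus is a test double returning preset values.
type fixedStatus struct {
	peers    bool
	majority bool
	count    int
}

func (f fixedStatus) ArePeersReady() bool     { return f.peers }
func (f fixedStatus) AreMajorityReady() bool  { return f.majority }
func (f fixedStatus) GetReadyPeersCount() int { return f.count }

// probe sends one in-process request and returns the status code and trimmed body.
func probe(reg peerStatus) (int, string) {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/health", nil)
	healthHandler(reg).ServeHTTP(rec, req)
	return rec.Code, strings.TrimSpace(rec.Body.String())
}

func main() {
	code, body := probe(fixedStatus{peers: false, majority: true, count: 2})
	fmt.Println(code, body)
}
```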

Fail-Fast Behavior and Recovery

If a peer disappears or its Consul entry is deleted, the watch loop in WatchPeersReady detects the missing key through the blocking List operation. The registry immediately marks the peer as not ready in the readyMap, decrements the readyCount, and invokes checkAndUpdateReadyState to flip r.ready to false. When the peer re-registers its ready flag in Consul, the registry automatically transitions back to ready state without requiring a process restart, providing self-healing capabilities for transient network partitions.
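The drop-and-recover behavior can be sketched as a reconciliation between the expected peer set and the IDs currently observed under the readiness prefix. The helper names reconcile and allReady are hypothetical, and the sketch leaves out the ECDH condition to focus on the ready flags.

```go
package main

import "fmt"

// reconcile updates readyMap in place to match the observed ready IDs and
// reports which expected peers joined or left since the previous snapshot.
// IDs that are not in the expected set are ignored.
func reconcile(readyMap map[string]bool, expected, observed []string) (joined, left []string) {
	seen := make(map[string]bool, len(observed))
	for _, id := range observed {
		seen[id] = true
	}
	for _, id := range expected {
		switch {
		case seen[id] && !readyMap[id]:
			readyMap[id] = true
			joined = append(joined, id)
		case !seen[id] && readyMap[id]:
			readyMap[id] = false
			left = append(left, id)
		}
	}
	return joined, left
}

// allReady reports whether every expected peer currently has its flag set.
func allReady(readyMap map[string]bool, expected []string) bool {
	for _, id := range expected {
		if !readyMap[id] {
			return false
		}
	}
	return true
}

func main() {
	expected := []string{"node0", "node1"}
	readyMap := map[string]bool{}
	snapshots := [][]string{
		{"node0", "node1"}, // both flags present
		{"node0"},          // node1's key was deleted
		{"node0", "node1"}, // node1 re-registered
	}
	for _, observed := range snapshots {
		joined, left := reconcile(readyMap, expected, observed)
		fmt.Println("joined:", joined, "left:", left, "all ready:", allReady(readyMap, expected))
	}
}
```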

Summary

  • Consul KV store acts as the persistent source of truth for peer identities (mpc_peers/) and readiness states (mpc_ready/).
  • LoadPeersFromConsul in pkg/config/peers.go performs initial peer discovery at node startup.
  • PeerRegistry in pkg/mpc/registry.go maintains runtime state, watches Consul for changes, and tracks ECDH completion.
  • Global readiness requires both that all peers have written ready flags to Consul and that ECDH key-exchange has completed with every participant.
  • Query methods like ArePeersReady() and AreMajorityReady() provide simple APIs for protocol workers to check availability.
  • Automatic recovery occurs when vanished peers re-register, allowing the system to heal without manual intervention.

Frequently Asked Questions

How does mpcium detect when a peer becomes unavailable?

The PeerRegistry runs a continuous watch loop via WatchPeersReady() that performs blocking kv.List operations on the mpc_ready/ prefix in Consul. When a peer's ready key disappears from the KV store, the watch detects the change and calls registerReadyPairs to remove the peer from the internal readyMap, immediately marking the global state as not ready until the peer returns.

What conditions must be met before mpcium starts a multi-party computation?

According to checkAndUpdateReadyState in pkg/mpc/registry.go, the registry sets r.ready = true only when two conditions hold at the same time: all expected peers have written a ready flag to Consul (readyPeersCount == len(r.peerNodeIDs)), and the ECDH key-exchange has completed with every peer (r.isECDHReady() returns true).

Can the Peer Registry operate without Consul?

No, the PeerRegistry requires a Consul KV implementation as a mandatory dependency. The NewRegistry constructor accepts a ConsulKV interface, and methods like Ready() and WatchPeersReady() directly depend on Consul's atomic operations and blocking queries for correctness. However, the ConsulKV interface abstraction allows testing with mock implementations.

How does mpcium handle authentication with Consul?

The infra.GetConsulClient function configures authentication based on the environment. In production, it reads consul.token for ACL-based authentication and optionally sets consul.username and consul.password for HTTP basic auth. These credentials are injected via Viper configuration management, and the client validates connectivity to the Consul leader before returning.
