# How Peer Registry and Consul Service Discovery Ensure Node Availability in mpcium

> Learn how mpcium uses Peer Registry and Consul service discovery to ensure node availability for secure multi-party computation. Guarantee readiness before proceeding.

- Repository: [Fystack Labs/mpcium](https://github.com/fystack/mpcium)
- Tags: architecture
- Published: 2026-03-02

---

**The mpcium node relies on a centralized Consul KV store for service discovery and an in-process Peer Registry that aggregates peer readiness states and ECDH key-exchange completion to guarantee multi-party computation only proceeds when all participants are available.**

mpcium is an open-source multi-party computation (MPC) framework that requires strict coordination across distributed nodes. To manage node availability and discovery, the codebase implements a two-tier architecture combining **Consul service discovery** for persistent state storage and a **Peer Registry** for real-time local state management. This design ensures that cryptographic protocols begin only after all peers have registered their readiness and completed secure handshakes.

## Core Architecture Components

### Consul KV Store as the Source of Truth

All peer identity and readiness information persists in HashiCorp Consul under specific key prefixes. The `mpc_peers/` prefix stores the static list of participating nodes, while the `mpc_ready/` prefix tracks dynamic readiness flags. This centralized approach allows nodes to discover each other without hardcoded configuration and provides a consistent view of cluster state across process restarts.

According to the mpcium source code in [`pkg/config/peers.go`](https://github.com/fystack/mpcium/blob/main/pkg/config/peers.go), the function `LoadPeersFromConsul` performs the initial discovery by listing all KV pairs under the `mpc_peers/` prefix and constructing a slice of `config.Peer` structs containing node IDs and names.
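To make the discovery step concrete, here is a minimal sketch of turning a prefix listing into a peer slice. It is not the repository's code: the `kvPair` type, the `Peer` field names, and the assumed layout (key = prefix + node name, value = node ID) are illustrative stand-ins so the example runs with the standard library alone.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Peer mirrors the idea of config.Peer: a node ID plus a name.
// The field names are assumptions for illustration.
type Peer struct {
	ID   string
	Name string
}

// kvPair is a hypothetical stand-in for a Consul KV entry.
type kvPair struct {
	Key   string
	Value []byte
}

// loadPeers builds the peer list from every entry under the given prefix.
// Assumed layout: key = prefix + node name, value = node ID.
func loadPeers(pairs []kvPair, prefix string) []Peer {
	peers := make([]Peer, 0, len(pairs))
	for _, p := range pairs {
		if !strings.HasPrefix(p.Key, prefix) {
			continue // ignore entries outside the peer prefix
		}
		peers = append(peers, Peer{
			ID:   string(p.Value),
			Name: strings.TrimPrefix(p.Key, prefix),
		})
	}
	// Sort by name so every node derives the same ordering.
	sort.Slice(peers, func(i, j int) bool { return peers[i].Name < peers[j].Name })
	return peers
}

func main() {
	pairs := []kvPair{
		{Key: "mpc_peers/node1", Value: []byte("id-b")},
		{Key: "mpc_peers/node0", Value: []byte("id-a")},
		{Key: "other/key", Value: []byte("ignored")},
	}
	for _, p := range loadPeers(pairs, "mpc_peers/") {
		fmt.Printf("%s -> %s\n", p.Name, p.ID)
	}
}
```

Sorting is a choice made for this sketch; it keeps the result deterministic regardless of the order in which the store returns entries.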

### Peer Registry for Local State Aggregation

While Consul provides persistence, the `PeerRegistry` struct in [`pkg/mpc/registry.go`](https://github.com/fystack/mpcium/blob/main/pkg/mpc/registry.go) maintains the runtime state necessary for protocol execution. It tracks which peers have reported ready status, monitors ECDH key-exchange completion, and exposes thread-safe APIs for the rest of the system to query availability.

The registry is instantiated via `NewRegistry` in [`pkg/mpc/registry.go`](https://github.com/fystack/mpcium/blob/main/pkg/mpc/registry.go) (lines 61-88), which accepts the list of peer IDs, a Consul KV interface, messaging channels, and an identity store. This constructor also initializes an `ECDHSession` to track cryptographic handshakes.

## Node Availability Workflow

### 1. Loading Peers from Consul at Startup

When a node starts, the CLI entry point in [`cmd/mpcium/main.go`](https://github.com/fystack/mpcium/blob/main/cmd/mpcium/main.go) (lines 475-483) creates a Consul client and loads the peer list before initializing the registry.

```go
// cmd/mpcium/main.go
consulClient := infra.GetConsulClient(environment)
peers := config.LoadPeersFromConsul(consulClient)
```

The `LoadPeersFromConsul` function queries the KV store and returns the complete set of participants that the node should expect to join the computation.

### 2. Registering Node Readiness

After completing pre-parameter generation, a node marks itself as ready by calling `registry.Ready()`. This method writes a readiness record to Consul using the key pattern defined by `ComposeReadyKey(nodeID)`, storing the node ID and ECDH exchange status.

```go
// After cryptographic setup completes
if err := registry.Ready(); err != nil {
    logger.Fatal("cannot mark node ready", err)
}
```

The implementation in [`pkg/mpc/registry.go`](https://github.com/fystack/mpcium/blob/main/pkg/mpc/registry.go) (lines 169-187) uses the `ConsulKV.Put` method to persist this state, making it visible to all other participants.
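The write path can be sketched as follows. This is an illustration under stated assumptions, not the repository's implementation: `memKV` is an in-memory stand-in for the Consul KV store, `composeReadyKey` is a hypothetical equivalent of `ComposeReadyKey`, and the stored value (just the node ID) is a simplification of the real record.

```go
package main

import (
	"fmt"
	"sync"
)

// readyPrefix is the readiness prefix named in the article.
const readyPrefix = "mpc_ready/"

// composeReadyKey is a hypothetical equivalent of ComposeReadyKey.
func composeReadyKey(nodeID string) string {
	return readyPrefix + nodeID
}

// memKV is an in-memory stand-in for the Consul KV store.
type memKV struct {
	mu   sync.Mutex
	data map[string][]byte
}

func newMemKV() *memKV { return &memKV{data: make(map[string][]byte)} }

// Put stores a value under key, overwriting any previous value.
func (m *memKV) Put(key string, value []byte) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.data[key] = value
	return nil
}

// Get returns the value for key and whether it exists.
func (m *memKV) Get(key string) ([]byte, bool) {
	m.mu.Lock()
	defer m.mu.Unlock()
	v, ok := m.data[key]
	return v, ok
}

// markReady writes the node's readiness record.
// The value format here is assumed for illustration.
func markReady(kv *memKV, nodeID string) error {
	if err := kv.Put(composeReadyKey(nodeID), []byte(nodeID)); err != nil {
		return fmt.Errorf("put ready key: %w", err)
	}
	return nil
}

func main() {
	kv := newMemKV()
	if err := markReady(kv, "node0"); err != nil {
		panic(err)
	}
	v, ok := kv.Get(composeReadyKey("node0"))
	fmt.Println(string(v), ok)
}
```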

### 3. Watching for Peer Readiness

The registry spawns a long-running goroutine via `WatchPeersReady()` that performs blocking `kv.List` operations on the `mpc_ready/` prefix. As shown in [`pkg/mpc/registry.go`](https://github.com/fystack/mpcium/blob/main/pkg/mpc/registry.go) (lines 207-230), this watch loop detects changes in real-time and extracts ready peer IDs using `getReadyPeersFromKVStore`.

When the watch detects updates, it calls `registerReadyPairs` (lines 112-124) to update the internal `readyMap` (a `map[string]bool` tracking peerID → ready status) and increments `readyCount`.

### 4. Determining Global Readiness

The method `checkAndUpdateReadyState` (lines 140-156) implements the global readiness check by verifying two strict conditions:

- **All peers have written a ready flag**: `readyPeersCount == len(r.peerNodeIDs)`
- **ECDH key-exchange with every peer is complete**: `r.isECDHReady()`

Only when both conditions are satisfied does the registry set `r.ready = true`, signaling that the node can safely participate in MPC protocols.
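The two-condition rule can be modeled in a few lines. The sketch below is a simplified, hypothetical registry, not the mpcium type: the `ecdhDone` map stands in for whatever state `ECDHSession` actually keeps, and the setter names are invented for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// registry is a simplified, hypothetical model of the readiness state.
type registry struct {
	mu          sync.RWMutex
	peerNodeIDs []string
	readyMap    map[string]bool // peerID -> ready flag seen in the KV store
	ecdhDone    map[string]bool // peerID -> key exchange finished (assumed shape)
	ready       bool
}

func newRegistry(peerNodeIDs []string) *registry {
	return &registry{
		peerNodeIDs: peerNodeIDs,
		readyMap:    make(map[string]bool),
		ecdhDone:    make(map[string]bool),
	}
}

// SetPeerReady records a readiness flag and re-evaluates global readiness.
func (r *registry) SetPeerReady(id string, ready bool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.readyMap[id] = ready
	r.recomputeLocked()
}

// SetECDHDone records a finished key exchange and re-evaluates global readiness.
func (r *registry) SetECDHDone(id string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.ecdhDone[id] = true
	r.recomputeLocked()
}

// recomputeLocked applies both conditions; the caller must hold r.mu.
func (r *registry) recomputeLocked() {
	readyCount, ecdhCount := 0, 0
	for _, id := range r.peerNodeIDs {
		if r.readyMap[id] {
			readyCount++
		}
		if r.ecdhDone[id] {
			ecdhCount++
		}
	}
	r.ready = readyCount == len(r.peerNodeIDs) && ecdhCount == len(r.peerNodeIDs)
}

// ArePeersReady is safe to call from multiple goroutines.
func (r *registry) ArePeersReady() bool {
	r.mu.RLock()
	defer r.mu.RUnlock()
	return r.ready
}

func main() {
	r := newRegistry([]string{"a", "b"})
	r.SetPeerReady("a", true)
	r.SetPeerReady("b", true)
	fmt.Println(r.ArePeersReady()) // false: flags are set but ECDH is pending
	r.SetECDHDone("a")
	r.SetECDHDone("b")
	fmt.Println(r.ArePeersReady()) // true: both conditions hold
}
```

Recomputing the flag on every state change, rather than on every query, keeps `ArePeersReady` a cheap read under a shared lock.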

### 5. Exposing Availability to Consumers

The registry provides several query methods used by protocol workers and health checks:

- `ArePeersReady()`: Returns true only when all peers report ready and ECDH is complete
- `AreMajorityReady()`: Returns true when a quorum of peers is available
- `GetReadyPeersCount()`: Returns the current count of ready participants

The health-check server in [`pkg/healthcheck/server.go`](https://github.com/fystack/mpcium/blob/main/pkg/healthcheck/server.go) (lines 107-123) consumes these APIs to expose node availability via HTTP endpoints.
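The article does not state the exact threshold behind `AreMajorityReady()`. As a hedged illustration only, the sketch below assumes a strict majority (more than half of the expected peers); the real method may use a different rule, such as the signing threshold.

```go
package main

import "fmt"

// majorityReady reports whether strictly more than half of the expected
// peers are ready. The strict-majority rule is an assumption of this sketch.
func majorityReady(readyCount, total int) bool {
	if total <= 0 {
		return false // no expected peers means nothing can be a majority
	}
	return readyCount > total/2
}

func main() {
	for _, c := range []struct{ ready, total int }{{1, 3}, {2, 3}, {2, 4}, {3, 4}} {
		fmt.Printf("%d of %d ready -> majority: %v\n", c.ready, c.total, majorityReady(c.ready, c.total))
	}
}
```

With an even number of peers, exactly half does not count as a majority under this rule, which avoids two disjoint halves both believing they hold one.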

## Consul Client Implementation

### Client Creation and Authentication

All Consul interactions route through `infra.GetConsulClient`, which configures TLS, authentication tokens, and basic auth credentials according to the runtime environment. In production, it reads `consul.token`, `consul.username`, and `consul.password` through Viper configuration management.

```go
// pkg/infra/consul.go
func GetConsulClient(environment string) *api.Client {
    config := api.DefaultConfig()
    if environment == constant.EnvProduction {
        config.Token = viper.GetString("consul.token")
        config.HttpAuth = &api.HttpBasicAuth{
            Username: viper.GetString("consul.username"),
            Password: viper.GetString("consul.password"),
        }
    }
    config.Address = viper.GetString("consul.address")
    client, err := api.NewClient(config)
    // validation and error handling...
    return client
}
```

### KV Interface Abstraction

The `ConsulKV` interface defined in [`pkg/infra/consul.go`](https://github.com/fystack/mpcium/blob/main/pkg/infra/consul.go) abstracts the underlying Consul client to enable testing with mock implementations:

```go
type ConsulKV interface {
    Put(kv *api.KVPair, options *api.WriteOptions) (*api.WriteMeta, error)
    Get(key string, options *api.QueryOptions) (*api.KVPair, *api.QueryMeta, error)
    Delete(key string, options *api.WriteOptions) (*api.WriteMeta, error)
    List(prefix string, options *api.QueryOptions) (api.KVPairs, *api.QueryMeta, error)
}
```

This abstraction allows the `PeerRegistry` to remain agnostic of the specific Consul client implementation while maintaining type safety.
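To show why the interface boundary helps testing, here is a small sketch with an in-memory fake. It deliberately uses a narrowed, hypothetical interface (`prefixLister`) with simplified types instead of the Consul API types above, so it runs without external modules; `readyPeerIDs` is an invented helper loosely analogous to extracting ready peer IDs from a listing.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// prefixLister is a hypothetical, narrowed take on the ConsulKV idea:
// only the single operation this consumer needs.
type prefixLister interface {
	List(prefix string) (map[string][]byte, error)
}

// fakeKV is an in-memory test double backed by a plain map.
type fakeKV map[string][]byte

// List returns every entry whose key starts with prefix.
func (f fakeKV) List(prefix string) (map[string][]byte, error) {
	out := make(map[string][]byte)
	for k, v := range f {
		if strings.HasPrefix(k, prefix) {
			out[k] = v
		}
	}
	return out, nil
}

// readyPeerIDs extracts peer IDs from the keys found under prefix.
func readyPeerIDs(kv prefixLister, prefix string) ([]string, error) {
	pairs, err := kv.List(prefix)
	if err != nil {
		return nil, fmt.Errorf("list %q: %w", prefix, err)
	}
	ids := make([]string, 0, len(pairs))
	for k := range pairs {
		ids = append(ids, strings.TrimPrefix(k, prefix))
	}
	sort.Strings(ids) // map iteration order is random; sort for stable output
	return ids, nil
}

func main() {
	kv := fakeKV{
		"mpc_ready/node1": nil,
		"mpc_ready/node0": nil,
		"mpc_peers/node0": nil,
	}
	ids, err := readyPeerIDs(kv, "mpc_ready/")
	if err != nil {
		panic(err)
	}
	fmt.Println(ids) // [node0 node1]
}
```

Because `readyPeerIDs` depends only on the interface, the same function would work against a real client wrapper or the fake without modification.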

## Practical Usage Examples

### Initializing the Registry in a Node

When constructing an MPC node, the registry requires the Consul KV implementation and communication channels:

```go
// pkg/mpc/node.go
registry := mpc.NewRegistry(
    nodeID,
    peerIDs,
    consulClient.KV(),
    directMessaging,
    pubSub,
    identityStore,
)
```

### Waiting for Peers in Key Generation

Protocol consumers poll the registry before commencing sensitive operations:

```go
func (sc *keygenConsumer) waitForAllPeersReadyToGenKey(ctx context.Context) error {
    for {
        select {
        case <-ctx.Done():
            return ctx.Err()
        default:
            if sc.peerRegistry.ArePeersReady() {
                return nil // Proceed with key generation
            }
            time.Sleep(time.Second)
        }
    }
}
```

### Health Check Endpoint Implementation

The HTTP health server exposes granular readiness information:

```go
// pkg/healthcheck/server.go
func (s *Server) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    peersReady := s.peerRegistry.ArePeersReady()
    majorityReady := s.peerRegistry.AreMajorityReady()
    readyCount := s.peerRegistry.GetReadyPeersCount()

    // Returns JSON with readiness state
}
```

## Fail-Fast Behavior and Recovery

If a peer disappears or its Consul entry is deleted, the watch loop in `WatchPeersReady` detects the missing key through the blocking `List` operation. The registry immediately marks the peer as not ready in the `readyMap`, decrements the `readyCount`, and invokes `checkAndUpdateReadyState` to flip `r.ready` to false. When the peer re-registers its ready flag in Consul, the registry automatically transitions back to ready state without requiring a process restart, providing self-healing capabilities for transient network partitions.

## Summary

- **Consul KV store** acts as the persistent source of truth for peer identities (`mpc_peers/`) and readiness states (`mpc_ready/`).
- **`LoadPeersFromConsul`** in [`pkg/config/peers.go`](https://github.com/fystack/mpcium/blob/main/pkg/config/peers.go) performs initial peer discovery at node startup.
- **`PeerRegistry`** in [`pkg/mpc/registry.go`](https://github.com/fystack/mpcium/blob/main/pkg/mpc/registry.go) maintains runtime state, watches Consul for changes, and tracks ECDH completion.
- **Global readiness** requires both that all peers have written ready flags to Consul and that ECDH key-exchange has completed with every participant.
- **Query methods** like `ArePeersReady()` and `AreMajorityReady()` provide simple APIs for protocol workers to check availability.
- **Automatic recovery** occurs when vanished peers re-register, allowing the system to heal without manual intervention.

## Frequently Asked Questions

### How does mpcium detect when a peer becomes unavailable?

The `PeerRegistry` runs a continuous watch loop via `WatchPeersReady()` that performs blocking `kv.List` operations on the `mpc_ready/` prefix in Consul. When a peer's ready key disappears from the KV store, the watch detects the change and calls `registerReadyPairs` to remove the peer from the internal `readyMap`, immediately marking the global state as not ready until the peer returns.

### What conditions must be met before mpcium starts a multi-party computation?

According to `checkAndUpdateReadyState` in [`pkg/mpc/registry.go`](https://github.com/fystack/mpcium/blob/main/pkg/mpc/registry.go), the registry sets `r.ready = true` only when two conditions hold simultaneously: all expected peers have written a ready flag to Consul (`readyPeersCount == len(r.peerNodeIDs)`), and the ECDH key-exchange has completed with every peer (`r.isECDHReady()` returns true).

### Can the Peer Registry operate without Consul?

No, the `PeerRegistry` requires a Consul KV implementation as a mandatory dependency. The `NewRegistry` constructor accepts a `ConsulKV` interface, and methods like `Ready()` and `WatchPeersReady()` directly depend on Consul's atomic operations and blocking queries for correctness. However, the `ConsulKV` interface abstraction allows testing with mock implementations.

### How does mpcium handle authentication with Consul?

The `infra.GetConsulClient` function configures authentication based on the environment. In production, it reads `consul.token` for ACL-based authentication and optionally sets `consul.username` and `consul.password` for HTTP basic auth. These credentials are injected via Viper configuration management, and the client validates connectivity to the Consul leader before returning.