Setting up VectorClockManager for Causal Ordering in Multi-Agent Sessions

Initialize a VectorClockManager instance and configure your session with IsolationLevel.SERIALIZABLE to enable vector clock-based causal ordering across distributed agent writes.

The Microsoft Agent Governance Toolkit (AGT) provides deterministic causal ordering for multi-agent sessions through vector clock tracking. By implementing the VectorClockManager from the hypervisor.session.vector_clock module, you can establish happens-before relationships between operations performed by different agents in shared sessions. This guide explains how to set up and configure the manager for production-grade optimistic concurrency control.

Understanding Vector Clocks in AGT

A vector clock is a distributed systems primitive that tracks per-agent version counters to establish causal relationships between events. In AGT, each agent maintains its own logical clock component, incrementing it on every write operation.

According to the source code in agent-governance-python/agent-hypervisor/src/hypervisor/session/vector_clock.py, the system uses component-wise comparison to determine whether one operation happened-before another, whether two operations are concurrent, or whether a write would violate causality. This enables optimistic concurrency where agents work in parallel without immediate locking, with conflicts detected only when causal order would be violated.
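The component-wise comparison described above can be sketched on plain dictionaries. This is a minimal illustration rather than the code in vector_clock.py: the function names mirror the method names used in this guide, and treating an agent that is missing from a clock as version 0 is an assumption.

```python
from typing import Dict

Clock = Dict[str, int]  # agent_did -> version counter


def happens_before(a: Clock, b: Clock) -> bool:
    """True if every component of a is <= b and at least one is strictly less."""
    agents = set(a) | set(b)
    all_le = all(a.get(k, 0) <= b.get(k, 0) for k in agents)
    any_lt = any(a.get(k, 0) < b.get(k, 0) for k in agents)
    return all_le and any_lt


def is_concurrent(a: Clock, b: Clock) -> bool:
    """True if neither clock precedes the other and they are not equal."""
    agents = set(a) | set(b)
    equal = all(a.get(k, 0) == b.get(k, 0) for k in agents)
    return not equal and not happens_before(a, b) and not happens_before(b, a)


earlier = {"a1": 1}
later = {"a1": 2, "a2": 1}
left = {"a1": 2, "a2": 0}
right = {"a1": 1, "a2": 1}

print(happens_before(earlier, later))  # earlier is dominated by later
print(is_concurrent(left, right))      # each side is ahead on one component
```

Two clocks are concurrent exactly when each is ahead of the other on some component, which is the case an optimistic scheme has to detect.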

Core Components

VectorClock

The VectorClock class (lines 21–99 in vector_clock.py) holds a dictionary clocks: Dict[str, int] mapping agent_did strings to version integers. It provides the essential operations for causal tracking:

  • tick – Increments the calling agent's component counter
  • merge – Performs component-wise maximum to combine two clocks
  • happens_before – Determines if one clock strictly precedes another
  • is_concurrent – Detects when two operations have no causal relationship

Each VectorClock instance owns a threading.Lock to ensure thread-safe mutations in multi-threaded hypervisor environments.
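To illustrate why a per-clock lock matters, here is a hedged sketch of a lock-protected tick. TickClock is a hypothetical stand-in written for this guide, not AGT's VectorClock; only the idea of a clocks dictionary guarded by a threading.Lock comes from the description above.

```python
import threading
from typing import Dict


class TickClock:
    """Illustrative stand-in for a vector clock with a guarded tick."""

    def __init__(self) -> None:
        self.clocks: Dict[str, int] = {}
        self._lock = threading.Lock()

    def tick(self, agent_did: str) -> int:
        # The read-modify-write must be atomic, or concurrent ticks can be lost.
        with self._lock:
            self.clocks[agent_did] = self.clocks.get(agent_did, 0) + 1
            return self.clocks[agent_did]


def tick_many(clock: TickClock, agent_did: str, count: int) -> None:
    for _ in range(count):
        clock.tick(agent_did)


clock = TickClock()
workers = [
    threading.Thread(target=tick_many, args=(clock, "a1", 1000))
    for _ in range(4)
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

total_ticks = clock.clocks["a1"]
print(total_ticks)  # 4 threads x 1000 ticks each
```

With the lock in place, every increment is counted regardless of how the threads interleave.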

VectorClockManager

The VectorClockManager class (lines 100–141) serves as the central registry tracking the latest clock for every path (_path_clocks) and every agent (_agent_clocks). Key methods include:

  • write(path, agent_did, strict=True) – Advances the agent's clock and stores a snapshot for the target path
  • read(path, agent_did) – Returns a copy of the current path clock without modification
  • conflict_count – Property exposing the number of detected conflicts
  • tracked_paths – Property listing all monitored resource paths
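The registry behaviour listed above can be approximated with two dictionaries. The following is a simplified sketch under stated assumptions: SketchManager is hypothetical, it omits locking and conflict detection entirely, and its internals are a guess at the shape of the real class rather than a copy of it.

```python
from typing import Dict, List

Clock = Dict[str, int]


class SketchManager:
    """Hypothetical, single-threaded approximation of a clock registry."""

    def __init__(self) -> None:
        self._path_clocks: Dict[str, Clock] = {}
        self._agent_clocks: Dict[str, Clock] = {}
        self._conflicts = 0  # never incremented in this sketch

    def write(self, path: str, agent_did: str) -> Clock:
        # Advance the writer's own component, then snapshot it for the path.
        agent_clock = self._agent_clocks.setdefault(agent_did, {})
        agent_clock[agent_did] = agent_clock.get(agent_did, 0) + 1
        self._path_clocks[path] = dict(agent_clock)
        return dict(agent_clock)

    def read(self, path: str, agent_did: str) -> Clock:
        # Return a copy so callers cannot mutate the registry's state.
        return dict(self._path_clocks.get(path, {}))

    @property
    def conflict_count(self) -> int:
        return self._conflicts

    @property
    def tracked_paths(self) -> List[str]:
        return list(self._path_clocks)


mgr = SketchManager()
mgr.write("/data/file1", agent_did="a1")
view = mgr.read("/data/file1", agent_did="a2")
view["a1"] = 99  # mutating the copy leaves the registry untouched
print(mgr.read("/data/file1", agent_did="a2"), mgr.tracked_paths)
```

Returning copies from read keeps the registry the single source of truth for each path's clock.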

CausalViolationError

Defined at line 17 in vector_clock.py, this exception type is intended to be raised when a strict write detects a causal conflict. The current public preview never raises it, so strict writes currently always succeed. Handling the exception now keeps calling code ready for when enforcement is enabled.

Configuring SERIALIZABLE Isolation

Vector clocks are required only when using IsolationLevel.SERIALIZABLE. In agent-governance-python/agent-hypervisor/src/hypervisor/session/isolation.py (lines 31–34), the IsolationLevel enum defines the requires_vector_clocks property:

SERIALIZABLE = auto()

@property
def requires_vector_clocks(self) -> bool:
    return self == IsolationLevel.SERIALIZABLE

When a session is created with IsolationLevel.SERIALIZABLE, AGT automatically consults the VectorClockManager to enforce ordering across agents.
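The gating pattern can be shown with a self-contained enum. This is a sketch, not the contents of isolation.py: SketchIsolationLevel and make_clock_registry are hypothetical names, and the SNAPSHOT member is a placeholder added only so there is a second level to compare against.

```python
from enum import Enum, auto
from typing import Dict, Optional


class SketchIsolationLevel(Enum):
    SNAPSHOT = auto()      # hypothetical placeholder level
    SERIALIZABLE = auto()  # the level named in this guide

    @property
    def requires_vector_clocks(self) -> bool:
        return self is SketchIsolationLevel.SERIALIZABLE


def make_clock_registry(level: SketchIsolationLevel) -> Optional[Dict[str, dict]]:
    """Create a clock registry only for levels that need causal tracking."""
    return {} if level.requires_vector_clocks else None


serializable_registry = make_clock_registry(SketchIsolationLevel.SERIALIZABLE)
snapshot_registry = make_clock_registry(SketchIsolationLevel.SNAPSHOT)
print(serializable_registry, snapshot_registry)
```

Keeping the decision in a property on the enum means callers ask the level what it needs instead of comparing against specific members themselves.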

Implementation Guide

Basic Setup

Instantiate one manager per hypervisor instance at startup:

from hypervisor.session.vector_clock import VectorClockManager

# One manager per hypervisor instance
vc_manager = VectorClockManager()

Tracking Writes and Reads

Use the manager to track file-level operations across agents:

from hypervisor.session.vector_clock import VectorClockManager

# Initialize
vc_mgr = VectorClockManager()

# Agent "a1" writes to path "/data/file1"
vc_mgr.write("/data/file1", agent_did="a1")
# a1's clock for this path is now 1

# Agent "a2" reads the current state
a2_view = vc_mgr.read("/data/file1", agent_did="a2")
print(a2_view.get("a1"))  # Output: 1

Merging Clocks from Different Agents

When reconciling distributed state, use the merge method for component-wise maximum:

from hypervisor.session.vector_clock import VectorClock

vc1 = VectorClock(clocks={"a1": 3, "a2": 1})
vc2 = VectorClock(clocks={"a1": 2, "a3": 4})

# Component-wise max: {"a1": 3, "a2": 1, "a3": 4}
merged = vc1.merge(vc2)

Integration with Serializable Sessions

Wire the manager into a SERIALIZABLE session to enable automatic enforcement:

from hypervisor import Hypervisor, SessionConfig
from hypervisor.session.isolation import IsolationLevel
from hypervisor.session.vector_clock import CausalViolationError

# Create hypervisor with strict isolation
hyper = Hypervisor()
session_cfg = SessionConfig(isolation_level=IsolationLevel.SERIALIZABLE)
session = hyper.create_session(
    session_id="s1",
    agent_did="a1",
    config=session_cfg,
)

# Access the manager via session (exposed internally by Hypervisor)
vc_mgr = session.vector_clock_manager

# All writes now respect causal ordering
vc_mgr.write("/shared/doc.txt", agent_did="a1")

# Strict write (future-proof: will raise CausalViolationError when enforcement is enabled)
try:
    vc_mgr.write("/shared/doc.txt", agent_did="a2", strict=True)
except CausalViolationError:
    print("Causal violation detected: a2 is behind current state")

Thread Safety and Concurrency

The VectorClockManager is thread-safe by design. Each VectorClock maintains its own threading.Lock, and the merge operation uses deterministic lock ordering via sorted([self, other], key=id) to prevent deadlocks when combining clocks from multiple agents.

This lock-protected design scales to hypervisor instances hosting many concurrent agents, enabling optimistic concurrency where agents proceed in parallel until a causal violation is detected.
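The lock-ordering idea can be sketched as follows. MergeClock is a hypothetical class written for illustration; only the sorted([self, other], key=id) ordering comes from the description above, and returning a new clock from merge (rather than mutating in place) is an assumption.

```python
import threading
from typing import Dict, Optional


class MergeClock:
    """Illustrative clock whose merge takes both locks in a fixed order."""

    def __init__(self, clocks: Optional[Dict[str, int]] = None) -> None:
        self.clocks: Dict[str, int] = dict(clocks or {})
        self._lock = threading.Lock()

    def merge(self, other: "MergeClock") -> "MergeClock":
        if other is self:
            # Avoid acquiring the same non-reentrant lock twice.
            with self._lock:
                return MergeClock(self.clocks)
        # Every thread locks the pair in the same id order, so no cycle forms.
        first, second = sorted([self, other], key=id)
        with first._lock, second._lock:
            agents = set(self.clocks) | set(other.clocks)
            merged = {
                k: max(self.clocks.get(k, 0), other.clocks.get(k, 0))
                for k in agents
            }
        return MergeClock(merged)


def merge_many(a: MergeClock, b: MergeClock, count: int) -> None:
    for _ in range(count):
        a.merge(b)


vc1 = MergeClock({"a1": 3, "a2": 1})
vc2 = MergeClock({"a1": 2, "a3": 4})
merged = vc1.merge(vc2)

# Merge in opposite directions from two threads to exercise the ordering.
t1 = threading.Thread(target=merge_many, args=(vc1, vc2, 1000), daemon=True)
t2 = threading.Thread(target=merge_many, args=(vc2, vc1, 1000), daemon=True)
t1.start()
t2.start()
t1.join(timeout=10)
t2.join(timeout=10)
finished = not (t1.is_alive() or t2.is_alive())
print(merged.clocks, finished)
```

Without the sort, one thread could hold vc1's lock while waiting for vc2's and the other could hold vc2's while waiting for vc1's, which is the circular wait that the fixed ordering rules out.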

Summary

  • VectorClockManager, implemented in agent-governance-python/agent-hypervisor/src/hypervisor/session/vector_clock.py, tracks causal relationships across agents via per-agent version counters
  • Initialize the manager at hypervisor startup and use it for all file-level write and read operations
  • Enable strict causal ordering by configuring sessions with IsolationLevel.SERIALIZABLE, the only level for which requires_vector_clocks returns True
  • Merge clocks using component-wise maximum to reconcile distributed state without losing version information
  • Leverage thread-safe design with per-clock locks and deterministic lock ordering for multi-threaded deployments

Frequently Asked Questions

How does VectorClockManager establish happens-before relationships between agents?

The manager uses the VectorClock.happens_before method to perform component-wise comparison of clock dictionaries. If every agent's timestamp in clock A is less than or equal to the corresponding timestamp in clock B, and at least one is strictly less, then operation A happened-before operation B. This comparison occurs in vector_clock.py within the VectorClock class implementation.

What is the difference between strict mode and non-strict mode writes?

When strict=True (the default), the VectorClockManager.write method compares the agent's current clock with the path's latest clock and would raise CausalViolationError if the write would move "backwards" in the causality graph. In the current public preview, this check always succeeds to maintain backward compatibility, but the API structure is ready for production enforcement when the feature is activated.
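Since the preview does not enforce the check, the following sketch shows one way a strict write could behave once enforcement exists. Everything here is hypothetical: SketchCausalViolation, dominates, and checked_write are illustrative names, and the rule that a writer must have seen every component of the path clock is an assumption about what moving "backwards" means.

```python
from typing import Dict

Clock = Dict[str, int]


class SketchCausalViolation(Exception):
    """Hypothetical stand-in for CausalViolationError."""


def dominates(a: Clock, b: Clock) -> bool:
    """True if a has seen at least everything recorded in b."""
    return all(a.get(k, 0) >= v for k, v in b.items())


def checked_write(path_clock: Clock, agent_clock: Clock,
                  agent_did: str, strict: bool = True) -> Clock:
    """Return the new path clock, or raise if a strict writer is stale."""
    if not dominates(agent_clock, path_clock):
        if strict:
            raise SketchCausalViolation(f"{agent_did} is behind the path clock")
        # Non-strict: catch up via component-wise max before writing.
        agents = set(agent_clock) | set(path_clock)
        agent_clock = {
            k: max(agent_clock.get(k, 0), path_clock.get(k, 0)) for k in agents
        }
    new_clock = dict(agent_clock)
    new_clock[agent_did] = new_clock.get(agent_did, 0) + 1
    return new_clock


path_clock = {"a1": 1}

try:
    checked_write(path_clock, {}, "a2")  # a2 never saw a1's write
    stale_rejected = False
except SketchCausalViolation:
    stale_rejected = True

fresh = checked_write(path_clock, {"a1": 1}, "a2")        # a2 is up to date
lenient = checked_write(path_clock, {}, "a2", strict=False)
print(stale_rejected, fresh, lenient)
```

In this sketch the non-strict path merges before writing, so a stale writer still produces a clock that dominates the previous path state.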

Why must I use IsolationLevel.SERIALIZABLE with VectorClockManager?

The IsolationLevel.SERIALIZABLE enum value is the only level where requires_vector_clocks returns True (as defined in isolation.py lines 31–34). This coupling signals the AGT runtime to invoke the VectorClockManager for every write and read operation, ensuring serializable semantics across agents. Other isolation levels do not consult the vector clock registry, bypassing causal ordering checks.

How does the merge operation prevent deadlocks in multi-threaded environments?

The VectorClock.merge method implements deterministic lock ordering by sorting the two clock instances by their Python id before acquiring locks: sorted([self, other], key=id). This ensures that all threads acquire locks in the same order, eliminating circular wait conditions that could cause deadlocks when multiple agents simultaneously merge clocks in the hypervisor.
