Implementing QuarantineManager for Isolating Untrusted Agents in Microsoft's Agent Governance Toolkit

May 29, 2026 · microsoft/agent-governance-toolkit

The Agent Governance Toolkit provides a QuarantineManager class in the hypervisor liability package that logs isolation events for agents that violate policy or exceed risk budgets. In public-preview mode it acts as a non-blocking stub: it records events without enforcing network or execution restrictions.

The QuarantineManager serves as the central audit mechanism for agent governance, allowing downstream policy engines and monitoring dashboards to track untrusted behavior without breaking existing integrations. Located in the microsoft/agent-governance-toolkit repository, the component follows a deliberate stub design: because it never blocks, it can be deployed alongside existing systems, and its API is intended to stay stable when enforcement capabilities are added.

Core Architecture and Stub Design Philosophy

The implementation resides in agent-governance-python/agent-hypervisor/src/hypervisor/liability/quarantine.py and follows a stub pattern specifically engineered for public-preview safety. Unlike traditional quarantine systems that immediately block traffic or terminate processes, this manager exclusively handles event recording and querying.

In preview mode, is_quarantined() always returns False. Similarly, the active_quarantines property returns an empty list and quarantine_count reports zero, regardless of how many records have been logged. This design guarantees that existing demos and integrations, such as the OpenAI-agents examples, continue functioning while liability data is still captured for observability.
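The stub behavior described above can be illustrated with a small stand-in class. This is a minimal sketch, not the toolkit's source: the class names StubRecord and PreviewQuarantineStub are hypothetical, and only the method and property names (quarantine, is_quarantined, active_quarantines, quarantine_count, get_history) come from the text.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass, field


@dataclass
class StubRecord:
    """Hypothetical stand-in for the toolkit's QuarantineRecord."""
    agent_did: str
    session_id: str
    quarantine_id: str = field(default_factory=lambda: str(uuid.uuid4()))


class PreviewQuarantineStub:
    """Illustrative stub: records events but never reports an active quarantine."""

    def __init__(self) -> None:
        # In-memory store keyed by generated UUID strings.
        self._quarantines: dict[str, StubRecord] = {}

    def quarantine(self, agent_did: str, session_id: str) -> StubRecord:
        record = StubRecord(agent_did, session_id)
        self._quarantines[record.quarantine_id] = record  # kept for auditing
        return record

    def is_quarantined(self, agent_did: str, session_id: str) -> bool:
        return False  # preview mode: never block

    @property
    def active_quarantines(self) -> list[StubRecord]:
        return []  # nothing is reported as active

    @property
    def quarantine_count(self) -> int:
        return 0  # zero regardless of how many records exist

    def get_history(self) -> list[StubRecord]:
        return list(self._quarantines.values())


stub = PreviewQuarantineStub()
stub.quarantine("did:example:agent123", "sess-42")
print(stub.is_quarantined("did:example:agent123", "sess-42"))  # False
print(len(stub.get_history()))  # 1
```

The point of the pattern is that callers can already branch on is_quarantined() today; the branch simply never fires until a real implementation is swapped in.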

Key Components of the Quarantine System

QuarantineReason Enumeration

The QuarantineReason enum defined at lines 18-26 categorizes why an agent requires isolation:

BEHAVIORAL_DRIFT: Agent actions deviated from expected patterns
LIABILITY_VIOLATION: Causal attribution detected policy breaches
RING_BREACH: Attempted escalation beyond permission boundaries
RATE_LIMIT: Exceeded operational thresholds
MANUAL: Human administrator intervention
CASCADE_SLASH: Derived from dependent agent failures
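For orientation, an enum with these six members might look like the sketch below. The member names come from the list above; the class name QuarantineReasonSketch and the string values are assumptions made for illustration and may not match quarantine.py.

```python
from enum import Enum


class QuarantineReasonSketch(Enum):
    """Illustrative copy of the six reasons; the string values are assumed."""
    BEHAVIORAL_DRIFT = "behavioral_drift"
    LIABILITY_VIOLATION = "liability_violation"
    RING_BREACH = "ring_breach"
    RATE_LIMIT = "rate_limit"
    MANUAL = "manual"
    CASCADE_SLASH = "cascade_slash"


# Members can be looked up by name, which is handy when a policy engine
# passes the reason around as a string.
reason = QuarantineReasonSketch["RATE_LIMIT"]
print(reason.name, len(QuarantineReasonSketch))  # RATE_LIMIT 6
```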

QuarantineRecord Dataclass

Defined at lines 29-44, the QuarantineRecord captures forensic evidence with fields for quarantine_id, agent_did, session_id, reason, created_at, an optional expiry, released_at, and forensic_data. The record tracks whether it remains active and calculates duration_seconds using either the release timestamp or, if it has not been released, the current time.
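A record with these fields could be modeled roughly as follows. This is a sketch under stated assumptions: the class name RecordSketch, the field name expires_at (the text only says "optional expiry"), the use of timezone-aware datetimes, and the exact is_active rule are all guesses; only the duration rule (release timestamp, else current time) is taken from the description above.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Any, Optional


def _now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class RecordSketch:
    """Hypothetical approximation of QuarantineRecord."""
    agent_did: str
    session_id: str
    reason: str
    quarantine_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    created_at: datetime = field(default_factory=_now)
    expires_at: Optional[datetime] = None   # assumed name for the optional expiry
    released_at: Optional[datetime] = None
    forensic_data: dict[str, Any] = field(default_factory=dict)

    @property
    def is_active(self) -> bool:
        # Assumed rule: inactive once released or past its expiry.
        if self.released_at is not None:
            return False
        if self.expires_at is not None and _now() >= self.expires_at:
            return False
        return True

    @property
    def duration_seconds(self) -> float:
        # Use the release timestamp if present, otherwise the current time.
        end = self.released_at or _now()
        return (end - self.created_at).total_seconds()


start = datetime(2026, 1, 1, tzinfo=timezone.utc)
rec = RecordSketch(
    "did:example:agent123", "sess-42", "manual",
    created_at=start, released_at=start + timedelta(minutes=10),
)
print(rec.is_active, rec.duration_seconds)  # False 600.0
```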

QuarantineManager Public API

The manager class (lines 56-119) exposes the primary interface for hypervisor integration. It maintains an in-memory dictionary self._quarantines keyed by generated UUIDs, providing methods to create, release, and query isolation records without external dependencies.

Integration with the Hypervisor Liability Layer

The quarantine system operates as the final stage in the agent governance pipeline:

1. Agent Runtime: forwards step results to the hypervisor after execution
2. CausalAttributor: calculates liability scores in hypervisor/liability/attribution.py
3. Policy Decision: determines if thresholds warrant isolation
4. QuarantineManager.quarantine(): creates the audit record
5. Monitoring Consumption: get_history() feeds dashboards and compliance reports

This architecture decouples detection from enforcement, allowing organizations to run the toolkit in observation mode before enabling active blocking.
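The decoupling can be sketched as a decision function that only knows about a "record this event" callback, so the same detection code works whether the callback merely logs or, later, enforces. Everything here is hypothetical: handle_step, record_only, the 0.8 threshold, and the event ID format are illustrative names and values, not part of the toolkit.

```python
from __future__ import annotations

from typing import Callable, Optional

LIABILITY_THRESHOLD = 0.8  # hypothetical policy threshold

RecordFn = Callable[[str, str, float], str]


def handle_step(
    agent_did: str,
    session_id: str,
    liability_score: float,
    record_event: RecordFn,
    threshold: float = LIABILITY_THRESHOLD,
) -> Optional[str]:
    """Detection side: decide whether a step warrants isolation."""
    if liability_score < threshold:
        return None  # below threshold, nothing to record
    # Hand off to whatever recorder or enforcer was injected.
    return record_event(agent_did, session_id, liability_score)


audit_log: list[tuple[str, str, float]] = []


def record_only(agent_did: str, session_id: str, score: float) -> str:
    """Observation-mode recorder: appends to a log and blocks nothing."""
    audit_log.append((agent_did, session_id, score))
    return f"event-{len(audit_log)}"


print(handle_step("did:example:agent123", "sess-42", 0.35, record_only))  # None
print(handle_step("did:example:agent123", "sess-42", 0.92, record_only))  # event-1
```

In observation mode the agent keeps running after the second call; only the audit log changes.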

Practical Implementation Example

The following demonstrates logging and querying quarantine events using the actual API surface:

from hypervisor.liability.quarantine import QuarantineManager, QuarantineReason

# Construct a manager directly for this example
# (in a deployment the hypervisor container typically manages a singleton)
qm = QuarantineManager()

# Create quarantine record for liability violation
record = qm.quarantine(
    agent_did="did:example:agent123",
    session_id="sess-42",
    reason=QuarantineReason.LIABILITY_VIOLATION,
    details="Repeated policy breach in step 7",
    duration_seconds=600,  # Preview mode ignores enforcement
    forensic_data={"failed_actions": ["act-7", "act-9"]},
)

print(f"Quarantine logged with ID: {record.quarantine_id}")

# Audit specific agent history
history = qm.get_history(agent_did="did:example:agent123")
print(f"Total quarantine events: {len(history)}")

# Check current status (always False in preview)
if qm.is_quarantined("did:example:agent123", "sess-42"):
    print("Agent currently blocked")
else:
    print("Preview mode: no enforcement active")

# Release quarantine (returns None in preview)
qm.release("did:example:agent123", "sess-42")

Summary

Non-blocking stub: The QuarantineManager in quarantine.py records isolation events without enforcing restrictions in public-preview mode
Rich audit trail: QuarantineRecord captures forensic data including agent DID, session context, the reason enum, and duration calculations
Query API: get_history() provides filtered access to quarantine logs for compliance dashboards and policy engines
Safe defaults: Methods like is_quarantined() always return safe values (False, empty lists, zero counts) until enforcement mode activates
Future-proof: The API is designed to stay stable so that downstream code should not need changes when real isolation logic replaces the stub implementations

Frequently Asked Questions

What is the difference between preview mode and enforcement mode in QuarantineManager?

In preview mode, the QuarantineManager status queries return safe default values that never block execution: is_quarantined() returns False, active_quarantines returns an empty list, and quarantine_count returns zero. The system only logs events to an in-memory store. When the toolkit transitions to enforcement mode, these same methods are expected to return actual quarantine states and support blocking without requiring API changes in consuming code.

How does the QuarantineManager store and retrieve isolation records?

The manager maintains records in an internal dictionary self._quarantines keyed by generated UUID strings. When quarantine() is called, it creates a QuarantineRecord dataclass instance and stores it immediately. Retrieval occurs through get_history(), which supports optional filtering by agent_did or session_id parameters, or via get_active_quarantine() for specific agent-session combinations.
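The optional filtering can be pictured with a small helper over a dictionary of records. This is a minimal sketch: Event and filter_history are hypothetical names, and treating both filters as combinable (logical AND) is an assumption, since the text only says filtering by agent_did or session_id is supported.

```python
from __future__ import annotations

from typing import NamedTuple, Optional


class Event(NamedTuple):
    """Hypothetical, trimmed-down record with only the filterable fields."""
    quarantine_id: str
    agent_did: str
    session_id: str


def filter_history(
    records: dict[str, Event],
    agent_did: Optional[str] = None,
    session_id: Optional[str] = None,
) -> list[Event]:
    """Return all records, narrowed by whichever filters are provided."""
    results = list(records.values())
    if agent_did is not None:
        results = [r for r in results if r.agent_did == agent_did]
    if session_id is not None:
        results = [r for r in results if r.session_id == session_id]
    return results


store = {
    "q1": Event("q1", "did:example:agent123", "sess-42"),
    "q2": Event("q2", "did:example:agent123", "sess-43"),
    "q3": Event("q3", "did:example:agent999", "sess-42"),
}

print(len(filter_history(store)))                                    # 3
print(len(filter_history(store, agent_did="did:example:agent123")))  # 2
print(len(filter_history(store, session_id="sess-42")))              # 2
```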

Can I extend the QuarantineReason enum for custom policy violations?

The QuarantineReason enum in quarantine.py defines six standard categories, such as LIABILITY_VIOLATION and BEHAVIORAL_DRIFT. The quarantine() method types its reason parameter as QuarantineReason, so a custom reason has to be expressed through that enum. You can either extend the enum definition in the source file, or map your custom categories to the existing MANUAL reason and store the specific violation type in the forensic_data dictionary field.
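The second option, mapping to MANUAL and carrying the detail in forensic_data, might be wrapped in a helper like the one below. This is a sketch only: Reason is a stand-in enum with just the member needed here, and build_quarantine_kwargs and the "custom_reason" key are hypothetical names chosen for illustration.

```python
from __future__ import annotations

from enum import Enum
from typing import Any


class Reason(Enum):
    """Stand-in for QuarantineReason with only the member this sketch uses."""
    MANUAL = "manual"


def build_quarantine_kwargs(
    custom_category: str, evidence: dict[str, Any]
) -> dict[str, Any]:
    """Map a custom category onto MANUAL and keep the detail as forensic data."""
    return {
        "reason": Reason.MANUAL,
        # "custom_reason" is an illustrative key, not a toolkit convention.
        "forensic_data": {"custom_reason": custom_category, **evidence},
    }


kwargs = build_quarantine_kwargs(
    "PII_EXFILTRATION", {"failed_actions": ["act-7"]}
)
print(kwargs["reason"].name)                     # MANUAL
print(kwargs["forensic_data"]["custom_reason"])  # PII_EXFILTRATION
```

Dashboards can then group on the forensic_data key instead of on the enum, which avoids patching the toolkit source.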

Where should QuarantineManager be instantiated in a production deployment?

The toolkit initializes QuarantineManager as a singleton within the hypervisor container, typically exposed through agent-governance-python/agent-hypervisor/src/hypervisor/__init__.py. Downstream services should retrieve the instance from the hypervisor's dependency injection container rather than constructing it directly, ensuring consistent state across the liability attribution and policy enforcement pipeline.
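Why a shared instance matters can be shown with a tiny accessor. This sketch does not reproduce the hypervisor's container: ManagerStandIn and get_quarantine_manager are hypothetical names, and functools.lru_cache is used here simply as one common way to hand out a single process-wide instance.

```python
from __future__ import annotations

from functools import lru_cache


class ManagerStandIn:
    """Hypothetical stand-in holding in-memory state, like the real manager."""

    def __init__(self) -> None:
        self.events: list[str] = []


@lru_cache(maxsize=1)
def get_quarantine_manager() -> ManagerStandIn:
    """Create the manager on first use and return the same object afterwards."""
    return ManagerStandIn()


attribution_side = get_quarantine_manager()
dashboard_side = get_quarantine_manager()

attribution_side.events.append("quarantine:did:example:agent123")
print(dashboard_side.events)  # ['quarantine:did:example:agent123']
```

Because the records live in memory, two independently constructed managers would each see only their own events; sharing one instance keeps the audit trail complete.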
