Implementing QuarantineManager for Isolating Untrusted Agents in Microsoft's Agent Governance Toolkit
The Agent Governance Toolkit provides a QuarantineManager class in the hypervisor liability package that logs isolation events for agents violating policy or exceeding risk budgets, though in public-preview mode it acts as a non-blocking stub that records events without enforcing network or execution restrictions.
The QuarantineManager serves as the central auditing mechanism for agent governance, allowing downstream policy engines and monitoring dashboards to track untrusted behavior without breaking existing integrations. Located in the microsoft/agent-governance-toolkit repository, this component follows a deliberate stub design that ensures safe deployment in production environments while maintaining API stability for future enforcement capabilities.
Core Architecture and Stub Design Philosophy
The implementation resides in agent-governance-python/agent-hypervisor/src/hypervisor/liability/quarantine.py and follows a stub pattern specifically engineered for public-preview safety. Unlike traditional quarantine systems that immediately block traffic or terminate processes, this manager exclusively handles event recording and querying.
When checking if an agent is quarantined via is_quarantined(), the method always returns False in preview mode. Similarly, active_quarantines returns an empty list and quarantine_count reports zero regardless of the actual record history. This design guarantees that existing demos and integrations, such as the OpenAI-agents examples, continue functioning while still capturing liability data for observability purposes.
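To make the non-blocking behavior concrete, here is a minimal sketch of a caller that guards a step with is_quarantined(). StubQuarantineManager and run_step are hypothetical stand-ins written for this illustration, not classes from the toolkit; only the always-False return value is taken from the description above.

```python
class StubQuarantineManager:
    """Hypothetical stand-in that mimics the preview behavior described above."""

    def __init__(self):
        # Events are recorded for observability only.
        self.events = []

    def quarantine(self, agent_did, session_id):
        self.events.append((agent_did, session_id))

    def is_quarantined(self, agent_did, session_id):
        # Preview mode: never reports an agent as quarantined.
        return False


def run_step(qm, agent_did, session_id, step):
    """Guard a step with a quarantine check (hypothetical helper)."""
    if qm.is_quarantined(agent_did, session_id):
        return "blocked"
    return step()


qm = StubQuarantineManager()
qm.quarantine("did:example:agent123", "sess-42")
result = run_step(qm, "did:example:agent123", "sess-42", lambda: "step executed")
print(result)
```

Because the guard never trips, the event is logged while the step still runs, which is why existing integrations keep working.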
Key Components of the Quarantine System
QuarantineReason Enumeration
The QuarantineReason enum defined at lines 18-26 categorizes why an agent requires isolation:
- BEHAVIORAL_DRIFT: Agent actions deviated from expected patterns
- LIABILITY_VIOLATION: Causal attribution detected policy breaches
- RING_BREACH: Attempted escalation beyond permission boundaries
- RATE_LIMIT: Exceeded operational thresholds
- MANUAL: Human administrator intervention
- CASCADE_SLASH: Derived from dependent agent failures
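As a rough sketch, the six categories above could be modeled as a Python Enum. The member names come from the list; the string values are assumptions made for this illustration and may differ from what quarantine.py actually uses.

```python
from enum import Enum


class QuarantineReason(Enum):
    # Member names follow the article; the string values are assumed.
    BEHAVIORAL_DRIFT = "behavioral_drift"
    LIABILITY_VIOLATION = "liability_violation"
    RING_BREACH = "ring_breach"
    RATE_LIMIT = "rate_limit"
    MANUAL = "manual"
    CASCADE_SLASH = "cascade_slash"


# Enums support lookup by value, which helps when parsing stored audit logs.
parsed = QuarantineReason("manual")
names = [reason.name for reason in QuarantineReason]
print(parsed, len(names))
```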
QuarantineRecord Dataclass
Defined at lines 29-44, the QuarantineRecord captures forensic evidence with fields for quarantine_id, agent_did, session_id, reason, created_at, optional expiry, released_at, and forensic_data. The record tracks whether it remains active and calculates duration_seconds using either the release timestamp or, if the record has not been released, the current time.
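The record described above might look roughly like the following dataclass. This is a sketch under assumptions: the names expires_at and is_active are guesses for the "optional expiry" field and the active check, and reason is typed as a plain string only to keep the example self-contained.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Any, Optional


@dataclass
class QuarantineRecord:
    agent_did: str
    session_id: str
    reason: str  # a QuarantineReason in the toolkit; a string in this sketch
    quarantine_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    expires_at: Optional[datetime] = None  # assumed name for the optional expiry
    released_at: Optional[datetime] = None
    forensic_data: dict[str, Any] = field(default_factory=dict)

    @property
    def is_active(self) -> bool:
        # Assumed semantics: inactive once released or past the expiry.
        if self.released_at is not None:
            return False
        if self.expires_at is not None:
            return datetime.now(timezone.utc) < self.expires_at
        return True

    @property
    def duration_seconds(self) -> float:
        # Use the release timestamp if present, otherwise the current time.
        end = self.released_at or datetime.now(timezone.utc)
        return (end - self.created_at).total_seconds()


start = datetime(2024, 1, 1, tzinfo=timezone.utc)
released = QuarantineRecord(
    "did:example:agent123",
    "sess-42",
    "liability_violation",
    created_at=start,
    released_at=start + timedelta(seconds=90),
)
print(released.duration_seconds)  # 90.0
```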
QuarantineManager Public API
The manager class (lines 56-119) exposes the primary interface for hypervisor integration. It maintains an in-memory dictionary self._quarantines keyed by generated UUIDs, providing methods to create, release, and query isolation records without external dependencies.
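The storage model can be sketched as a dictionary keyed by UUID strings. InMemoryQuarantineLog below is a hypothetical simplification, not the toolkit's class: records are plain dicts, and only the keying scheme, the optional agent_did and session_id filters on get_history(), and the zero-valued preview count follow the article.

```python
import uuid


class InMemoryQuarantineLog:
    """Hypothetical sketch of UUID-keyed, dependency-free record storage."""

    def __init__(self):
        self._quarantines = {}

    def quarantine(self, agent_did, session_id, reason):
        quarantine_id = str(uuid.uuid4())
        record = {
            "quarantine_id": quarantine_id,
            "agent_did": agent_did,
            "session_id": session_id,
            "reason": reason,
        }
        self._quarantines[quarantine_id] = record
        return record

    def get_history(self, agent_did=None, session_id=None):
        # Each filter is optional; None means "do not filter on this field".
        return [
            r
            for r in self._quarantines.values()
            if (agent_did is None or r["agent_did"] == agent_did)
            and (session_id is None or r["session_id"] == session_id)
        ]

    @property
    def quarantine_count(self):
        # Preview stub: reports zero regardless of the stored history.
        return 0


log = InMemoryQuarantineLog()
log.quarantine("did:example:agent123", "sess-42", "rate_limit")
log.quarantine("did:example:agent123", "sess-43", "manual")
log.quarantine("did:example:agent999", "sess-42", "ring_breach")
print(len(log.get_history(agent_did="did:example:agent123")))  # 2
```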
Integration with the Hypervisor Liability Layer
The quarantine system operates as the final stage in the agent governance pipeline:
- Agent Runtime forwards step results to the hypervisor after execution
- CausalAttributor calculates liability scores in hypervisor/liability/attribution.py
- Policy Decision determines if thresholds warrant isolation
- QuarantineManager.quarantine() creates the audit record
- Monitoring Consumption via get_history() feeds dashboards and compliance reports
This architecture decouples detection from enforcement, allowing organizations to run the toolkit in observation mode before enabling active blocking.
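The separation between detection and enforcement can be illustrated with a small, self-contained pipeline. Everything here is hypothetical: the StepResult shape, the 0.25-per-breach scoring rule, and the 0.5 threshold are invented for the sketch and are not the CausalAttributor's actual logic.

```python
from dataclasses import dataclass


@dataclass
class StepResult:
    agent_did: str
    session_id: str
    policy_breaches: int


audit_log = []


def attribute_liability(step):
    # Hypothetical scoring rule: 0.25 per breach, capped at 1.0.
    return min(1.0, 0.25 * step.policy_breaches)


def should_quarantine(score, threshold=0.5):
    # Hypothetical policy decision based on a fixed threshold.
    return score >= threshold


def process(step):
    """Detect and record, but never block: the step result always passes through."""
    score = attribute_liability(step)
    if should_quarantine(score):
        audit_log.append(
            {"agent_did": step.agent_did, "session_id": step.session_id, "score": score}
        )
    return score


high = process(StepResult("did:example:agent123", "sess-42", policy_breaches=3))
low = process(StepResult("did:example:agent456", "sess-43", policy_breaches=1))
print(high, low, len(audit_log))
```

In observation mode the recording function is the only side effect, so a later switch to active blocking would only need to change what happens after the policy decision.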
Practical Implementation Example
The following demonstrates logging and querying quarantine events using the actual API surface:
from hypervisor.liability.quarantine import QuarantineManager, QuarantineReason

# Initialize singleton (typically managed by hypervisor container)
qm = QuarantineManager()

# Create quarantine record for liability violation
record = qm.quarantine(
    agent_did="did:example:agent123",
    session_id="sess-42",
    reason=QuarantineReason.LIABILITY_VIOLATION,
    details="Repeated policy breach in step 7",
    duration_seconds=600,  # Preview mode ignores enforcement
    forensic_data={"failed_actions": ["act-7", "act-9"]},
)
print(f"Quarantine logged with ID: {record.quarantine_id}")

# Audit specific agent history
history = qm.get_history(agent_did="did:example:agent123")
print(f"Total quarantine events: {len(history)}")

# Check current status (always False in preview)
if qm.is_quarantined("did:example:agent123", "sess-42"):
    print("Agent currently blocked")
else:
    print("Preview mode: no enforcement active")

# Release quarantine (returns None in preview)
qm.release("did:example:agent123", "sess-42")
Summary
- Non-blocking stub: The QuarantineManager in quarantine.py records isolation events without enforcing restrictions in public-preview mode
- Rich audit trail: QuarantineRecord captures forensic data including agent DID, session context, reasoning enums, and duration calculations
- Query API: get_history() provides filtered access to quarantine logs for compliance dashboards and policy engines
- Safe defaults: Methods like is_quarantined() always return safe values (False, empty lists, zero counts) until enforcement mode activates
- Future-proof: Stable API design ensures downstream code requires no changes when real isolation logic replaces stub implementations
Frequently Asked Questions
What is the difference between preview mode and enforcement mode in QuarantineManager?
In preview mode, all QuarantineManager methods return safe default values that never block execution: is_quarantined() returns False, active_quarantines returns an empty list, and quarantine_count returns zero. The system only logs events to an in-memory store. When the toolkit transitions to enforcement mode, these same methods will return actual quarantine states, allowing callers to block execution without requiring API changes in consuming code.
How does the QuarantineManager store and retrieve isolation records?
The manager maintains records in an internal dictionary self._quarantines keyed by generated UUID strings. When quarantine() is called, it creates a QuarantineRecord dataclass instance and stores it immediately. Retrieval occurs through get_history(), which supports optional filtering by agent_did or session_id parameters, or via get_active_quarantine() for specific agent-session combinations.
Can I extend the QuarantineReason enum for custom policy violations?
The QuarantineReason enum in quarantine.py defines six standard categories, such as LIABILITY_VIOLATION and BEHAVIORAL_DRIFT, and the quarantine() method accepts its reason parameter as a QuarantineReason. To add custom reasons, you would need to either extend the enum definition in the source file or map your custom categories to the existing MANUAL reason while storing the specific violation type in the forensic_data dictionary field.
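The second option, mapping custom categories onto MANUAL, could look something like this. The helper to_quarantine_args and the "custom_reason" key are hypothetical names chosen for the sketch, and the enum is a local stand-in holding only the member the example needs.

```python
from enum import Enum


class QuarantineReason(Enum):
    # Local stand-in with only the member this sketch needs; value is assumed.
    MANUAL = "manual"


def to_quarantine_args(custom_category, evidence):
    """Map a custom violation category onto MANUAL plus forensic_data (hypothetical helper)."""
    return {
        "reason": QuarantineReason.MANUAL,
        # "custom_reason" is an assumed key; any agreed-upon key would work.
        "forensic_data": {"custom_reason": custom_category, **evidence},
    }


kwargs = to_quarantine_args("pii_exfiltration", {"step": 7})
print(kwargs["reason"].name, kwargs["forensic_data"])
```

The resulting dictionary could then be unpacked into a call such as qm.quarantine(agent_did=..., session_id=..., **kwargs), keeping the custom category queryable from the audit trail without touching the enum.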
Where should QuarantineManager be instantiated in a production deployment?
The toolkit initializes QuarantineManager as a singleton within the hypervisor container, typically exposed through agent-governance-python/agent-hypervisor/src/hypervisor/__init__.py. Downstream services should retrieve the instance from the hypervisor's dependency injection container rather than constructing it directly, ensuring consistent state across the liability attribution and policy enforcement pipeline.