How to Implement Multi-Agent Orchestration with Handoff Patterns in the Microsoft Agent Framework

Use the HandoffBuilder class to configure autonomous agents that route control to one another via synthetic handoff tools, eliminating the need for a central coordinator.

The Microsoft Agent Framework provides a decentralized multi-agent orchestration pattern that lets specialized agents transfer conversations dynamically using handoff patterns. Unlike traditional group-chat orchestrators that rely on a central dispatcher, this pattern allows individual agents to invoke handoff_to_<target> tools and transfer control atomically.

Core Architecture of the Handoff Pattern

The handoff implementation consists of four primary components that work together to replace centralized coordination with agent-driven routing.

HandoffBuilder and Configuration

The HandoffBuilder class (defined in python/packages/orchestrations/agent_framework_orchestrations/_handoff.py) provides a fluent API for declaring participants and routing rules. When you call .add_handoff(source, targets), the builder creates HandoffConfiguration objects that document valid transitions between agents.

If you omit explicit routing rules, the builder automatically constructs a mesh topology where every agent can hand off to every other participant (lines 118-129 of _handoff.py).

Runtime Execution with HandoffAgentExecutor

When .build() executes, the framework wraps each Agent in a HandoffAgentExecutor (lines 40-62). This executor performs three critical functions:

  1. Tool Injection: It generates synthetic handoff tools via _apply_auto_tools (lines 13-22), decorating them with @tool(..., approval_mode="never_require") to ensure immediate execution without user confirmation.
  2. Middleware Interception: The _AutoHandoffMiddleware monitors tool calls and raises MiddlewareTermination when it detects a handoff request (lines 36-46).
  3. State Management: It implements on_checkpoint_save and on_checkpoint_restore to persist the autonomous-mode turn counter (lines 48-61).

Unified State Persistence

The OrchestrationState class (in python/packages/orchestrations/agent_framework_orchestrations/_orchestration_state.py) maintains the conversation history, round index, metadata, and primary task across the distributed workflow. This structure enables checkpointing so that long-running workflows can survive server restarts.

How Agent-to-Agent Handoff Routing Works

Understanding the runtime mechanics helps debug routing issues and optimize handoff decisions.

Synthetic Tool Generation

For each valid handoff target configured in the builder, _apply_auto_tools creates a synthetic function named handoff_to_<target_id>. These functions have no actual implementation; instead, the _AutoHandoffMiddleware.process method intercepts the call, injects a result payload containing {"handoff_to": <target_id>}, and terminates normal execution (lines 36-46).

Detecting and Executing Transfers

After each agent response, HandoffAgentExecutor._is_handoff_requested inspects the last message for a function_result payload containing the HANDOFF_FUNCTION_RESULT_KEY ("handoff_to") (lines 92-121). When detected, the executor:

  1. Extracts the target agent ID
  2. Sends an AgentExecutorRequest directly to the target via ctx.send_message(..., target_id=handoff_target) (lines 124-132)
  3. Emits a handoff_sent event for monitoring

The underlying WorkflowBuilder maintains a fully-connected fan-out graph that broadcasts all responses to every participant, ensuring conversation histories remain synchronized across the mesh.

Autonomous Mode and Checkpointing

The framework supports autonomous execution where agents can perform multi-turn reasoning without interrupting the user.

Enabling Autonomous Workflows

When you configure .with_autonomous_mode(agents=[...]), the HandoffAgentExecutor enters an internal loop after each response. It continues executing until either:

  • A handoff tool is invoked (transferring control)
  • The turn limit is reached
  • A termination condition is met (lines 331-344)

This pattern is essential for research agents or coding assistants that need to iterate internally before presenting results or escalating to a specialist.

Persistent State Management

The InMemoryCheckpointStorage (or custom CheckpointStorage implementations) serializes the OrchestrationState after each turn. By passing a consistent session_id to workflow.run(), you can resume interrupted conversations across process restarts—a critical requirement for customer support tickets or long-running analysis tasks.

Complete Implementation Example

Below is a runnable customer support workflow that demonstrates explicit routing, autonomous mode, and checkpoint persistence.

from agent_framework import OpenAIChatClient
from agent_framework.orchestrations import HandoffBuilder, InMemoryCheckpointStorage

# Initialize the LLM client

client = OpenAIChatClient()

# Define specialized agents with distinct system prompts

triage = client.as_agent(
    instructions="You are a triage agent. Identify if the user needs a refund or order status.",
    name="triage_agent",
)

refund = client.as_agent(
    instructions="You handle refund requests. Provide refund amount and steps.",
    name="refund_agent",
)

status = client.as_agent(
    instructions="You provide order shipping status and tracking numbers.",
    name="status_agent",
)

# Configure the handoff workflow with checkpointing

storage = InMemoryCheckpointStorage()

workflow = (
    HandoffBuilder(participants=[triage, refund, status])
    .add_handoff(triage, [refund, status])  # Explicit routing from triage

    .with_start_agent(triage)              # Entry point agent

    .with_autonomous_mode(agents=[triage])  # Let triage work without prompting

    .with_checkpointing(storage)            # Enable state persistence

    .with_termination_condition(
        lambda conv: any("goodbye" in msg.contents[0].lower() for msg in conv[-2:])
        or len(conv) > 30
    )
    .build()
)

# Execute with streaming events

async def run_support_ticket():
    async for event in workflow.run(
        "I bought a phone and it arrived broken. I want a refund.",
        session_id="ticket_789",
        stream=True,
    ):
        if event.type == "message":
            print(f"Agent: {event.data}")
        elif event.type == "handoff_sent":
            print(f"Routing: {event.data.source}{event.data.target}")

# Resume later using the same session_id

# async for event in workflow.run("Any update?", session_id="ticket_789", stream=True):

#     ...

Key Implementation Details

  • Explicit Routing: The add_handoff(triage, [refund, status]) call (lines 101-118 of _handoff.py) restricts the triage agent to specific specialists rather than the default mesh topology.
  • Autonomous Processing: The triage agent runs internally until it decides to transfer control, checking termination conditions after each turn (lines 26-45).
  • State Recovery: The session_id parameter binds the execution to a specific OrchestrationState instance in the checkpoint storage, enabling resumable workflows across server restarts.

Summary

  • HandoffBuilder is the fluent entry point for configuring multi-agent routing topologies without central coordination.
  • Synthetic handoff tools are automatically injected into each agent and intercepted by _AutoHandoffMiddleware to trigger routing.
  • Autonomous mode allows agents to perform multi-turn work internally before handing off, controlled by turn limits and termination conditions.
  • Checkpointing via OrchestrationState and InMemoryCheckpointStorage makes workflows durable and resumable across process restarts.
  • The architecture uses a decentralized mesh where agents decide when to transfer control, while the underlying graph keeps all participants synchronized.

Frequently Asked Questions

How does the Agent Framework handle handoff routing without a central orchestrator?

The framework injects synthetic handoff_to_<target_id> tools into each agent's toolkit via HandoffAgentExecutor._apply_auto_tools (lines 13-22 of _handoff.py). When an agent invokes one of these tools, the _AutoHandoffMiddleware intercepts the call, raises MiddlewareTermination, and the executor routes the conversation directly to the target agent using ctx.send_message(..., target_id=handoff_target) (lines 124-132).

What is the difference between explicit handoff routing and the default mesh topology?

Explicit routing requires calling .add_handoff(source, [targets]) to define valid transitions. If you skip this step, the HandoffBuilder automatically creates a mesh topology where every participant can hand off to every other participant (lines 118-129 of _handoff.py). Explicit routing is recommended for production workflows to prevent agents from transferring to inappropriate specialists.

How do I resume a handoff workflow after a server restart?

Pass a consistent session_id string to workflow.run() and configure checkpointing via .with_checkpointing(storage). The HandoffAgentExecutor implements on_checkpoint_save and on_checkpoint_restore (lines 48-61) to serialize the autonomous-mode counter and conversation state. On restart, using the same session_id loads the previous OrchestrationState and continues execution from the last completed turn.

Can agents perform multiple steps before handing off to another agent?

Yes. Enable autonomous mode by calling .with_autonomous_mode(agents=[agent_name]) on the builder. This configures the HandoffAgentExecutor to loop internally after each response, allowing the agent to continue reasoning until it invokes a handoff tool, hits the turn limit, or triggers a termination condition (lines 331-344 of _handoff.py).

Have a question about this repo?

These articles cover the highlights, but your codebase questions are specific. Give your agent direct access to the source. Share this with your agent to get started:

Share the following with your agent to get started:
curl -s "https://instagit.com/install.md"

Works with
Claude Codex Cursor VS Code OpenClaw Any MCP Client

Maintain an open-source project? Get it listed too →