Running AutoGen Agents with Distributed Runtime Across Multiple Machines: A Complete Guide

March 7, 2026 microsoft/autogen

AutoGen's distributed runtime uses a gRPC-based gateway to let you deploy agents across multiple machines while maintaining the same AgentRuntime API and message-handler semantics as single-process execution.

The microsoft/autogen repository provides a distributed agent runtime, which the project documentation currently describes as experimental, that lets you run agents across multiple machines without rewriting your agent logic. With its gRPC host-worker architecture, you can scale agent workloads horizontally while keeping the send_message, publish_message, and subscription patterns used in single-process applications.

Architecture of the Distributed Agent Runtime

AutoGen's distributed system abstracts networking complexity through a clean separation between a central gRPC host and lightweight worker runtimes.

Host and Worker Components

The architecture consists of two primary components defined in the autogen_ext.runtimes.grpc package:

GrpcWorkerAgentRuntimeHost – A lightweight gRPC server implemented in [_worker_runtime_host.py](https://github.com/microsoft/autogen/blob/main/python/packages/autogen-ext/src/autogen_ext/runtimes/grpc/_worker_runtime_host.py) that binds to a network address and brokers messages between workers. It exposes start(), stop(), and stop_when_signal() for lifecycle management.
GrpcWorkerAgentRuntime – A client-side runtime implemented in [_worker_runtime.py](https://github.com/microsoft/autogen/blob/main/python/packages/autogen-ext/src/autogen_ext/runtimes/grpc/_worker_runtime.py) that connects to the host, registers local agents, and maintains a read loop processing request, response, and cloudEvent messages.

Both components share the protobuf contracts defined in protos/agent_worker.proto. GrpcWorkerAgentRuntime implements the AgentRuntime protocol defined in [_agent_runtime.py](https://github.com/microsoft/autogen/blob/main/python/packages/autogen-core/src/autogen_core/_agent_runtime.py), the same interface the single-process runtime implements. The host is a standalone server that brokers messages and does not run agents itself.

Message Flow Across the Network

A message published in a distributed deployment follows the same path every time:

Agent to Runtime – An agent calls publish_message(). The GrpcWorkerAgentRuntime serializes the payload, wraps it in a CloudEvent, and sends it to the host over its gRPC channel. Direct send_message() calls travel as RpcRequest and RpcResponse messages instead.
Host to Servicer – The GrpcWorkerAgentRuntimeHostServicer (implemented in [_worker_runtime_host_servicer.py](https://github.com/microsoft/autogen/blob/main/python/packages/autogen-ext/src/autogen_ext/runtimes/grpc/_worker_runtime_host_servicer.py)) receives the message on the host side.
Servicer to Subscribers – The servicer matches the event's topic against the registered subscriptions and forwards the event to every worker that hosts a subscribed agent type.
Worker to Agent – Remote workers receive a Message of type cloudEvent. The worker's read loop processes the event and invokes the corresponding message_handler on the target agent.

Because both local and distributed runtimes implement the same AgentRuntime interface, you use identical @message_handler decorators and agent logic regardless of deployment topology.
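The routing step can be pictured with a small in-memory model. This is a sketch for intuition only, not AutoGen's implementation: ToyHost, ToyWorker, and Event are hypothetical names, there is no network or serialization, and the real servicer handles many more cases.

```python
import asyncio
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Event:
    topic_type: str
    payload: str


class ToyWorker:
    """Stands in for a worker runtime: holds async handlers keyed by agent type."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.handlers = {}

    def register(self, agent_type, handler) -> None:
        self.handlers[agent_type] = handler

    async def deliver(self, agent_type: str, event: Event) -> None:
        await self.handlers[agent_type](event)


class ToyHost:
    """Stands in for the host: maps topic types to (worker, agent_type) pairs."""

    def __init__(self) -> None:
        self._subs = defaultdict(list)

    def subscribe(self, topic_type: str, worker: ToyWorker, agent_type: str) -> None:
        self._subs[topic_type].append((worker, agent_type))

    async def publish(self, event: Event) -> int:
        # Forward the event to every subscriber of its topic type, in order.
        targets = list(self._subs.get(event.topic_type, []))
        for worker, agent_type in targets:
            await worker.deliver(agent_type, event)
        return len(targets)


async def demo() -> list[str]:
    log: list[str] = []
    host = ToyHost()
    for name in ("worker-a", "worker-b"):
        worker = ToyWorker(name)

        async def handler(event: Event, name: str = name) -> None:
            log.append(f"{name} got {event.payload!r}")

        worker.register("echo", handler)
        host.subscribe("chat", worker, "echo")
    await host.publish(Event(topic_type="chat", payload="hello"))
    await host.publish(Event(topic_type="unused", payload="dropped"))
    return log
```

Calling asyncio.run(demo()) delivers the "chat" event to both toy workers, while the event on the topic with no subscribers reaches nobody.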

Setting Up the gRPC Host Server

Start by deploying the gRPC host on a machine that every worker can reach over the network. A private address is sufficient as long as the workers can route to it.

Create a host script that instantiates GrpcWorkerAgentRuntimeHost and binds to all interfaces:


# host.py

import asyncio

from autogen_ext.runtimes.grpc import GrpcWorkerAgentRuntimeHost


async def main() -> None:
    # Bind to all interfaces so workers on other machines can connect
    host = GrpcWorkerAgentRuntimeHost(address="0.0.0.0:50051")
    # start() needs a running event loop, so call it from inside a coroutine
    host.start()
    # Block until SIGINT (Ctrl-C) or SIGTERM is received, then stop the host
    await host.stop_when_signal()


if __name__ == "__main__":
    asyncio.run(main())

The stop_when_signal() method waits until the process receives SIGINT or SIGTERM and then stops the host, so the server shuts down in an orderly way instead of being killed abruptly.
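The wait-for-signal idea is a general asyncio pattern and can be sketched without any gRPC code. The SignalStopper class below is hypothetical and is not how AutoGen implements stop_when_signal(); it only illustrates waiting on an event that a signal handler sets and then running a stop callback.

```python
import asyncio
import signal


class SignalStopper:
    """Waits for a shutdown request, then runs an async stop callback."""

    def __init__(self, stop_callback) -> None:
        self._stop_callback = stop_callback
        self._requested = asyncio.Event()

    def request_stop(self) -> None:
        self._requested.set()

    def install(self, signals=(signal.SIGINT, signal.SIGTERM)) -> None:
        # loop.add_signal_handler is available on Unix event loops only.
        loop = asyncio.get_running_loop()
        for sig in signals:
            loop.add_signal_handler(sig, self.request_stop)

    async def wait_and_stop(self) -> None:
        await self._requested.wait()
        await self._stop_callback()


async def demo() -> list[str]:
    events: list[str] = []

    async def stop_server() -> None:
        events.append("server stopped")

    stopper = SignalStopper(stop_server)
    # Simulate Ctrl-C shortly after startup instead of sending a real signal.
    asyncio.get_running_loop().call_later(0.01, stopper.request_stop)
    events.append("serving")
    await stopper.wait_and_stop()
    return events
```

In a real process you would call install() once after startup and then await wait_and_stop(), which is roughly the role stop_when_signal() plays in the host script.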

Connecting Worker Runtimes from Remote Machines

Once the host is running, connect worker runtimes from other machines using GrpcWorkerAgentRuntime. Workers register agent factories and subscribe to topics exactly as they would with a local runtime.

Here is a minimal worker implementation that registers an echo agent:


# worker.py

import asyncio

from autogen_core import (
    MessageContext,
    RoutedAgent,
    TypeSubscription,
    message_handler,
    try_get_known_serializers_for_type,
)
from autogen_core.models import AssistantMessage, UserMessage
from autogen_ext.runtimes.grpc import GrpcWorkerAgentRuntime


class EchoAgent(RoutedAgent):
    """Echoes back the received user message in upper case."""

    def __init__(self) -> None:
        super().__init__("Echo")

    @message_handler
    async def handle_user(self, msg: UserMessage, ctx: MessageContext) -> None:
        # UserMessage.content may also be a list of parts; only echo plain text
        if not isinstance(msg.content, str) or ctx.topic_id is None:
            return
        reply = AssistantMessage(content=msg.content.upper(), source="echo")
        await self.publish_message(reply, ctx.topic_id)  # reply on the same topic


async def main() -> None:
    # Connect to the host started above
    runtime = GrpcWorkerAgentRuntime(host_address="host.example.com:50051")
    await runtime.start()

    # Register a serializer for the reply type, which no handler declares
    runtime.add_message_serializer(try_get_known_serializers_for_type(AssistantMessage))
    # Register the agent type and subscribe it to the "chat" topic
    await EchoAgent.register(runtime, "echo", lambda: EchoAgent())
    await runtime.add_subscription(TypeSubscription(topic_type="chat", agent_type="echo"))

    # Keep the runtime alive until SIGINT or SIGTERM
    await runtime.stop_when_signal()


if __name__ == "__main__":
    asyncio.run(main())

The GrpcWorkerAgentRuntime handles connection retries, maintains a read loop for incoming messages, and serializes payloads using the same registry as the host. You can deploy this worker script on any machine that has network access to the host's port 50051.
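Before starting a worker on a new machine, it can help to confirm that the host's port is reachable at all. The helper below is a minimal sketch using only the standard library; can_reach and split_address are hypothetical names, and a successful TCP connection only shows that something is listening on the port, not that it is an AutoGen host.

```python
import socket


def split_address(address: str) -> tuple[str, int]:
    """Split a 'host:port' string, the form used for host_address, into parts."""
    host, _, port = address.rpartition(":")
    if not host or not port.isdigit():
        raise ValueError(f"expected 'host:port', got {address!r}")
    return host, int(port)


def can_reach(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, can_reach(*split_address("host.example.com:50051")) returning False usually points to a firewall rule, a wrong address, or a host process that is not running.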

Running the Distributed Group Chat Sample

The microsoft/autogen repository includes a complete demonstration of the distributed runtime in the core_distributed-group-chat sample. This example runs a writer, an editor, a group-chat manager, and a UI agent in separate processes that all connect to one host.

Execute the sample by launching each component in a separate terminal:


# Terminal 1 – start the host
python python/samples/core_distributed-group-chat/run_host.py

# Terminal 2 – start the UI agent (a separate worker process that connects to the host)
python python/samples/core_distributed-group-chat/run_ui.py

# Terminal 3 – start writer agent (remote worker)
python python/samples/core_distributed-group-chat/run_writer_agent.py

# Terminal 4 – start editor agent (remote worker)
python python/samples/core_distributed-group-chat/run_editor_agent.py

# Terminal 5 – start the group-chat manager (remote worker)
python python/samples/core_distributed-group-chat/run_group_chat_manager.py

Each script imports either GrpcWorkerAgentRuntimeHost or GrpcWorkerAgentRuntime and uses the standard message_handler pattern. The sample source files demonstrate how to structure agent factories, handle topics, and manage distributed state in a real-world scenario.

Key Implementation Files and Source References

These source files define the core behavior of the distributed runtime:

File	Purpose
`python/packages/autogen-ext/src/autogen_ext/runtimes/grpc/_worker_runtime_host.py`	Implements `GrpcWorkerAgentRuntimeHost`, the gRPC server that brokers messages between workers.
`python/packages/autogen-ext/src/autogen_ext/runtimes/grpc/_worker_runtime_host_servicer.py`	Implements `GrpcWorkerAgentRuntimeHostServicer`, handling incoming RPC calls and forwarding to the host's router.
`python/packages/autogen-ext/src/autogen_ext/runtimes/grpc/_worker_runtime.py`	Implements `GrpcWorkerAgentRuntime`, the client-side runtime that connects to the host and manages the read loop.
`python/packages/autogen-core/src/autogen_core/_agent_runtime.py`	Defines the `AgentRuntime` protocol implemented by both the local runtime and the gRPC worker runtime.
`protos/agent_worker.proto`	Protobuf definitions for `Message`, `RpcRequest`, `RpcResponse`, and `CloudEvent` wire formats.
`python/samples/core_distributed-group-chat/_agents.py`	Agent implementations demonstrating distributed patterns.
`python/samples/core_distributed-group-chat/run_host.py`	Entry point for launching the gRPC host.
`python/samples/core_distributed-group-chat/run_*.py`	Worker launch scripts for individual agents.

Summary

The distributed runtime enables horizontal scaling of agent workloads while preserving the single-process programming model. Key takeaways include:

GrpcWorkerAgentRuntimeHost acts as a central message broker using gRPC, implemented in _worker_runtime_host.py.
GrpcWorkerAgentRuntime connects remote workers to the host, providing the same AgentRuntime API as local execution.
Agents use identical @message_handler decorators regardless of whether they run in-process or across the network.
The core_distributed-group-chat sample demonstrates a complete multi-machine deployment with writer, editor, manager, and UI agents.
Protobuf contracts in agent_worker.proto enable cross-language interoperability between Python and .NET agents.

Frequently Asked Questions

Do I need to install extra dependencies for the distributed runtime?

Yes. The gRPC runtime ships in the autogen-ext package behind an optional extra. Install it with pip install "autogen-ext[grpc]", which pulls in grpcio, the gRPC client and server library that GrpcWorkerAgentRuntime and GrpcWorkerAgentRuntimeHost depend on. grpcio-tools is needed only if you regenerate the protobuf stubs yourself.

Can I mix .NET and Python agents in the same distributed deployment?

Absolutely. The host can be a .NET process using Microsoft.AutoGen.RuntimeGateway.Grpc while workers run Python (or vice versa) as long as both use the same protobuf contracts defined in agent_worker.proto and cloudevent.proto. The serialization registry handles JSON or protobuf payloads transparently across language boundaries.

How do I choose between JSON and Protobuf payload serialization?

When constructing a GrpcWorkerAgentRuntime, set the payload_serialization_format argument to either JSON_DATA_CONTENT_TYPE (the default) or PROTOBUF_DATA_CONTENT_TYPE. JSON is human-readable and easier to debug, while Protobuf offers better performance and smaller payloads for large binary data or high-throughput scenarios.
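To see why JSON payloads are easy to inspect, here is a minimal standard-library sketch of a message round trip. ChatMessage and the two helper functions are hypothetical and do not use AutoGen's serializer classes; a Protobuf payload would instead require message classes generated from a .proto file.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ChatMessage:  # hypothetical message type for illustration
    sender: str
    body: str


def to_json_payload(msg: ChatMessage) -> bytes:
    """Encode the message as compact UTF-8 JSON bytes."""
    return json.dumps(asdict(msg), separators=(",", ":")).encode("utf-8")


def from_json_payload(data: bytes) -> ChatMessage:
    """Decode JSON bytes back into a ChatMessage."""
    return ChatMessage(**json.loads(data.decode("utf-8")))
```

The encoded bytes can be read directly in a log or a packet capture, which is the debugging advantage mentioned above.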

What happens if a worker runtime crashes or disconnects?

The host logs the error for the failed connection and continues serving the other connected workers. Workers can be restarted independently; when a worker script starts again, it reconnects and registers its agents and subscriptions as part of its normal startup. In-flight messages to the crashed worker may be lost unless you implement application-level acknowledgment patterns.
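One way to picture an application-level acknowledgment pattern is a sender that keeps each message until it is acknowledged, paired with a receiver that ignores duplicates. The sketch below is a plain-Python illustration under those assumptions; Envelope, AckingSender, and DedupingReceiver are hypothetical names, and in a real deployment the acknowledgment would itself be a message sent back through the runtime.

```python
import itertools
from dataclasses import dataclass


@dataclass
class Envelope:
    msg_id: int
    body: str


class AckingSender:
    """Keeps every message until the receiver acknowledges its id."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self.pending: dict[int, Envelope] = {}

    def send(self, body: str) -> Envelope:
        env = Envelope(next(self._ids), body)
        self.pending[env.msg_id] = env
        return env

    def ack(self, msg_id: int) -> None:
        self.pending.pop(msg_id, None)

    def unacked(self) -> list[Envelope]:
        """Messages to resend, for example after the worker reconnects."""
        return list(self.pending.values())


class DedupingReceiver:
    """Processes each message id at most once, so resends are harmless."""

    def __init__(self) -> None:
        self.seen: set[int] = set()
        self.processed: list[str] = []

    def receive(self, env: Envelope) -> int:
        if env.msg_id not in self.seen:
            self.seen.add(env.msg_id)
            self.processed.append(env.body)
        return env.msg_id  # the id doubles as the acknowledgment
```

If a message is sent while the worker is down, it stays in pending; resending everything returned by unacked() after the worker comes back delivers it, and the receiver's id check keeps already-processed messages from running twice.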
