Migrating from AutoGen v0.2 to the New AgentChat API: A Complete Guide

March 7, 2026 microsoft/autogen

AutoGen v0.4 replaces the monolithic, synchronous agent classes of v0.2 with an async, component-based AgentChat API built on top of the Core event-driven framework, requiring explicit model clients, async message handling, and new state persistence patterns.

The microsoft/autogen repository has undergone a fundamental architectural redesign in version 0.4. The migration from AutoGen v0.2 to the new AgentChat API involves moving from history-driven, tightly coupled agents to modular, event-driven components that interact through async message streams. This guide provides concrete steps and code examples derived from the official migration documentation and source implementation to help you upgrade your codebase.

Architecture Overview: v0.2 vs. v0.4

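The new design separates concerns across three distinct layers (Core, AgentChat, and Extensions), replacing implicit coupling with explicit component dependencies. The table also compares state management, which cuts across those layers.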

Layer	v0.2 Implementation	v0.4 Implementation
Core API	Implicit and tightly coupled to `autogen.agentchat`	Exposed as `autogen_core` with abstractions like `ChatCompletionClient`, `Memory`, and `CancellationToken`
AgentChat API	Monolithic classes like `GroupChat` and `UserProxyAgent` in `autogen.agentchat`	Component-based agents in `autogen_agentchat` such as `AssistantAgent`, `RoundRobinGroupChat`, and `SelectorGroupChat`
Extensions	Scattered across `autogen.ext` with ad-hoc registration	Clean component model where each extension implements the Component protocol and loads via `load_component`
State Management	Manual export/import of `chat_messages` dictionaries	Uniform async `save_state` and `load_state` methods on both agents and teams

According to the source code in python/packages/autogen-agentchat/src/autogen_agentchat/agents/_assistant_agent.py (lines 70-88), an agent’s configuration is now a Pydantic model (AssistantAgentConfig) that can be serialized to JSON or YAML and reinstantiated later. This reflects the v0.4 principle of separating configuration data from runtime execution.
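To make the "configuration as data" idea concrete, here is a minimal sketch of the round trip in plain Python. It uses a dataclass instead of Pydantic so that it runs without extra dependencies; `AgentConfig`, `Agent`, `dump`, and `load` are hypothetical names chosen for illustration and are not AutoGen's API.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AgentConfig:
    """Hypothetical config record; AutoGen's AssistantAgentConfig has more fields."""

    name: str
    system_message: str


class Agent:
    """Runtime object that can be rebuilt from its config alone."""

    def __init__(self, config: AgentConfig) -> None:
        self.config = config

    def dump(self) -> str:
        # Serialize only the configuration data, never live runtime objects.
        return json.dumps(asdict(self.config))

    @classmethod
    def load(cls, payload: str) -> "Agent":
        return cls(AgentConfig(**json.loads(payload)))


original = Agent(AgentConfig(name="assistant", system_message="You are helpful."))
clone = Agent.load(original.dump())
```

Because the configuration is a plain record, it can be stored in version control or sent over the wire, and a fresh runtime object can be built from it at any time.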

Core API Changes Required for Migration

Model Client Configuration

In v0.2, you used OpenAIWrapper with a config_list parameter that handled automatic failover between model endpoints. In v0.4, you must explicitly instantiate a ChatCompletionClient component.

The migration guide in python/docs/src/user-guide/agentchat-user-guide/migration-guide.md (lines 78-92) demonstrates that you now either construct the client directly or load it via the component system:

from autogen_core.models import ChatCompletionClient

config = {
    "provider": "OpenAIChatCompletionClient",
    "config": {
        "model": "gpt-4o",
        "api_key": "sk-xxxxxxxx"
    }
}
model_client = ChatCompletionClient.load_component(config)

This change gives you precise control over the model used and removes the hidden fallback logic present in v0.2.
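If your v0.2 code relied on config_list failover, that behavior now has to be written explicitly. The sketch below shows one possible shape for it in plain Python. `FailoverClient` and `FakeClient` are hypothetical stand-ins, not AutoGen classes, and the single `create` method is a simplifying assumption; a real wrapper would need to implement the full ChatCompletionClient interface.

```python
import asyncio
from typing import Any, Optional, Sequence


class FakeClient:
    """Stand-in for a model client; NOT an AutoGen class."""

    def __init__(self, name: str, fail: bool = False) -> None:
        self.name = name
        self.fail = fail

    async def create(self, messages: Sequence[Any]) -> str:
        if self.fail:
            raise ConnectionError(f"{self.name} is unavailable")
        return f"{self.name} answered {len(messages)} message(s)"


class FailoverClient:
    """Hypothetical wrapper: try each client in order until one succeeds."""

    def __init__(self, clients: Sequence[FakeClient]) -> None:
        if not clients:
            raise ValueError("at least one client is required")
        self.clients = list(clients)

    async def create(self, messages: Sequence[Any]) -> str:
        last_error: Optional[Exception] = None
        for client in self.clients:
            try:
                return await client.create(messages)
            except Exception as exc:  # narrow this to transport errors in real code
                last_error = exc
        raise RuntimeError("all clients failed") from last_error


async def demo_failover() -> str:
    client = FailoverClient([FakeClient("primary", fail=True), FakeClient("backup")])
    return await client.create(["Hello"])
```

Writing the fallback yourself keeps the order of endpoints and the set of retried errors visible in your own code rather than hidden in the framework.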

Agent Creation Patterns

The AssistantAgent constructor signature has changed significantly. Instead of passing an llm_config dictionary, you now inject a model_client object as a first-class dependency.

v0.2: AssistantAgent(name, system_message, llm_config=...)
v0.4: AssistantAgent(name, system_message, model_client=...)

As implemented in microsoft/autogen, the agent no longer owns the LLM configuration; it receives an already configured client, enabling better testability and separation of concerns.

Async Message Handling

All agent interactions are now asynchronous. You must replace synchronous methods like agent.send() or initiate_chat() with async equivalents.

According to python/docs/src/user-guide/agentchat-user-guide/migration-guide.md (lines 96-107), use on_messages for standard requests or on_messages_stream for streaming responses:

from autogen_agentchat.messages import TextMessage
from autogen_core import CancellationToken

# Async interaction pattern (must run inside an async function)
response = await assistant.on_messages(
    [TextMessage(content="Hello", source="user")],
    CancellationToken()
)

This async design enables non-blocking UI updates and proper streaming support.
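The two methods differ in how they are consumed: a coroutine is awaited once, while a stream is iterated with `async for`. The sketch below illustrates only that calling convention. `EchoAgent` is a hypothetical stand-in, not an AutoGen class, and it omits the CancellationToken argument for brevity.

```python
import asyncio
from typing import AsyncIterator, List


class EchoAgent:
    """Stand-in agent, NOT an AutoGen class; it only mimics the calling convention."""

    async def on_messages(self, messages: List[str]) -> str:
        # A coroutine: the caller awaits it and gets one final result.
        return "echo: " + " ".join(messages)

    async def on_messages_stream(self, messages: List[str]) -> AsyncIterator[str]:
        # An async generator: the caller iterates it with `async for`.
        for word in " ".join(messages).split():
            yield word
        yield await self.on_messages(messages)  # final item is the full response


async def main() -> List[str]:
    agent = EchoAgent()
    final = await agent.on_messages(["Hello", "world"])
    streamed = [item async for item in agent.on_messages_stream(["Hello", "world"])]
    assert streamed[-1] == final
    return streamed


if __name__ == "__main__":
    # A synchronous script needs exactly one entry point into the event loop.
    print(asyncio.run(main()))
```

A previously synchronous v0.2 script can usually keep its overall structure by moving its body into an `async def main()` and calling `asyncio.run(main())` once at the bottom.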

Tool Calling and Reflection

The v0.2 pattern required a two-agent setup (caller plus executor) using register_function. In v0.4, a single AssistantAgent handles tool use directly through its tools parameter.

The migration guide (lines 66-71) shows that the model decides when to call tools, and you can enable a reflection pass via reflect_on_tool_use=True:

def get_weather(city: str) -> str:
    return f"The weather in {city} is 72°F."

assistant = AssistantAgent(
    name="assistant",
    system_message="You are a helpful assistant.",
    model_client=model_client,
    tools=[get_weather],
    reflect_on_tool_use=True,  # Optional second model pass for reflection
)

This simplifies orchestration and removes the need for separate proxy agents.

Group Chat Orchestration

Replace GroupChat and GroupChatManager with first-class team objects. The migration guide (lines 52-58) specifies two primary implementations:

RoundRobinGroupChat: Cycles through agents in fixed order
SelectorGroupChat: Uses a selector function or model to choose the next speaker

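Teams expose the same run() and run_stream() interface as individual agents and accept termination conditions. Use run() to await the final result, or iterate over run_stream() with async for to receive messages as they are produced.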
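The difference between the two team types comes down to the policy for choosing the next speaker. The sketch below models both policies in plain Python; the function names and the fallback rule are assumptions made for illustration and do not reproduce AutoGen's implementation.

```python
from itertools import cycle, islice
from typing import Callable, List, Optional, Sequence


def round_robin_order(agents: Sequence[str], turns: int) -> List[str]:
    """Fixed-order policy: cycle through the agents, ignoring the conversation."""
    return list(islice(cycle(agents), turns))


def selector_order(
    agents: Sequence[str],
    turns: int,
    selector: Callable[[List[str]], Optional[str]],
) -> List[str]:
    """Selector policy: a function inspects the speakers so far and picks the next.

    An unknown or missing choice falls back to the first agent here; this
    fallback is an assumption of the sketch (a real team could ask a model).
    """
    history: List[str] = []
    for _ in range(turns):
        choice = selector(history)
        history.append(choice if choice in agents else agents[0])
    return history


def critic_after_writer(history: List[str]) -> Optional[str]:
    # Hypothetical rule: the critic always speaks right after the writer.
    if history and history[-1] == "writer":
        return "critic"
    return "writer"
```

For example, `selector_order(["writer", "critic", "editor"], 4, critic_after_writer)` alternates between the writer and the critic and never picks the editor, which a fixed rotation cannot express.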

State Persistence

Instead of manually dumping chat_messages, v0.4 provides uniform async save_state() and load_state() methods on both agents and teams.

As documented in python/docs/src/user-guide/agentchat-user-guide/migration-guide.md (lines 84-90):


import json

# Save state
state = await team.save_state()
with open("state.json", "w") as f:
    json.dump(state, f)

# Load state
with open("state.json") as f:
    await team.load_state(json.load(f))

This works identically for individual AssistantAgent instances and group chat teams.

Caching Strategy

The implicit cache_seed mechanism from v0.2 is removed. You must now explicitly wrap your model client with ChatCompletionCache.

The migration guide (lines 71-88) demonstrates using DiskCacheStore or RedisStore:

from autogen_ext.models.cache import ChatCompletionCache
from autogen_ext.cache_store.diskcache import DiskCacheStore
from diskcache import Cache

# `directory` is a placeholder for the path where cache files should live
cache = DiskCacheStore(Cache(directory))
cached_client = ChatCompletionCache(model_client, cache)

This gives you full control over cache scope, backend storage, and invalidation logic.
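The wrapping pattern itself is simple: the cache exposes the same interface as the client it wraps and consults a store before delegating. The sketch below illustrates the idea with an in-memory dictionary. `CachedClient` and `CountingClient` are hypothetical stand-ins, and the key derivation is an assumption of this sketch, not how ChatCompletionCache computes its keys.

```python
import asyncio
import hashlib
import json
from typing import Dict, List


class CountingClient:
    """Stand-in model client that counts how often it is actually called."""

    def __init__(self) -> None:
        self.calls = 0

    async def create(self, messages: List[str]) -> str:
        self.calls += 1
        return f"reply to: {messages[-1]}"


class CachedClient:
    """Hypothetical wrapper: same `create` interface, backed by a dict store."""

    def __init__(self, client: CountingClient, store: Dict[str, str]) -> None:
        self.client = client
        self.store = store

    async def create(self, messages: List[str]) -> str:
        # Key on the full request so different prompts map to different entries.
        key = hashlib.sha256(json.dumps(messages).encode("utf-8")).hexdigest()
        if key not in self.store:
            self.store[key] = await self.client.create(messages)
        return self.store[key]


async def demo_cache() -> int:
    inner = CountingClient()
    cached = CachedClient(inner, store={})
    await cached.create(["What is 2 + 2?"])
    await cached.create(["What is 2 + 2?"])  # served from the store
    await cached.create(["What is 3 + 3?"])  # new request, hits the client
    return inner.calls
```

Because the store is passed in, invalidation is as simple as clearing or replacing it, and swapping the dictionary for a disk or Redis backend does not change the calling code.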

Memory Integration

The "teachable agent" pattern from v0.2 is replaced by the Memory protocol in autogen_core. You inject memory implementations (such as ChromaDBVectorMemory) directly into agents as shown in the migration documentation (lines 71-78).

This decouples retrieval logic from agent implementation, enabling any vector store backend.
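The decoupling can be sketched with a structural interface: the code that builds a prompt depends only on `add` and `query`, so any backend that provides them can be swapped in. `SimpleMemory`, `KeywordMemory`, and `build_prompt` are hypothetical names for this illustration, and the interface is deliberately smaller than the actual Memory protocol in autogen_core.

```python
import asyncio
from typing import List, Protocol


class SimpleMemory(Protocol):
    """Hypothetical, reduced interface; the real Memory protocol has more methods."""

    async def add(self, text: str) -> None: ...
    async def query(self, text: str) -> List[str]: ...


class KeywordMemory:
    """Toy backend: returns stored entries that share a word with the query."""

    def __init__(self) -> None:
        self.entries: List[str] = []

    async def add(self, text: str) -> None:
        self.entries.append(text)

    async def query(self, text: str) -> List[str]:
        words = set(text.lower().split())
        return [e for e in self.entries if words & set(e.lower().split())]


async def build_prompt(memory: SimpleMemory, question: str) -> str:
    """The caller depends only on the interface, so backends are swappable."""
    facts = await memory.query(question)
    context = "\n".join(f"- {fact}" for fact in facts)
    return f"Relevant memory:\n{context}\n\nQuestion: {question}"


async def demo_memory() -> str:
    memory = KeywordMemory()
    await memory.add("The user prefers metric units")
    await memory.add("The user lives in Oslo")
    return await build_prompt(memory, "Which units should I use")
```

Replacing `KeywordMemory` with a vector-store backend would change how `query` ranks entries but not how `build_prompt` calls it.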

Step-by-Step Migration Examples

Configuring the Model Client

Create a reusable model client component that can be shared across agents:

from autogen_ext.models.openai import OpenAIChatCompletionClient

client = OpenAIChatCompletionClient(
    model="gpt-4o",
    api_key="sk-xxx",
    temperature=0.7
)

Creating an AssistantAgent with Tools

Combine the model client with tools in a single agent definition:

from autogen_agentchat.agents import AssistantAgent

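def calculate(expression: str) -> str:
    # Caution: eval() executes arbitrary Python, so this is for illustration only;
    # do not expose it to untrusted or model-generated input in production.
    return str(eval(expression))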

agent = AssistantAgent(
    name="math_assistant",
    system_message="You help with calculations.",
    model_client=client,
    tools=[calculate],
    reflect_on_tool_use=False
)
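The calculate tool above passes model-generated text to eval(), which can execute arbitrary Python. A safer variant, sketched here with the standard-library ast module, accepts only numeric literals and the four basic operators. The helper names and the choice to return errors as strings (so the model can see and correct them) are assumptions of this sketch.

```python
import ast
import operator
from typing import Union

Number = Union[int, float]

# Only these operators are allowed; anything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def _evaluate(node: ast.AST) -> Number:
    # Accept plain int/float literals only (bool is deliberately excluded).
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        return _BINARY_OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_evaluate(node.operand))
    raise ValueError("unsupported expression")


def calculate(expression: str) -> str:
    """Evaluate +, -, *, / arithmetic without calling eval()."""
    try:
        tree = ast.parse(expression, mode="eval")
        return str(_evaluate(tree.body))
    except (SyntaxError, ValueError, ZeroDivisionError) as exc:
        return f"error: {exc}"
```

This function keeps the same name and signature as the original tool, so it can be passed in the tools list unchanged.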

Setting Up Group Chat Teams

Configure multi-agent teams with explicit termination conditions:

from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_agentchat.conditions import TextMentionTermination

writer = AssistantAgent(name="writer", system_message="Write content.", model_client=client)
critic = AssistantAgent(name="critic", system_message="Review content. Say 'DONE' when approved.", model_client=client)

termination = TextMentionTermination("DONE")
team = RoundRobinGroupChat([writer, critic], termination_condition=termination)

Persisting Agent and Team State

Implement checkpointing for long-running conversations:

import asyncio
import json

async def checkpoint_team(team, filename="checkpoint.json"):
    state = await team.save_state()
    with open(filename, "w") as f:
        json.dump(state, f)

async def restore_team(team, filename="checkpoint.json"):
    with open(filename) as f:
        state = json.load(f)
    await team.load_state(state)
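A long-running process can be interrupted in the middle of a write, leaving a truncated checkpoint. One way to guard against that, sketched below, is to write to a temporary file and rename it over the old checkpoint. `FakeTeam` is a hypothetical stand-in for anything with async save_state and load_state methods, and the sketch assumes the state is a JSON-serializable dictionary.

```python
import asyncio
import json
import os
import tempfile
from typing import Any, Dict


class FakeTeam:
    """Stand-in exposing async save_state/load_state; NOT an AutoGen class."""

    def __init__(self) -> None:
        self.state: Dict[str, Any] = {"turn": 0}

    async def save_state(self) -> Dict[str, Any]:
        return dict(self.state)

    async def load_state(self, state: Dict[str, Any]) -> None:
        self.state = dict(state)


async def checkpoint_atomically(team: Any, filename: str) -> None:
    """Write the state to a temp file, then rename it over the old checkpoint."""
    state = await team.save_state()
    directory = os.path.dirname(os.path.abspath(filename))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
        os.replace(tmp_path, filename)  # atomic when both paths share a filesystem
    except BaseException:
        os.unlink(tmp_path)  # do not leave a half-written temp file behind
        raise


async def demo_checkpoint() -> Dict[str, Any]:
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "checkpoint.json")
        team = FakeTeam()
        team.state["turn"] = 3
        await checkpoint_atomically(team, path)
        restored = FakeTeam()
        with open(path) as f:
            await restored.load_state(json.load(f))
        return restored.state
```

With this approach a reader sees either the previous checkpoint or the new one, never a partial file.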

Implementing Disk-Based Caching

Add caching to reduce API costs for repeated queries:

import tempfile
from autogen_ext.cache_store.diskcache import DiskCacheStore
from autogen_ext.models.cache import ChatCompletionCache
from diskcache import Cache

# The cache lives only as long as the temporary directory; use a
# persistent path to keep cached responses across runs.
with tempfile.TemporaryDirectory() as tmpdir:
    cache_store = DiskCacheStore(Cache(tmpdir))
    cached_client = ChatCompletionCache(client, cache_store)

    # Use cached_client in your agents
    agent = AssistantAgent(
        name="cached_assistant",
        model_client=cached_client,
        system_message="You are a helpful assistant."
    )

Summary

Explicit Model Clients: Replace OpenAIWrapper and llm_config with concrete ChatCompletionClient instances injected into agents.
Async-First Design: Refactor all agent interactions to await on_messages() or run(), passing a CancellationToken where required, and consume run_stream() with async for.
Simplified Tool Use: Consolidate caller and executor into a single AssistantAgent with a tools list; optionally enable reflect_on_tool_use.
Team Abstractions: Migrate from GroupChat/GroupChatManager to RoundRobinGroupChat or SelectorGroupChat with explicit termination conditions.
Uniform State Management: Use async save_state() and load_state() instead of manual message history manipulation.
Explicit Caching: Wrap model clients with ChatCompletionCache and configure DiskCacheStore or RedisStore as needed.
Modular Memory: Inject Memory protocol implementations for RAG capabilities rather than using specialized teachable agents.

Frequently Asked Questions

What happened to the `config_list` parameter in AutoGen v0.4?

The config_list parameter used with OpenAIWrapper in v0.2 has been removed. In v0.4, you instantiate a specific ChatCompletionClient (such as OpenAIChatCompletionClient) directly or load it via load_component() from a configuration dictionary. This change eliminates hidden fallback logic and gives you explicit control over which model handles each request.

Do I need to rewrite all my code to use async/await?

Yes. The Core event-driven architecture in autogen_core requires all agent interactions to be asynchronous. You must replace synchronous calls like agent.send() or initiate_chat() with await agent.on_messages() or await team.run(), or iterate over team.run_stream() with async for. This enables non-blocking execution and proper support for streaming responses.

How do I migrate my existing `GroupChat` with custom speaker selection?

Replace your v0.2 GroupChat and GroupChatManager with SelectorGroupChat from autogen_agentchat.teams. This class accepts a selector_func parameter or uses a model to determine the next speaker, replacing the string-based selection logic from v0.2. If you used round-robin behavior, use RoundRobinGroupChat instead, which requires no selector configuration.

Where is the automatic caching from v0.2?

The implicit cache_seed mechanism no longer exists in v0.4, and caching is not enabled by default. To enable caching, wrap your model client with ChatCompletionCache and provide a store implementation such as DiskCacheStore or RedisStore. This explicit approach lets you control cache duration, scope, and backend storage rather than relying on implicit filesystem caching.
