Migrating from AutoGen v0.2 to the New AgentChat API: A Complete Guide
AutoGen v0.4 replaces the monolithic, synchronous agent classes of v0.2 with an async, component-based AgentChat API built on top of the Core event-driven framework, requiring explicit model clients, async message handling, and new state persistence patterns.
The microsoft/autogen repository has undergone a fundamental architectural redesign in version 0.4. The migration from AutoGen v0.2 to the new AgentChat API involves moving from history-driven, tightly coupled agents to modular, event-driven components that interact through async message streams. This guide provides concrete steps and code examples derived from the official migration documentation and source implementation to help you upgrade your codebase.
Architecture Overview: v0.2 vs. v0.4
The new design separates concerns across three distinct layers, moving from implicit coupling to explicit component dependencies.
| Layer | v0.2 Implementation | v0.4 Implementation |
|---|---|---|
| Core API | Implicit and tightly coupled to autogen.agentchat |
Exposed as autogen_core with abstractions like ChatCompletionClient, Memory, and CancellationToken |
| AgentChat API | Monolithic classes like GroupChat and UserProxyAgent in autogen.agentchat |
Component-based agents in autogen_agentchat such as AssistantAgent, RoundRobinGroupChat, and SelectorGroupChat |
| Extensions | Scattered across autogen.ext with ad-hoc registration |
Clean component model where each extension implements the Component protocol and loads via load_component |
| State Management | Manual export/import of chat_messages dictionaries |
Uniform async save_state and load_state methods on both agents and teams |
According to the source code in python/packages/autogen-agentchat/src/autogen_agentchat/agents/_assistant_agent.py (lines 70-88), an agent’s configuration is now a Pydantic model (AssistantAgentConfig) that can be serialized to JSON or YAML and reinstantiated later. This reflects the v0.4 principle of separating configuration data from runtime execution.
Core API Changes Required for Migration
Model Client Configuration
In v0.2, you used OpenAIWrapper with a config_list parameter that handled automatic failover between model endpoints. In v0.4, you must explicitly instantiate a ChatCompletionClient component.
The migration guide in python/docs/src/user-guide/agentchat-user-guide/migration-guide.md (lines 78-92) demonstrates that you now either construct the client directly or load it via the component system:
from autogen_core.models import ChatCompletionClient
config = {
"provider": "OpenAIChatCompletionClient",
"config": {
"model": "gpt-4o",
"api_key": "sk-xxxxxxxx"
}
}
model_client = ChatCompletionClient.load_component(config)
This change gives you precise control over the model used and removes the hidden fallback logic present in v0.2.
Agent Creation Patterns
The AssistantAgent constructor signature has changed significantly. Instead of passing an llm_config dictionary, you now inject a model_client object as a first-class dependency.
- v0.2:
AssistantAgent(name, system_message, llm_config=...) - v0.4:
AssistantAgent(name, system_message, model_client=...)
As implemented in microsoft/autogen, the agent no longer owns the LLM configuration; it receives an already configured client, enabling better testability and separation of concerns.
Async Message Handling
All agent interactions are now asynchronous. You must replace synchronous methods like agent.send() or initiate_chat() with async equivalents.
According to python/docs/src/user-guide/agentchat-user-guide/migration-guide.md (lines 96-107), use on_messages for standard requests or on_messages_stream for streaming responses:
from autogen_agentchat.messages import TextMessage
from autogen_core import CancellationToken
# Async interaction pattern
response = await assistant.on_messages(
[TextMessage(content="Hello", source="user")],
CancellationToken()
)
This async design enables non-blocking UI updates and proper streaming support.
Tool Calling and Reflection
The v0.2 pattern required a two-agent setup (caller plus executor) using register_function. In v0.4, a single AssistantAgent handles tool use directly through its tools parameter.
The migration guide (lines 66-71) shows that the model decides when to call tools, and you can enable a reflection pass via reflect_on_tool_use=True:
def get_weather(city: str) -> str:
return f"The weather in {city} is 72°F."
assistant = AssistantAgent(
name="assistant",
system_message="You are a helpful assistant.",
model_client=model_client,
tools=[get_weather],
reflect_on_tool_use=True # Optional second model pass for reflection
)
This simplifies orchestration and removes the need for separate proxy agents.
Group Chat Orchestration
Replace GroupChat and GroupChatManager with first-class team objects. The migration guide (lines 52-58) specifies two primary implementations:
- RoundRobinGroupChat: Cycles through agents in fixed order
- SelectorGroupChat: Uses a selector function or model to choose the next speaker
Teams are now agents themselves with termination conditions and can be run via run_stream() for async iteration.
State Persistence
Instead of manually dumping chat_messages, v0.4 provides uniform async save_state() and load_state() methods on both agents and teams.
As documented in python/docs/src/user-guide/agentchat-user-guide/migration-guide.md (lines 84-90):
# Save state
state = await team.save_state()
json.dump(state, open("state.json", "w"))
# Load state
await team.load_state(json.load(open("state.json")))
This works identically for individual AssistantAgent instances and group chat teams.
Caching Strategy
The implicit cache_seed mechanism from v0.2 is removed. You must now explicitly wrap your model client with ChatCompletionCache.
The migration guide (lines 71-88) demonstrates using DiskCacheStore or RedisStore:
from autogen_ext.models.cache import ChatCompletionCache
from autogen_ext.cache_store.diskcache import DiskCacheStore
from diskcache import Cache
cache = DiskCacheStore(Cache(directory))
cached_client = ChatCompletionCache(model_client, cache)
This gives you full control over cache scope, backend storage, and invalidation logic.
Memory Integration
The "teachable agent" pattern from v0.2 is replaced by the Memory protocol in autogen_core. You inject memory implementations (such as ChromaDBVectorMemory) directly into agents as shown in the migration documentation (lines 71-78).
This decouples retrieval logic from agent implementation, enabling any vector store backend.
Step-by-Step Migration Examples
Configuring the Model Client
Create a reusable model client component that can be shared across agents:
from autogen_ext.models.openai import OpenAIChatCompletionClient
client = OpenAIChatCompletionClient(
model="gpt-4o",
api_key="sk-xxx",
temperature=0.7
)
Creating an AssistantAgent with Tools
Combine the model client with tools in a single agent definition:
from autogen_agentchat.agents import AssistantAgent
def calculate(expression: str) -> str:
return str(eval(expression))
agent = AssistantAgent(
name="math_assistant",
system_message="You help with calculations.",
model_client=client,
tools=[calculate],
reflect_on_tool_use=False
)
Setting Up Group Chat Teams
Configure multi-agent teams with explicit termination conditions:
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_agentchat.conditions import TextMentionTermination
writer = AssistantAgent(name="writer", system_message="Write content.", model_client=client)
critic = AssistantAgent(name="critic", system_message="Review content. Say 'DONE' when approved.", model_client=client)
termination = TextMentionTermination("DONE")
team = RoundRobinGroupChat([writer, critic], termination_condition=termination)
Persisting Agent and Team State
Implement checkpointing for long-running conversations:
import asyncio
import json
async def checkpoint_team(team, filename="checkpoint.json"):
state = await team.save_state()
with open(filename, "w") as f:
json.dump(state, f)
async def restore_team(team, filename="checkpoint.json"):
with open(filename) as f:
state = json.load(f)
await team.load_state(state)
Implementing Disk-Based Caching
Add caching to reduce API costs for repeated queries:
import tempfile
from autogen_ext.cache_store.diskcache import DiskCacheStore
from diskcache import Cache
with tempfile.TemporaryDirectory() as tmpdir:
cache_store = DiskCacheStore(Cache(tmpdir))
cached_client = ChatCompletionCache(client, cache_store)
# Use cached_client in your agents
agent = AssistantAgent(
name="cached_assistant",
model_client=cached_client,
system_message="You are a helpful assistant."
)
Summary
- Explicit Model Clients: Replace
OpenAIWrapperandllm_configwith concreteChatCompletionClientinstances injected into agents. - Async-First Design: Refactor all agent interactions to use
awaitwithon_messages()orrun_stream(), passing aCancellationToken. - Simplified Tool Use: Consolidate caller and executor into a single
AssistantAgentwith atoolslist; optionally enablereflect_on_tool_use. - Team Abstractions: Migrate from
GroupChat/GroupChatManagertoRoundRobinGroupChatorSelectorGroupChatwith explicit termination conditions. - Uniform State Management: Use async
save_state()andload_state()instead of manual message history manipulation. - Explicit Caching: Wrap model clients with
ChatCompletionCacheand configureDiskCacheStoreorRedisStoreas needed. - Modular Memory: Inject
Memoryprotocol implementations for RAG capabilities rather than using specialized teachable agents.
Frequently Asked Questions
What happened to the config_list parameter in AutoGen v0.4?
The config_list parameter used with OpenAIWrapper in v0.2 has been removed. In v0.4, you instantiate a specific ChatCompletionClient (such as OpenAIChatCompletionClient) directly or load it via load_component() from a configuration dictionary. This change eliminates hidden fallback logic and gives you explicit control over which model handles each request.
Do I need to rewrite all my code to use async/await?
Yes. The Core event-driven architecture in autogen_core requires all agent interactions to be asynchronous. You must replace synchronous calls like agent.send() or group_chat.run() with await agent.on_messages() or await team.run_stream(). This enables non-blocking execution and proper support for streaming responses.
How do I migrate my existing GroupChat with custom speaker selection?
Replace your v0.2 GroupChat and GroupChatManager with SelectorGroupChat from autogen_agentchat.teams. This class accepts a selector_func parameter or uses a model to determine the next speaker, replacing the string-based selection logic from v0.2. If you used round-robin behavior, use RoundRobinGroupChat instead, which requires no selector configuration.
Where is the automatic caching from v0.2?
Automatic caching via cache_seed is disabled by default in v0.4. To enable caching, wrap your model client with ChatCompletionCache and provide a store implementation such as DiskCacheStore or RedisStore. This explicit approach lets you control cache duration, scope, and backend storage rather than relying on implicit filesystem caching.
Have a question about this repo?
These articles cover the highlights, but your codebase questions are specific. Give your agent direct access to the source. Share this with your agent to get started:
curl -s "https://instagit.com/install.md" Maintain an open-source project? Get it listed too →