How to Implement Pause and Resume in AutoGen Group Chats
To add pause and resume to AutoGen group chats, override the on_pause and on_resume methods in your agents, then call team.pause() and team.resume() to broadcast control events through the runtime.
The microsoft/autogen framework enables interactive control over multi-agent workflows through built-in pause and resume capabilities. This functionality allows you to suspend and resume group chat execution programmatically, which is essential for managing long-running or resource-intensive agent conversations. Understanding the internal architecture and agent responsibilities ensures you implement robust pause-aware agents that respond correctly to control signals.
Architecture of Pause and Resume
The pause and resume system relies on a distributed event model where the BaseGroupChat coordinates with individual agents through the AgentRuntime. Understanding these components is crucial for implementing reliable control flows.
Core Components
The implementation spans four key files in the autogen_agentchat package:
- BaseGroupChat (python/packages/autogen-agentchat/src/autogen_agentchat/teams/_group_chat/_base_group_chat.py): Implements the high-level pause() and resume() methods and manages the team lifecycle
- GroupChatPause and GroupChatResume (python/packages/autogen-agentchat/src/autogen_agentchat/teams/_group_chat/_events.py): Pydantic event models transmitted via RPC to signal state changes
- BaseChatAgent (python/packages/autogen-agentchat/src/autogen_agentchat/base/_chat_agent.py): Abstract base class defining the on_pause and on_resume hooks that agents must implement
- _chat_agent_container.py (python/packages/autogen-agentchat/src/autogen_agentchat/teams/_group_chat/_chat_agent_container.py): Routes incoming pause/resume events to the concrete agent implementations
How Pause Works
When you invoke team.pause(), the BaseGroupChat executes a precise sequence defined in lines 88–100 of _base_group_chat.py:
- Initialization Check: The method verifies self._initialized is true, raising a RuntimeError if the team has not been started (lines 88–90)
- Participant Notification: Iterates through self._participant_topic_types and sends GroupChatPause() events to each participant via runtime.send_message() (lines 91–96)
- Manager Notification: Sends a GroupChatPause() event to the group chat manager itself (lines 97–100)
- Agent Handling: Each participant receives the event in its container (_chat_agent_container.py lines 177–183) and forwards the call to the agent's on_pause method
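The first three steps of this sequence can be modelled with plain asyncio. The sketch below is a simplified stand-in, not the library's code: MiniTeam, PauseEvent, and the inbox mapping are hypothetical names invented for illustration, and a real team routes events through the AgentRuntime rather than appending them to lists.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class PauseEvent:
    """Hypothetical stand-in for the GroupChatPause event model."""


@dataclass
class MiniTeam:
    """Toy model of the pause broadcast; not the real BaseGroupChat."""

    participants: list[str]
    initialized: bool = False
    # Maps each recipient name to the events it has received.
    inbox: dict[str, list[object]] = field(default_factory=dict)

    async def _send(self, recipient: str, event: object) -> None:
        # Stand-in for runtime.send_message(); yields control like real I/O would.
        await asyncio.sleep(0)
        self.inbox.setdefault(recipient, []).append(event)

    async def pause(self) -> None:
        # Step 1: refuse to pause a team that was never started.
        if not self.initialized:
            raise RuntimeError("The group chat has not been initialized.")
        # Step 2: notify every participant.
        for name in self.participants:
            await self._send(name, PauseEvent())
        # Step 3: notify the group chat manager.
        await self._send("manager", PauseEvent())


async def main() -> dict[str, list[object]]:
    team = MiniTeam(participants=["writer", "reviewer"])
    try:
        await team.pause()
    except RuntimeError as exc:
        print("pause before start:", exc)
    team.initialized = True
    await team.pause()
    return team.inbox


if __name__ == "__main__":
    print(sorted(asyncio.run(main())))
```

Keeping the initialization check first means a premature pause() fails loudly instead of silently sending events to participants that are not yet registered.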
How Resume Works
The resume() method mirrors the pause logic exactly (lines 36–46 of the resume method in _base_group_chat.py), broadcasting GroupChatResume events instead. The container code at line 192 of _chat_agent_container.py invokes the agent's on_resume method when the event arrives.
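As a rough illustration of the routing step, the sketch below dispatches on the event type and forwards to the agent's hooks. MiniContainer, PauseEvent, ResumeEvent, and FlagAgent are hypothetical names; the hooks here omit the cancellation token for brevity, and the real container registers handlers with the runtime rather than using isinstance checks.

```python
import asyncio


class PauseEvent:
    """Hypothetical stand-in for GroupChatPause."""


class ResumeEvent:
    """Hypothetical stand-in for GroupChatResume."""


class FlagAgent:
    """Minimal agent that records whether it is paused."""

    def __init__(self) -> None:
        self.paused = False

    async def on_pause(self) -> None:
        self.paused = True

    async def on_resume(self) -> None:
        self.paused = False


class MiniContainer:
    """Toy container that forwards control events to the wrapped agent."""

    def __init__(self, agent: FlagAgent) -> None:
        self._agent = agent

    async def handle(self, event: object) -> None:
        if isinstance(event, PauseEvent):
            await self._agent.on_pause()
        elif isinstance(event, ResumeEvent):
            await self._agent.on_resume()
        else:
            raise TypeError(f"unsupported event: {type(event).__name__}")


async def round_trip() -> list[bool]:
    """Record the agent's paused flag before pause, after pause, and after resume."""
    agent = FlagAgent()
    container = MiniContainer(agent)
    states = [agent.paused]
    await container.handle(PauseEvent())
    states.append(agent.paused)
    await container.handle(ResumeEvent())
    states.append(agent.paused)
    return states
```

Calling asyncio.run(round_trip()) walks one agent through a full pause and resume cycle.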
Implementing Pause-Aware Agents
Agents that need to react to control signals must override on_pause and on_resume. The framework only notifies agents; suspending the actual work is the agent's responsibility.
Minimal Agent Implementation
Here is a complete CounterAgent that demonstrates proper pause/resume handling by checking an internal flag during long-running operations:
import asyncio

from autogen_agentchat.agents import BaseChatAgent
from autogen_agentchat.base import Response
from autogen_agentchat.messages import TextMessage
from autogen_core import CancellationToken


class CounterAgent(BaseChatAgent):
    def __init__(self, name: str):
        super().__init__(name=name, description="Counts while not paused")
        self._paused = False
        self.counter = 0

    @property
    def produced_message_types(self):
        return [TextMessage]

    async def on_messages(self, messages, cancellation_token: CancellationToken):
        # Simulate open-ended work: loop until cancelled, counting only while not paused
        while not cancellation_token.is_cancelled():
            await asyncio.sleep(0.1)
            if not self._paused:
                self.counter += 1
        # Return an empty TextMessage to keep the contract
        return Response(chat_message=TextMessage(source=self.name, content=""))

    async def on_reset(self, cancellation_token: CancellationToken):
        # Required by BaseChatAgent; restart the count
        self.counter = 0

    async def on_pause(self, cancellation_token: CancellationToken):
        self._paused = True

    async def on_resume(self, cancellation_token: CancellationToken):
        self._paused = False

    async def close(self):
        # Clean-up if needed
        pass
If an agent does not implement these methods, pause and resume calls become no-ops, which is the intended behavior for agents that do not require granular control.
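A boolean flag works when the agent polls it, but a paused agent can also block on an asyncio.Event so that it wakes as soon as it is resumed. The sketch below shows that alternative using only the standard library. PauseGate and count_ticks are hypothetical helpers, not part of AutoGen; in a real agent, on_pause and on_resume would call gate.pause() and gate.resume().

```python
import asyncio


class PauseGate:
    """Wraps an asyncio.Event that is set while work is allowed to proceed."""

    def __init__(self) -> None:
        self._running = asyncio.Event()
        self._running.set()  # start in the running state

    def pause(self) -> None:
        self._running.clear()

    def resume(self) -> None:
        self._running.set()

    async def wait_if_paused(self) -> None:
        # Returns immediately when running; blocks until resume() when paused.
        await self._running.wait()


async def count_ticks(gate: PauseGate, ticks: list[int], total: int) -> None:
    """Append 0..total-1 to ticks, checking the gate before each step."""
    for i in range(total):
        await gate.wait_if_paused()
        ticks.append(i)
        await asyncio.sleep(0)  # yield so other tasks can run


async def demo() -> tuple[int, int, int]:
    gate = PauseGate()
    ticks: list[int] = []
    gate.pause()
    worker = asyncio.create_task(count_ticks(gate, ticks, total=3))
    await asyncio.sleep(0.01)
    while_paused = len(ticks)  # worker is blocked at the gate
    gate.resume()
    await worker
    return while_paused, len(ticks), ticks[-1]
```

The gate avoids busy-waiting while paused, at the cost of requiring the work loop to reach a checkpoint before a pause takes effect.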
Controlling Group Chat Execution
Use the RoundRobinGroupChat or other BaseGroupChat subclasses with a runtime to orchestrate pause and resume operations.
Practical Usage Example
This example demonstrates running a team, pausing execution to halt agent processing, and resuming to continue the workflow:
import asyncio

from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_core import SingleThreadedAgentRuntime

# Assumes the CounterAgent class from the previous section is in scope


async def demo():
    runtime = SingleThreadedAgentRuntime()
    runtime.start()
    agent = CounterAgent(name="counter")
    team = RoundRobinGroupChat([agent], runtime=runtime, max_turns=5)

    # Run the team in the background
    team_task = asyncio.create_task(team.run())

    # Let the agent count for a moment
    await asyncio.sleep(1)
    print("counter =", agent.counter)  # expected: > 0

    # Pause the chat; the agent should stop counting
    await team.pause()
    await asyncio.sleep(1)
    print("still paused, counter =", agent.counter)  # expected: unchanged

    # Resume the chat; counting continues
    await team.resume()
    await asyncio.sleep(1)
    print("after resume, counter =", agent.counter)  # expected: increased

    # Shut down: cancel the background run, then stop the runtime
    team_task.cancel()
    try:
        await team_task
    except asyncio.CancelledError:
        pass
    await runtime.stop()


asyncio.run(demo())
Note that pause and resume are ephemeral operations: they do not alter the team's saved state. To persist progress across sessions, call team.save_state() before shutting down and team.load_state() after reconstructing the team.
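One way to persist progress is to write the team state to disk and read it back when rebuilding the team. The helpers below are a sketch under stated assumptions: persist_state, restore_state, and the file name are hypothetical, and the code assumes that save_state() is awaitable and returns a JSON-serializable mapping that load_state() accepts unchanged. StubTeam stands in for a real team so the snippet runs without AutoGen installed.

```python
import asyncio
import json
import tempfile
from pathlib import Path
from typing import Any


async def persist_state(team: Any, path: Path) -> None:
    """Save the team's state as JSON (assumes the state is JSON-serializable)."""
    state = await team.save_state()
    path.write_text(json.dumps(state))


async def restore_state(team: Any, path: Path) -> None:
    """Load previously saved state back into a freshly constructed team."""
    state = json.loads(path.read_text())
    await team.load_state(state)


class StubTeam:
    """Hypothetical stand-in exposing the same two coroutine methods."""

    def __init__(self, turns: int = 0) -> None:
        self.turns = turns

    async def save_state(self) -> dict[str, Any]:
        return {"turns": self.turns}

    async def load_state(self, state: dict[str, Any]) -> None:
        self.turns = state["turns"]


async def save_and_restore() -> int:
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "team_state.json"
        await persist_state(StubTeam(turns=3), path)
        fresh = StubTeam()
        await restore_state(fresh, path)
        return fresh.turns
```

With a real team, the same two helpers would be called around application shutdown and startup; pausing and resuming in between needs no extra persistence step.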
Testing and Verification
The official test suite in python/packages/autogen-agentchat/tests/test_group_chat_pause_resume.py validates the exact behavior described above. The test verifies that:
- assert curr_counter == agent.counter immediately after pause (line 21), confirming the agent stops making progress
- assert curr_counter < agent.counter after resume (line 36), confirming the agent resumes processing
You should implement similar assertions in your own test suites to ensure your agents correctly respect pause boundaries.
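A test along those lines might look like the sketch below. It uses a stand-in worker instead of a real team so that it runs with the standard library alone. TickWorker and check_pause_boundaries are hypothetical names, and the sleep durations are arbitrary values chosen to give the worker time to make progress.

```python
import asyncio


class TickWorker:
    """Stand-in for an agent: counts every 10 ms unless paused."""

    def __init__(self) -> None:
        self.counter = 0
        self._paused = False

    async def run(self) -> None:
        while True:
            await asyncio.sleep(0.01)
            if not self._paused:
                self.counter += 1

    async def pause(self) -> None:
        self._paused = True

    async def resume(self) -> None:
        self._paused = False


async def check_pause_boundaries() -> None:
    worker = TickWorker()
    task = asyncio.create_task(worker.run())
    try:
        await asyncio.sleep(0.1)
        await worker.pause()
        curr_counter = worker.counter
        await asyncio.sleep(0.1)
        # No progress while paused.
        assert curr_counter == worker.counter
        await worker.resume()
        await asyncio.sleep(0.1)
        # Progress continues after resume.
        assert curr_counter < worker.counter
    finally:
        # Always stop the background worker, even if an assertion fails.
        task.cancel()
        try:
            await task
        except asyncio.CancelledError:
            pass
```

Under pytest with an asyncio plugin, the body of check_pause_boundaries could be used directly as an async test function, with the worker replaced by a team built from your own agents.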
Summary
- BaseGroupChat provides the pause() and resume() API that broadcasts control events to all participants
- GroupChatPause and GroupChatResume events are transmitted via the AgentRuntime to every agent and the manager
- Agent responsibility: Implement on_pause and on_resume in your BaseChatAgent subclasses to set internal flags that suspend active processing
- State persistence: Pause and resume do not affect saved state; use save_state() and load_state() for persistence
- Error handling: Calling pause() before run() raises a RuntimeError due to the initialization check at lines 88–90 of _base_group_chat.py
Frequently Asked Questions
What happens if I call pause() before starting the team?
Calling team.pause() before team.run() raises a RuntimeError because the BaseGroupChat checks self._initialized at lines 88–90 of _base_group_chat.py. You must initialize the team by calling run() first to establish the runtime connections and participant topics.
Do all agents need to implement on_pause and on_resume?
No, these methods are optional. If an agent does not implement on_pause and on_resume, the pause and resume calls become no-ops for that specific agent, as noted in the docstring at line 70 of _base_group_chat.py. Only agents with long-running operations or external resource management need to implement these hooks.
Does pausing a group chat affect its saved state?
No, pause and resume are ephemeral operations that only send control signals through the runtime. They do not modify the internal state representation used by save_state() and load_state(). To preserve progress across application restarts, explicitly call save_state() before shutting down and load_state() when reconstructing the team.
How does the runtime deliver pause events to agents?
The AgentRuntime uses RPC-style message passing. When pause() is called, BaseGroupChat sends GroupChatPause events to each participant's AgentId and to the manager. The _chat_agent_container.py receives these messages and forwards them to the concrete agent's on_pause method at lines 177–183, ensuring each agent processes the signal within its own execution context.