Implementing the Society of Mind Agent Pattern in Multi-Agent Systems with AutoGen
The Society of Mind pattern in AutoGen uses the SocietyOfMindAgent class to wrap an inner team of collaborating agents, then synthesizes their discussion into a final response using a dedicated model client.
The Society of Mind agent pattern enables complex reasoning in multi-agent systems by treating a single high-level agent as a collaborative team of specialized subordinates. In the Microsoft AutoGen framework, this architectural pattern is realized through the SocietyOfMindAgent class, which cleanly separates collaborative problem-solving from final answer generation.
Core Architecture and Components
The implementation centers on the SocietyOfMindAgent class defined in [_society_of_mind_agent.py](https://github.com/microsoft/autogen/blob/main/python/packages/autogen-agentchat/src/autogen_agentchat/agents/_society_of_mind_agent.py). This class inherits from BaseChatAgent ([_base_chat_agent.py](https://github.com/microsoft/autogen/blob/main/python/packages/autogen-agentchat/src/autogen_agentchat/agents/_base_chat_agent.py)), ensuring it conforms to the standard chat agent contract with methods like on_messages and on_reset.
Key components include:
- Inner Team: Any Team instance (typically RoundRobinGroupChat from [_round_robin_group_chat.py](https://github.com/microsoft/autogen/blob/main/python/packages/autogen-agentchat/src/autogen_agentchat/teams/_group_chat/_round_robin_group_chat.py)) that executes the multi-turn dialogue among subordinate agents.
- Model Client: A ChatCompletionClient instance that generates the final outer-world response by processing the aggregated inner-team transcript.
- Model Context: An optional ChatCompletionContext (such as BufferedChatCompletionContext) that stores recent messages to maintain token-efficient prompting, implemented in lines 47-54 of _society_of_mind_agent.py.
- State Management: The SocietyOfMindAgentState class ([_states.py](https://github.com/microsoft/autogen/blob/main/python/packages/autogen-agentchat/src/autogen_agentchat/state/_states.py)) handles serialization of the inner team's state for checkpointing.
Execution Flow
When SocietyOfMindAgent receives a task, it orchestrates the inner team and synthesizes the result through the following steps defined in the on_messages_stream method:
- Task Ingestion: Wraps incoming messages into task_messages and combines them with any existing model context.
- Inner Team Execution: Invokes self._team.run_stream(..., output_task_messages=False) to execute the collaborative dialogue. All intermediate events are streamed, while the final TaskResult is captured separately.
- Message Collection: Non-TaskResult events (the conversation transcript) are yielded directly and collected for the final synthesis.
- Response Generation:
  - Prepends the configurable instruction (DEFAULT_INSTRUCTION by default) to the transcript and appends the response prompt (DEFAULT_RESPONSE_PROMPT by default).
  - Converts each BaseChatMessage to an LLMMessage using message.to_model_message().
  - Calls self._model_client.create to generate the final response.
- Result Yielding: Returns a Response object containing a TextMessage with the LLM's completion.
- Context Update: Adds all incoming messages to the model context via _add_messages_to_context for future turns.
- State Reset: Calls self._team.reset() to ensure fresh state for the next outer-level task.
This flow cleanly separates collaborative reasoning (handled by the inner team) from response presentation (handled by the outer model client).
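The steps above can be sketched without the library. The following is a minimal illustration, not AutoGen's implementation: ToyTeam, Message, TaskResult, toy_synthesizer, and run_society are hypothetical stand-ins invented for this example, and the synthesizer replaces the real model client call with a string summary.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, Awaitable, Callable, List, Tuple, Union


@dataclass
class Message:
    """Hypothetical chat message: who said it and what was said."""
    source: str
    content: str


@dataclass
class TaskResult:
    """Hypothetical final event emitted once the inner team stops."""
    messages: List[Message]


class ToyTeam:
    """Hypothetical inner team that replays a scripted dialogue."""

    def __init__(self, scripted: List[Message]) -> None:
        self._scripted = scripted
        self.reset_count = 0

    async def run_stream(self, task: str) -> AsyncIterator[Union[Message, TaskResult]]:
        # Stream each intermediate message, then the final result.
        for message in self._scripted:
            yield message
        yield TaskResult(messages=list(self._scripted))

    async def reset(self) -> None:
        self.reset_count += 1


async def toy_synthesizer(transcript: List[Message]) -> str:
    """Stand-in for the outer model client call (assumption: no real LLM)."""
    return f"{len(transcript)} messages; last: {transcript[-1].content}"


async def run_society(
    team: ToyTeam,
    synthesize: Callable[[List[Message]], Awaitable[str]],
    task: str,
) -> Tuple[str, List[Message]]:
    transcript: List[Message] = []
    result = None
    async for event in team.run_stream(task):
        if isinstance(event, TaskResult):
            result = event  # captured separately, not part of the transcript
        else:
            transcript.append(event)  # collected for the final synthesis
    if result is None:
        raise RuntimeError("inner team finished without a TaskResult")
    answer = await synthesize(transcript)  # response generation
    await team.reset()  # fresh inner state for the next outer task
    return answer, transcript


if __name__ == "__main__":
    demo_team = ToyTeam([Message("writer", "draft"), Message("reviewer", "APPROVE")])
    print(asyncio.run(run_society(demo_team, toy_synthesizer, "Describe AutoGen")))
```

The sketch keeps the same separation as the flow described here: the loop only gathers the inner dialogue, and a single call at the end turns it into the outward-facing answer.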
Implementation Examples
Basic Society of Mind with Writer and Reviewer
This example demonstrates a nested architecture where an inner team of writer and reviewer agents collaborate until approval, then a SocietyOfMindAgent synthesizes the final output:
import asyncio

from autogen_agentchat.agents import AssistantAgent, SocietyOfMindAgent
from autogen_agentchat.conditions import TextMentionTermination
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_agentchat.ui import Console
from autogen_ext.models.openai import OpenAIChatCompletionClient


async def main() -> None:
    model_client = OpenAIChatCompletionClient(model="gpt-4o")

    # Inner team: writer and reviewer alternate until "APPROVE" is seen
    writer = AssistantAgent(
        "writer",
        model_client=model_client,
        system_message="You are a writer. Produce a short paragraph.",
        model_client_stream=True,
    )
    reviewer = AssistantAgent(
        "reviewer",
        model_client=model_client,
        system_message="You are a reviewer. Critique the paragraph. Respond with 'APPROVE' when it is good enough.",
        model_client_stream=True,
    )
    inner_termination = TextMentionTermination("APPROVE")
    inner_team = RoundRobinGroupChat([writer, reviewer], termination_condition=inner_termination)

    # Outer SocietyOfMindAgent: synthesizes the inner discussion
    society = SocietyOfMindAgent(
        "society_of_mind",
        team=inner_team,
        model_client=model_client,
    )

    # Outer team that also translates the final answer
    translator = AssistantAgent(
        "translator",
        model_client=model_client,
        system_message="You translate English to Spanish.",
        model_client_stream=True,
    )
    outer_team = RoundRobinGroupChat([society, translator], max_turns=2)

    # Run and print the streamed conversation
    await Console(outer_team.run_stream(task="Write a brief description of AutoGen and translate it to Spanish."))


# In a notebook, call `await main()` instead of asyncio.run(main()).
asyncio.run(main())
Key implementation details:
- The inner team handles the iterative writing and review process.
- SocietyOfMindAgent receives the full transcript, applies the default instruction and response prompt, and generates a concise final answer.
- The outer team then passes this synthesized result to a translator agent.
Optimizing Token Usage with Custom Model Context
For long-running inner team discussions, use a buffered context to prevent token limit errors:
from autogen_agentchat.agents import SocietyOfMindAgent
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_core.model_context import BufferedChatCompletionContext
from autogen_ext.models.openai import OpenAIChatCompletionClient

model_client = OpenAIChatCompletionClient(model="gpt-4o")

# Keep only the last 10 messages in the prompt
model_ctx = BufferedChatCompletionContext(buffer_size=10)

society = SocietyOfMindAgent(
    "society_of_mind",
    team=RoundRobinGroupChat([...]),  # any inner team
    model_client=model_client,
    model_context=model_ctx,  # custom context for token efficiency
)
The BufferedChatCompletionContext (imported from the autogen_core.model_context module) returns only the most recent messages, so older messages drop out of the prompt while recent reasoning context is preserved.
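To make the windowing idea concrete, here is a toy model of a buffered context. It is a sketch under stated assumptions, not the library's code: ToyBufferedContext and its methods are hypothetical names, messages are plain strings, and the real class may differ in how it stores and filters messages.

```python
from collections import deque
from typing import Deque, List


class ToyBufferedContext:
    """Hypothetical stand-in that keeps only the newest `buffer_size` messages."""

    def __init__(self, buffer_size: int) -> None:
        if buffer_size <= 0:
            raise ValueError("buffer_size must be positive")
        # A bounded deque silently evicts the oldest entry when full.
        self._messages: Deque[str] = deque(maxlen=buffer_size)

    def add_message(self, message: str) -> None:
        self._messages.append(message)

    def get_messages(self) -> List[str]:
        # Oldest surviving message first, newest last.
        return list(self._messages)


if __name__ == "__main__":
    context = ToyBufferedContext(buffer_size=3)
    for index in range(5):
        context.add_message(f"message {index}")
    print(context.get_messages())
```

The trade-off is the usual one for a sliding window: prompt size stays bounded, but anything older than the window is no longer visible to the model.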
Checkpointing and State Restoration
To persist agent state across sessions:
# Capture current state including inner team progress
state = await society.save_state()  # returns a serializable mapping of the agent state

# ... persist to disk or database ...

# Later, recreate and restore
new_society = SocietyOfMindAgent(
    "society_of_mind",
    team=RoundRobinGroupChat([...]),
    model_client=model_client,
)
await new_society.load_state(state)  # restores inner team state
State management uses the SocietyOfMindAgentState class from [_states.py](https://github.com/microsoft/autogen/blob/main/python/packages/autogen-agentchat/src/autogen_agentchat/state/_states.py), enabling fault-tolerant multi-agent workflows.
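The "persist to disk" step can be sketched with the standard library. This assumes the value returned by save_state() is a JSON-serializable mapping; persist_state and restore_state are hypothetical helper names introduced for illustration, and the sample dictionary below is made up rather than a real saved state.

```python
import json
from pathlib import Path
from typing import Any, Dict, Mapping


def persist_state(state: Mapping[str, Any], path: Path) -> None:
    """Write a state mapping to disk as JSON (assumes JSON-serializable values)."""
    path.write_text(json.dumps(dict(state), indent=2), encoding="utf-8")


def restore_state(path: Path) -> Dict[str, Any]:
    """Read a previously persisted state mapping back from disk."""
    return json.loads(path.read_text(encoding="utf-8"))


if __name__ == "__main__":
    # Made-up placeholder standing in for the result of `await society.save_state()`.
    sample_state = {"inner_team_state": {"turn": 3}}
    checkpoint = Path("society_state.json")
    persist_state(sample_state, checkpoint)
    print(restore_state(checkpoint))
    checkpoint.unlink()  # clean up the demo file
```

The restored dictionary would then be passed to load_state() on a freshly constructed agent, as in the snippet above.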
Summary
- SocietyOfMindAgent wraps any AutoGen Team to create hierarchical agent societies that separate collaborative reasoning from final response generation.
- The pattern implements a two-stage process: an inner team collaborates on the task, then a model client synthesizes the discussion into a coherent output.
- Token efficiency is managed through optional ChatCompletionContext buffers that limit prompt size during synthesis.
- State persistence via save_state() and load_state() enables checkpointing of complex inner-team discussions.
- Source files are located in autogen_agentchat/agents/_society_of_mind_agent.py and related modules in the microsoft/autogen repository.
Frequently Asked Questions
How does SocietyOfMindAgent differ from a standard GroupChat?
A standard GroupChat exposes the internal conversation directly to the outer workflow. In contrast, SocietyOfMindAgent treats the inner team as a black box: it runs the team to completion, collects the full transcript, and uses a separate LLM call to generate a synthesized final response. This creates a cleaner abstraction boundary between collaborative reasoning and external communication.
Can I nest SocietyOfMindAgent inside another SocietyOfMindAgent?
Yes, the architecture supports arbitrary nesting. Since SocietyOfMindAgent inherits from BaseChatAgent and implements the standard agent interface, it can serve as a participant in any Team, including another SocietyOfMindAgent's inner team. This enables deeply hierarchical "societies of societies" for complex multi-level reasoning tasks.
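The reason nesting works is purely structural: anything that satisfies the agent interface can be a team member, including a wrapper around another team. The toy below illustrates that idea only. Agent, EchoAgent, and ToySociety are hypothetical names, the code is synchronous for brevity, and the synthesize callable stands in for the model client.

```python
from typing import Callable, List, Protocol


class Agent(Protocol):
    """Minimal hypothetical agent interface: a name and a respond method."""
    name: str

    def respond(self, task: str) -> str: ...


class EchoAgent:
    """Leaf agent that reports which task it handled."""

    def __init__(self, name: str) -> None:
        self.name = name

    def respond(self, task: str) -> str:
        return f"{self.name} handled: {task}"


class ToySociety:
    """Wraps a list of agents and exposes the same interface as a single agent."""

    def __init__(
        self,
        name: str,
        members: List[Agent],
        synthesize: Callable[[List[str]], str],
    ) -> None:
        self.name = name
        self._members = members
        self._synthesize = synthesize

    def respond(self, task: str) -> str:
        # Run every member, then collapse the transcript into one answer.
        transcript = [member.respond(task) for member in self._members]
        return self._synthesize(transcript)


if __name__ == "__main__":
    inner = ToySociety("inner", [EchoAgent("a"), EchoAgent("b")], " | ".join)
    # The inner society is just another member of the outer one.
    outer = ToySociety("outer", [inner, EchoAgent("c")], " | ".join)
    print(outer.respond("x"))
```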
What is the purpose of the model_context parameter?
The model_context parameter accepts a ChatCompletionContext instance that stores the messages the agent receives across outer turns. On each new task, the stored messages are combined with the incoming ones before the inner team runs. A bounded context such as BufferedChatCompletionContext keeps only the most recent messages, which limits how much history is carried forward and reduces the risk of token limit errors.
How do I customize the synthesis prompt used by SocietyOfMindAgent?
The class uses the default constants DEFAULT_INSTRUCTION and DEFAULT_RESPONSE_PROMPT defined in _society_of_mind_agent.py. To customize synthesis, pass your own instruction and response_prompt arguments to the SocietyOfMindAgent constructor; for deeper changes, subclass SocietyOfMindAgent and override the response generation logic. The synthesis process prepends the instruction to the inner team's message history and appends the response prompt before calling the model client's create method.
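The ordering described in this answer (instruction first, inner transcript in the middle, response prompt last) can be sketched as a small pure function. This is an illustration only: PromptMessage, build_synthesis_messages, the role labels, and the prompt strings are all hypothetical and do not reproduce the library's message types or default text.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PromptMessage:
    """Hypothetical prompt entry: a role label plus text."""
    role: str
    content: str


def build_synthesis_messages(
    transcript: List[Tuple[str, str]],
    instruction: str,
    response_prompt: str,
) -> List[PromptMessage]:
    """Sandwich the inner transcript between the instruction and the response prompt."""
    messages = [PromptMessage("system", instruction)]
    for source, text in transcript:
        # Keep the speaker name so the model can tell the agents apart.
        messages.append(PromptMessage("user", f"{source}: {text}"))
    messages.append(PromptMessage("system", response_prompt))
    return messages


if __name__ == "__main__":
    prompt = build_synthesis_messages(
        transcript=[("writer", "AutoGen is a framework."), ("reviewer", "APPROVE")],
        instruction="Here is a discussion between agents:",  # made-up text
        response_prompt="Answer the original request in one paragraph.",  # made-up text
    )
    for message in prompt:
        print(f"[{message.role}] {message.content}")
```

Placing the response prompt last keeps the final instruction closest to where the model starts generating, which is the usual motivation for this kind of sandwich layout.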