Implementing the Society of Mind Agent Pattern in Multi-Agent Systems with AutoGen

The Society of Mind pattern in AutoGen uses the SocietyOfMindAgent class to wrap an inner team of collaborating agents, then synthesizes their discussion into a final response using a dedicated model client.

The Society of Mind pattern treats a single high-level agent as a team of specialized subordinate agents, which lets the outer agent present one answer backed by multi-agent reasoning. In the Microsoft AutoGen framework, the pattern is realized through the SocietyOfMindAgent class, which separates collaborative problem-solving from final answer generation.

Core Architecture and Components

The implementation centers on the SocietyOfMindAgent class defined in [_society_of_mind_agent.py](https://github.com/microsoft/autogen/blob/main/python/packages/autogen-agentchat/src/autogen_agentchat/agents/_society_of_mind_agent.py). This class inherits from BaseChatAgent ([_base_chat_agent.py](https://github.com/microsoft/autogen/blob/main/python/packages/autogen-agentchat/src/autogen_agentchat/agents/_base_chat_agent.py)), ensuring it conforms to the standard chat agent contract with methods like on_messages and on_reset.

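Key components include:

  • team: the inner team of agents whose discussion is wrapped.
  • model_client: the model client that generates the final response from the transcript.
  • model_context: an optional ChatCompletionContext that holds messages carried across turns.
  • DEFAULT_INSTRUCTION and DEFAULT_RESPONSE_PROMPT: the default prompts placed around the transcript during synthesis.
  • SocietyOfMindAgentState: the state class used by save_state() and load_state().
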
Execution Flow

When SocietyOfMindAgent receives a task, it orchestrates the inner team and synthesizes the result through the following steps defined in the on_messages_stream method:

  1. Task Ingestion: Wraps incoming messages into task_messages and combines them with any existing model context.
  2. Inner Team Execution: Invokes self._team.run_stream(..., output_task_messages=False) to execute the collaborative dialogue. All intermediate events are streamed, while the final TaskResult is captured separately.
  3. Message Collection: Non-TaskResult events (the conversation transcript) are yielded directly and collected for the final synthesis.
  4. Response Generation:
    • Prepends the configurable DEFAULT_INSTRUCTION to the transcript and appends the DEFAULT_RESPONSE_PROMPT after it.
    • Converts each BaseChatMessage to an LLMMessage using message.to_model_message().
    • Calls self._model_client.create to generate the final response.
  5. Result Yielding: Returns a Response object containing a TextMessage with the LLM's completion.
  6. Context Update: Adds all incoming messages to the model context via _add_messages_to_context for future turns.
  7. State Reset: Calls self._team.reset() to ensure fresh state for the next outer-level task.

This flow cleanly separates collaborative reasoning (handled by the inner team) from response presentation (handled by the outer model client).
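
The steps above can be sketched with plain Python stand-ins. This is a simplified model, not AutoGen's code: StubTeam, StubModelClient, Message, and the two prompt strings are hypothetical, and the context update (step 6) is left out for brevity. It shows the ordering only: stream the inner events, skip the TaskResult, wrap the transcript between the instruction and the response prompt, call the model once, then reset the team.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, List, Union


@dataclass
class Message:
    source: str
    content: str


@dataclass
class TaskResult:
    messages: List[Message]


class StubTeam:
    """Hypothetical inner team: streams two fixed messages, then a TaskResult."""

    def __init__(self) -> None:
        self.reset_count = 0

    async def run_stream(self, task: List[Message]) -> AsyncIterator[Union[Message, TaskResult]]:
        transcript = [
            Message("writer", "Draft: AutoGen is a multi-agent framework."),
            Message("reviewer", "APPROVE"),
        ]
        for message in transcript:
            yield message
        yield TaskResult(messages=transcript)

    async def reset(self) -> None:
        self.reset_count += 1


class StubModelClient:
    """Hypothetical model client: records the prompt and returns a canned reply."""

    def __init__(self) -> None:
        self.last_prompt: List[Message] = []

    async def create(self, messages: List[Message]) -> str:
        self.last_prompt = list(messages)
        return f"Summary of {len(messages) - 2} inner messages."


# Placeholder prompts, not the library's default text.
INSTRUCTION = Message("user", "Earlier, a team discussed the task. Read the discussion.")
RESPONSE_PROMPT = Message("user", "Write a standalone response to the original request.")


async def on_messages_stream(team: StubTeam, client: StubModelClient, task: List[Message]):
    inner_messages: List[Message] = []
    async for event in team.run_stream(task):
        if isinstance(event, TaskResult):
            continue  # the final result is not forwarded as an event
        inner_messages.append(event)
        yield event  # intermediate events are streamed to the caller
    # Instruction first, transcript in the middle, response prompt last.
    prompt = [INSTRUCTION, *inner_messages, RESPONSE_PROMPT]
    completion = await client.create(prompt)
    yield Message("society_of_mind", completion)
    await team.reset()  # fresh inner state for the next outer-level task


async def run_demo():
    team, client = StubTeam(), StubModelClient()
    task = [Message("user", "Describe AutoGen.")]
    events = [event async for event in on_messages_stream(team, client, task)]
    return events, team, client


events, team, client = asyncio.run(run_demo())
print(events[-1].content)
```

The caller sees the writer and reviewer messages first and the synthesized message last, which mirrors how the inner transcript is streamed before the final response.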

Implementation Examples

Basic Society of Mind with Writer and Reviewer

This example demonstrates a nested architecture where an inner team of writer and reviewer agents collaborate until approval, then a SocietyOfMindAgent synthesizes the final output:

from autogen_agentchat.agents import AssistantAgent, SocietyOfMindAgent
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_agentchat.conditions import TextMentionTermination
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_agentchat.ui import Console

model_client = OpenAIChatCompletionClient(model="gpt-4o")

# Inner team: writer ↔ reviewer until “APPROVE” is seen

writer = AssistantAgent(
    "writer",
    model_client=model_client,
    system_message="You are a writer. Produce a short paragraph.",
    model_client_stream=True,
)
reviewer = AssistantAgent(
    "reviewer",
    model_client=model_client,
    system_message="You are a reviewer. Critique the paragraph. Respond with 'APPROVE' when it is good enough.",
    model_client_stream=True,
)
inner_termination = TextMentionTermination("APPROVE")
inner_team = RoundRobinGroupChat([writer, reviewer], termination_condition=inner_termination)

# Outer SocietyOfMindAgent – synthesises the inner discussion

society = SocietyOfMindAgent(
    "society_of_mind",
    team=inner_team,
    model_client=model_client,
)

# Outer team that also translates the final answer

translator = AssistantAgent(
    "translator",
    model_client=model_client,
    system_message="You translate English to Spanish.",
    model_client_stream=True,
)
outer_team = RoundRobinGroupChat([society, translator], max_turns=2)

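# Run (top-level await works in a notebook; in a script, put this call inside an async function and pass it to asyncio.run)

await Console(outer_team.run_stream(task="Write a brief description of AutoGen and translate it to Spanish."))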

Key implementation details:

  • The inner team handles the iterative writing and review process.
  • SocietyOfMindAgent receives the full transcript, applies the default instruction and response prompt, and generates a concise final answer.
  • The outer team then passes this synthesized result to a translator agent.

Optimizing Token Usage with Custom Model Context

For long-running inner team discussions, use a buffered context to prevent token limit errors:

from autogen_core.model_context import BufferedChatCompletionContext
from autogen_agentchat.agents import SocietyOfMindAgent
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_ext.models.openai import OpenAIChatCompletionClient

model_client = OpenAIChatCompletionClient(model="gpt-4o")

# Keep only the last 10 messages in the prompt

model_ctx = BufferedChatCompletionContext(buffer_size=10)

society = SocietyOfMindAgent(
    "society_of_mind",
    team=RoundRobinGroupChat([...]),   # any inner team ([...] is a placeholder for its agents)
    model_client=model_client,
    model_context=model_ctx,           # custom context for token efficiency
)

The BufferedChatCompletionContext (exported from the autogen_core.model_context package) returns only the most recent messages when a prompt is built, so older messages drop out while recent reasoning context is preserved.
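
To make the buffering behavior concrete, here is a minimal stand-in written in plain Python. LastNBuffer and its size parameter are hypothetical names for illustration, not the library class; the sketch assumes the simplest possible policy of keeping a sliding window of the newest messages.

```python
from typing import List


class LastNBuffer:
    """Hypothetical stand-in for a buffered chat context: exposes only the newest N messages."""

    def __init__(self, size: int) -> None:
        if size <= 0:
            raise ValueError("size must be positive")
        self._size = size
        self._messages: List[str] = []

    def add_message(self, message: str) -> None:
        self._messages.append(message)

    def get_messages(self) -> List[str]:
        # Only the tail of the history is handed to the prompt builder.
        return self._messages[-self._size:]


context = LastNBuffer(size=3)
for turn in range(1, 6):
    context.add_message(f"message {turn}")

print(context.get_messages())  # ['message 3', 'message 4', 'message 5']
```

A fixed window is cheap and predictable, at the cost of losing early context such as the original task statement once the conversation outgrows the window.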

Checkpointing and State Restoration

To persist agent state across sessions:


# Capture current state including inner team progress

state = await society.save_state()       # returns a serializable mapping of the agent state (see SocietyOfMindAgentState)

# ... persist to disk or database ...

# Later, recreate and restore

new_society = SocietyOfMindAgent(
    "society_of_mind",
    team=RoundRobinGroupChat([...]),
    model_client=model_client,
)
await new_society.load_state(state)      # restores inner team state

State management uses the SocietyOfMindAgentState class from [_states.py](https://github.com/microsoft/autogen/blob/main/python/packages/autogen-agentchat/src/autogen_agentchat/state/_states.py), enabling fault-tolerant multi-agent workflows.
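
The "persist to disk" step can be sketched as a JSON round trip. This assumes the saved state is a JSON-serializable mapping; the dictionary below is a made-up placeholder for whatever save_state() returned, and write_checkpoint and read_checkpoint are hypothetical helper names.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict

# Hypothetical state mapping, standing in for the value returned by save_state().
state: Dict[str, Any] = {
    "type": "SocietyOfMindAgentState",
    "inner_team_state": {"turn": 3, "last_speaker": "reviewer"},
}


def write_checkpoint(path: Path, data: Dict[str, Any]) -> None:
    """Serialize the state mapping to a JSON file."""
    path.write_text(json.dumps(data, indent=2), encoding="utf-8")


def read_checkpoint(path: Path) -> Dict[str, Any]:
    """Load a state mapping previously written by write_checkpoint."""
    return json.loads(path.read_text(encoding="utf-8"))


with tempfile.TemporaryDirectory() as tmp:
    checkpoint = Path(tmp) / "society_state.json"
    write_checkpoint(checkpoint, state)
    restored = read_checkpoint(checkpoint)

print(restored == state)
```

The restored mapping is what would be handed to load_state() on a freshly constructed agent whose inner team has the same composition as the one that was saved.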

Summary

  • SocietyOfMindAgent wraps any AutoGen Team to create hierarchical agent societies that separate collaborative reasoning from final response generation.
  • The pattern implements a two-stage process: an inner team collaborates on the task, then a model client synthesizes the discussion into a coherent output.
  • Token efficiency is managed through optional ChatCompletionContext buffers that limit prompt size during synthesis.
  • State persistence via save_state() and load_state() enables checkpointing of complex inner-team discussions.
  • Source files are located in autogen_agentchat/agents/_society_of_mind_agent.py and related modules in the microsoft/autogen repository.

Frequently Asked Questions

How does SocietyOfMindAgent differ from a standard GroupChat?

A standard GroupChat exposes the internal conversation directly to the outer workflow. In contrast, SocietyOfMindAgent treats the inner team as a black box: it runs the team to completion, collects the full transcript, and uses a separate LLM call to generate a synthesized final response. This creates a cleaner abstraction boundary between collaborative reasoning and external communication.

Can I nest SocietyOfMindAgent inside another SocietyOfMindAgent?

Yes, the architecture supports arbitrary nesting. Since SocietyOfMindAgent inherits from BaseChatAgent and implements the standard agent interface, it can serve as a participant in any Team, including another SocietyOfMindAgent's inner team. This enables deeply hierarchical "societies of societies" for complex multi-level reasoning tasks.
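
The reason nesting works can be shown with a small structural sketch. EchoAgent and Society are hypothetical stand-ins, not AutoGen classes: the only point is that a wrapper exposing the same method as a leaf agent can itself be a member of another wrapper.

```python
from typing import List, Protocol


class Agent(Protocol):
    name: str

    def respond(self, task: str) -> str: ...


class EchoAgent:
    """Hypothetical leaf agent that tags the task with its own name."""

    def __init__(self, name: str) -> None:
        self.name = name

    def respond(self, task: str) -> str:
        return f"{self.name}({task})"


class Society:
    """Hypothetical wrapper: runs its members in order and returns one combined reply.

    It exposes the same respond() method as a leaf agent, so a Society can be
    a member of another Society.
    """

    def __init__(self, name: str, members: List[Agent]) -> None:
        self.name = name
        self.members = members

    def respond(self, task: str) -> str:
        transcript = [member.respond(task) for member in self.members]
        return f"{self.name}[" + " | ".join(transcript) + "]"


inner = Society("inner", [EchoAgent("writer"), EchoAgent("reviewer")])
outer = Society("outer", [inner, EchoAgent("translator")])
print(outer.respond("task"))
# outer[inner[writer(task) | reviewer(task)] | translator(task)]
```

Each extra level of nesting adds another synthesis step in the real pattern, so depth trades additional model calls for a tidier boundary between levels.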

What is the purpose of the model_context parameter?

The model_context parameter accepts a ChatCompletionContext instance that controls which messages are included when the outer LLM generates the final response. This prevents token limit errors during synthesis by automatically truncating or buffering the inner team's conversation history, ensuring the model client receives only the most relevant recent context.

How do I customize the synthesis prompt used by SocietyOfMindAgent?

The class defines the default constants DEFAULT_INSTRUCTION and DEFAULT_RESPONSE_PROMPT in _society_of_mind_agent.py. To customize synthesis, pass your own instruction and response_prompt arguments when constructing SocietyOfMindAgent; for deeper changes, subclass it and override the response generation logic. During synthesis, the instruction is placed before the inner team's message history and the response prompt after it, and the combined messages are passed to the model client's create method.
