# Implementing the Society of Mind Agent Pattern in Multi-Agent Systems with AutoGen

> Implement the Society of Mind agent pattern in multi-agent systems using AutoGen. Learn how SocietyOfMindAgent wraps teams and synthesizes discussions for better collaboration.

- Repository: [Microsoft/autogen](https://github.com/microsoft/autogen)
- Tags: how-to-guide
- Published: 2026-03-07

---

**The Society of Mind pattern in AutoGen uses the `SocietyOfMindAgent` class to wrap an inner team of collaborating agents, then synthesizes their discussion into a final response using a dedicated model client.**

Implementing the Society of Mind agent pattern in multi-agent systems enables complex reasoning by treating a single high-level agent as a collaborative team of specialized subordinates. In the Microsoft AutoGen framework, this architectural pattern is realized through the `SocietyOfMindAgent` class, which cleanly separates collaborative problem-solving from final answer generation.

## Core Architecture and Components

The implementation centers on the `SocietyOfMindAgent` class defined in [`_society_of_mind_agent.py`](https://github.com/microsoft/autogen/blob/main/python/packages/autogen-agentchat/src/autogen_agentchat/agents/_society_of_mind_agent.py). This class inherits from `BaseChatAgent` ([`_base_chat_agent.py`](https://github.com/microsoft/autogen/blob/main/python/packages/autogen-agentchat/src/autogen_agentchat/agents/_base_chat_agent.py)), so it conforms to the standard chat agent contract, including methods such as `on_messages` and `on_reset`.

Key components include:

- **Inner Team**: Any `Team` instance (typically `RoundRobinGroupChat` from [`_round_robin_group_chat.py`](https://github.com/microsoft/autogen/blob/main/python/packages/autogen-agentchat/src/autogen_agentchat/teams/_group_chat/_round_robin_group_chat.py)) that executes the multi-turn dialogue among subordinate agents.
- **Model Client**: A `ChatCompletionClient` instance that generates the final outer-world response by processing the aggregated inner-team transcript.
- **Model Context**: An optional `ChatCompletionContext` (such as `BufferedChatCompletionContext`) that stores recent messages to maintain token-efficient prompting, implemented in lines 47-54 of [`_society_of_mind_agent.py`](https://github.com/microsoft/autogen/blob/main/python/packages/autogen-agentchat/src/autogen_agentchat/agents/_society_of_mind_agent.py).
- **State Management**: The `SocietyOfMindAgentState` class ([`_states.py`](https://github.com/microsoft/autogen/blob/main/python/packages/autogen-agentchat/src/autogen_agentchat/state/_states.py)) handles serialization of the inner team's state for checkpointing.

## Execution Flow

When `SocietyOfMindAgent` receives a task, it orchestrates the inner team and synthesizes the result through the following steps defined in the `on_messages_stream` method:

1. **Task Ingestion**: Wraps incoming messages into `task_messages` and combines them with any existing model context.
2. **Inner Team Execution**: Invokes `self._team.run_stream(..., output_task_messages=False)` to execute the collaborative dialogue. All intermediate events are streamed, while the final `TaskResult` is captured separately.
3. **Message Collection**: Non-`TaskResult` events (the conversation transcript) are yielded directly and collected for the final synthesis.
4. **Response Generation**:
   - Places the configurable instruction (default `DEFAULT_INSTRUCTION`) before the transcript and the response prompt (default `DEFAULT_RESPONSE_PROMPT`) after it.
   - Converts each `BaseChatMessage` to an `LLMMessage` using `message.to_model_message()`.
   - Calls `self._model_client.create` to generate the final response.
5. **Result Yielding**: Returns a `Response` object containing a `TextMessage` with the LLM's completion.
6. **Context Update**: Adds all incoming messages to the model context via `_add_messages_to_context` for future turns.
7. **State Reset**: Calls `self._team.reset()` to ensure fresh state for the next outer-level task.

This flow cleanly separates **collaborative reasoning** (handled by the inner team) from **response presentation** (handled by the outer model client).
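The steps above can be sketched with a dependency-free toy model. Everything in it (`ToyTeam`, `toy_model`, `ToySociety`, and the two prompt strings) is a hypothetical stand-in rather than AutoGen's own code; it only mirrors the order of operations described in this section, and it simplifies step 1 by passing the task straight to the team.

```python
import asyncio
from dataclasses import dataclass, field
from typing import AsyncIterator, Callable, List, Tuple

Message = Tuple[str, str]  # (source, content); stand-in for a chat message

# Hypothetical prompt strings, not the library's defaults.
INSTRUCTION = "The transcript of the inner team follows."
RESPONSE_PROMPT = "Write the final answer for the user."


@dataclass
class ToyTeam:
    """Stand-in for an inner team: each member replies once, in order."""
    members: List[str]
    history: List[Message] = field(default_factory=list)

    async def run_stream(self, task: str) -> AsyncIterator[Message]:
        for name in self.members:
            await asyncio.sleep(0)  # yield control, as a streaming call would
            msg = (name, f"{name} worked on: {task}")
            self.history.append(msg)
            yield msg

    async def reset(self) -> None:
        self.history.clear()


def toy_model(prompt: List[Message]) -> str:
    """Stand-in for the model client: reports how many inner messages it saw."""
    transcript = [m for m in prompt if m[0] != "system"]
    return f"Summary of {len(transcript)} inner messages"


@dataclass
class ToySociety:
    team: ToyTeam
    model: Callable[[List[Message]], str]
    context: List[Message] = field(default_factory=list)

    async def on_task(self, task: str) -> Tuple[List[Message], str]:
        inner: List[Message] = []
        async for msg in self.team.run_stream(task):  # steps 2-3: run and collect
            inner.append(msg)
        # Step 4: instruction first, transcript in the middle, response prompt last.
        prompt = [("system", INSTRUCTION), *inner, ("system", RESPONSE_PROMPT)]
        answer = self.model(prompt)
        self.context.append(("user", task))  # step 6: remember the incoming task
        await self.team.reset()              # step 7: fresh inner team next time
        return inner, answer                 # step 5: hand back the result


async def demo() -> Tuple[ToySociety, Tuple[List[Message], str]]:
    society = ToySociety(ToyTeam(["writer", "reviewer"]), toy_model)
    result = await society.on_task("describe AutoGen")
    return society, result


society, (inner_messages, final_answer) = asyncio.run(demo())
print(final_answer)  # Summary of 2 inner messages
```

The outer caller only sees `final_answer`; the inner transcript is available for streaming or logging but does not have to be forwarded.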

## Implementation Examples

### Basic Society of Mind with Writer and Reviewer

This example demonstrates a nested architecture where an inner team of writer and reviewer agents collaborate until approval, then a `SocietyOfMindAgent` synthesizes the final output:

```python
import asyncio

from autogen_agentchat.agents import AssistantAgent, SocietyOfMindAgent
from autogen_agentchat.conditions import TextMentionTermination
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_agentchat.ui import Console
from autogen_ext.models.openai import OpenAIChatCompletionClient


async def main() -> None:
    model_client = OpenAIChatCompletionClient(model="gpt-4o")

    # Inner team: writer and reviewer alternate until "APPROVE" is seen
    writer = AssistantAgent(
        "writer",
        model_client=model_client,
        system_message="You are a writer. Produce a short paragraph.",
        model_client_stream=True,
    )
    reviewer = AssistantAgent(
        "reviewer",
        model_client=model_client,
        system_message="You are a reviewer. Critique the paragraph. Respond with 'APPROVE' when it is good enough.",
        model_client_stream=True,
    )
    inner_termination = TextMentionTermination("APPROVE")
    inner_team = RoundRobinGroupChat([writer, reviewer], termination_condition=inner_termination)

    # Outer SocietyOfMindAgent: synthesizes the inner discussion
    society = SocietyOfMindAgent(
        "society_of_mind",
        team=inner_team,
        model_client=model_client,
    )

    # Outer team that also translates the final answer
    translator = AssistantAgent(
        "translator",
        model_client=model_client,
        system_message="You translate English to Spanish.",
        model_client_stream=True,
    )
    outer_team = RoundRobinGroupChat([society, translator], max_turns=2)

    # Run the outer team and stream its output to the console
    await Console(outer_team.run_stream(task="Write a brief description of AutoGen and translate it to Spanish."))


asyncio.run(main())
```

*Key implementation details:*
- The **inner team** handles the iterative writing and review process.
- `SocietyOfMindAgent` receives the full transcript, applies the default instruction and response prompt, and generates a concise final answer.
- The **outer team** then passes this synthesized result to a translator agent.

### Optimizing Token Usage with Custom Model Context

For long-running inner team discussions, use a buffered context to prevent token limit errors:

```python
from autogen_agentchat.agents import SocietyOfMindAgent
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_core.model_context import BufferedChatCompletionContext
from autogen_ext.models.openai import OpenAIChatCompletionClient

model_client = OpenAIChatCompletionClient(model="gpt-4o")

# Keep only the last 10 messages in the prompt
model_ctx = BufferedChatCompletionContext(buffer_size=10)

society = SocietyOfMindAgent(
    "society_of_mind",
    team=RoundRobinGroupChat([...]),  # placeholder: supply your inner team's agents
    model_client=model_client,
    model_context=model_ctx,  # custom context for token efficiency
)
```

The `BufferedChatCompletionContext` class (imported from the `autogen_core.model_context` module) limits the prompt to the most recent messages, dropping older ones from what is sent to the model while preserving recent reasoning context.
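To make the buffering behavior concrete, here is a small dependency-free sketch of a keep-the-last-N message buffer. `ToyBufferedContext` and its `limit` parameter are hypothetical stand-ins, not the `autogen_core` class; the sketch only illustrates the idea of capping what reaches the model.

```python
from collections import deque
from typing import Deque, List


class ToyBufferedContext:
    """Hypothetical stand-in: retains only the most recent `limit` messages."""

    def __init__(self, limit: int) -> None:
        if limit <= 0:
            raise ValueError("limit must be positive")
        self._messages: Deque[str] = deque(maxlen=limit)

    def add_message(self, message: str) -> None:
        # When the deque is full, appending drops the oldest entry.
        self._messages.append(message)

    def get_messages(self) -> List[str]:
        return list(self._messages)


ctx = ToyBufferedContext(limit=3)
for i in range(1, 6):
    ctx.add_message(f"message {i}")

print(ctx.get_messages())  # ['message 3', 'message 4', 'message 5']
```

A fixed-size window is the simplest policy; it trades recall of early turns for a predictable prompt size.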

### Checkpointing and State Restoration

To persist agent state across sessions:

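```python
# Run inside an async function; `society` and `model_client` are the
# objects created in the earlier examples.

# Capture current state, including inner team progress
state = await society.save_state()  # serializable mapping built from SocietyOfMindAgentState

# ... persist to disk or database ...

# Later, recreate the agent with the same structure and restore
new_society = SocietyOfMindAgent(
    "society_of_mind",
    team=RoundRobinGroupChat([...]),  # placeholder: same inner team layout as before
    model_client=model_client,
)
await new_society.load_state(state)  # restores inner team state
```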

State management uses the `SocietyOfMindAgentState` class from [`_states.py`](https://github.com/microsoft/autogen/blob/main/python/packages/autogen-agentchat/src/autogen_agentchat/state/_states.py), enabling fault-tolerant multi-agent workflows.
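A save and restore round trip of this kind can be sketched without the library. The `ToyAgent` class and the `inner_team_state` key below are illustrative assumptions; the point is only that the saved state is plain data that survives JSON serialization and can be loaded into a freshly constructed agent.

```python
import json
from typing import Any, Dict, List


class ToyAgent:
    """Hypothetical agent whose only state is its inner team's message log."""

    def __init__(self) -> None:
        self.inner_log: List[str] = []

    def save_state(self) -> Dict[str, Any]:
        # Copy the log so later mutation does not alter the checkpoint.
        return {"inner_team_state": {"log": list(self.inner_log)}}

    def load_state(self, state: Dict[str, Any]) -> None:
        self.inner_log = list(state["inner_team_state"]["log"])


agent = ToyAgent()
agent.inner_log += ["writer: draft v1", "reviewer: APPROVE"]

blob = json.dumps(agent.save_state())  # persist to disk or a database
restored = ToyAgent()                  # later, possibly in a new process
restored.load_state(json.loads(blob))

print(restored.inner_log == agent.inner_log)  # True
```

Restoring into a new instance only works if that instance is built with the same structure as the one that was saved, which is why the example above recreates the agent before calling `load_state`.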

## Summary

- **SocietyOfMindAgent** wraps any AutoGen `Team` to create hierarchical agent societies that separate collaborative reasoning from final response generation.
- The pattern implements a **two-stage process**: an inner team collaborates on the task, then a model client synthesizes the discussion into a coherent output.
- **Token efficiency** is managed through optional `ChatCompletionContext` buffers that limit prompt size during synthesis.
- **State persistence** via `save_state()` and `load_state()` enables checkpointing of complex inner-team discussions.
- Source files are located in [`autogen_agentchat/agents/_society_of_mind_agent.py`](https://github.com/microsoft/autogen/blob/main/python/packages/autogen-agentchat/src/autogen_agentchat/agents/_society_of_mind_agent.py) and related modules in the [microsoft/autogen](https://github.com/microsoft/autogen) repository.

## Frequently Asked Questions

### How does SocietyOfMindAgent differ from a standard GroupChat?

A standard group chat team such as `RoundRobinGroupChat` exposes the internal conversation directly to the outer workflow. In contrast, `SocietyOfMindAgent` treats the inner team as a black box: it runs the team to completion, collects the full transcript, and uses a separate LLM call to generate a synthesized final response. This creates a cleaner abstraction boundary between collaborative reasoning and external communication.

### Can I nest SocietyOfMindAgent inside another SocietyOfMindAgent?

Yes, the architecture supports arbitrary nesting. Since `SocietyOfMindAgent` inherits from `BaseChatAgent` and implements the standard agent interface, it can serve as a participant in any `Team`, including another `SocietyOfMindAgent`'s inner team. This enables deeply hierarchical "societies of societies" for complex multi-level reasoning tasks.
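The nesting argument follows from the fact that a society exposes the same interface as any other agent. A dependency-free sketch, with hypothetical `Leaf` and `Society` classes standing in for AutoGen agents and string joining standing in for the synthesis call:

```python
from dataclasses import dataclass
from typing import List, Protocol


class Agent(Protocol):
    """Minimal shared interface: anything with a name that can respond."""
    name: str

    def respond(self, task: str) -> str: ...


@dataclass
class Leaf:
    name: str

    def respond(self, task: str) -> str:
        return f"{self.name}({task})"


@dataclass
class Society:
    """Wraps a team of agents and is itself an agent, so it can be nested."""
    name: str
    members: List[Agent]

    def respond(self, task: str) -> str:
        inner = [member.respond(task) for member in self.members]
        # Stand-in for the synthesis step: join the inner replies.
        return f"{self.name}[" + " + ".join(inner) + "]"


research = Society("research", [Leaf("searcher"), Leaf("critic")])
top = Society("top", [research, Leaf("editor")])

print(top.respond("q"))  # top[research[searcher(q) + critic(q)] + editor(q)]
```

Each level adds one synthesis step, so in a real system deeper nesting means more model calls per outer task.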

### What is the purpose of the model_context parameter?

The `model_context` parameter accepts a `ChatCompletionContext` instance that controls which messages are included when the outer LLM generates the final response. This prevents token limit errors during synthesis by automatically truncating or buffering the inner team's conversation history, ensuring the model client receives only the most relevant recent context.

### How do I customize the synthesis prompt used by SocietyOfMindAgent?

The defaults are the constants `DEFAULT_INSTRUCTION` and `DEFAULT_RESPONSE_PROMPT` defined in [`_society_of_mind_agent.py`](https://github.com/microsoft/autogen/blob/main/python/packages/autogen-agentchat/src/autogen_agentchat/agents/_society_of_mind_agent.py). To change them, pass your own strings through the constructor's `instruction` and `response_prompt` arguments; subclassing `SocietyOfMindAgent` and overriding the response generation logic is only needed for deeper changes. During synthesis, the instruction is placed before the inner team's message history and the response prompt after it, and the resulting message list is passed to the model client's `create` method.