
How Mem0's Multi-Level Memory Architecture Manages User, Session, and Agent State

March 7, 2026 mem0ai/mem0 ↗

Mem0 enforces strict isolation between users, agents, and sessions by automatically injecting user_id, agent_id, and run_id into every memory operation through a centralized filtering system.

Mem0 (the mem0 library) implements a multi-level memory architecture that scopes every stored fact to one or more session identifiers: user_id, agent_id, and run_id. This design keeps memories that belong to different users, AI agents, or conversation runs isolated from one another while they share the same underlying vector store infrastructure.

The Core Filtering Mechanism: `_build_filters_and_metadata`

At the heart of Mem0's session management lies the _build_filters_and_metadata helper function in mem0/memory/main.py. This utility constructs the metadata payload for storage and the query filters for retrieval based on the three primary session identifiers: user_id, agent_id, and run_id.


# mem0/memory/main.py
from copy import deepcopy
from typing import Any, Dict

def _build_filters_and_metadata(
    *, user_id=None, agent_id=None, run_id=None,
    actor_id=None, input_metadata=None, input_filters=None,
) -> tuple[Dict[str, Any], Dict[str, Any]]:
    """
    Constructs metadata for storage and filters for querying based on session
    and actor identifiers.
    """
    base_metadata_template = deepcopy(input_metadata) if input_metadata else {}
    effective_query_filters = deepcopy(input_filters) if input_filters else {}

    # ----- add all provided session ids -----

    if user_id:
        base_metadata_template["user_id"] = user_id
        effective_query_filters["user_id"] = user_id
    if agent_id:
        base_metadata_template["agent_id"] = agent_id
        effective_query_filters["agent_id"] = agent_id
    if run_id:
        base_metadata_template["run_id"] = run_id
        effective_query_filters["run_id"] = run_id
    # … (validation & optional actor handling)

    …
    return base_metadata_template, effective_query_filters

The function returns two critical dictionaries:

base_metadata_template — Attached to new memories when storing data, ensuring the session context is permanently recorded.
effective_query_filters — Applied to vector store queries, guaranteeing that retrieval operations only return memories belonging to the specified session scope.

If no session identifier is supplied, the function raises a Mem0ValidationError, enforcing that every memory belongs to a well-defined scope.
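To make that contract concrete, here is a minimal, self-contained sketch of the same idea. It is not the library's code: build_scope is a hypothetical name, the actor handling is left out, and a plain ValueError stands in for Mem0ValidationError.

```python
from copy import deepcopy
from typing import Any, Dict, Optional, Tuple


def build_scope(
    *,
    user_id: Optional[str] = None,
    agent_id: Optional[str] = None,
    run_id: Optional[str] = None,
    input_metadata: Optional[Dict[str, Any]] = None,
    input_filters: Optional[Dict[str, Any]] = None,
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Hypothetical, simplified stand-in for _build_filters_and_metadata."""
    # Copy caller-supplied dicts so the caller's objects are never mutated.
    metadata = deepcopy(input_metadata) if input_metadata else {}
    filters = deepcopy(input_filters) if input_filters else {}

    session_ids = {"user_id": user_id, "agent_id": agent_id, "run_id": run_id}
    provided = {key: value for key, value in session_ids.items() if value}
    if not provided:
        # Assumption: ValueError stands in for Mem0ValidationError.
        raise ValueError("At least one of user_id, agent_id, or run_id is required.")

    # The same identifiers go into both the storage metadata and the query filters.
    metadata.update(provided)
    filters.update(provided)
    return metadata, filters


metadata, filters = build_scope(user_id="user-42", input_metadata={"source": "chat"})
print(metadata)  # {'source': 'chat', 'user_id': 'user-42'}
print(filters)   # {'user_id': 'user-42'}
```

Writing the identifiers into both dictionaries in one place is what keeps the write path and the read path from drifting apart.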

Storing Session-Aware Memories

How `add()` Injects Session Context

The add() method serves as the primary entry point for creating memories. It invokes _build_filters_and_metadata to prepare the session context before passing data downstream.


# mem0/memory/main.py (excerpt)

def add(self, messages, *, user_id=None, agent_id=None, run_id=None,
        metadata=None, infer=True, memory_type=None, prompt=None):
    processed_metadata, effective_filters = _build_filters_and_metadata(
        user_id=user_id,
        agent_id=agent_id,
        run_id=run_id,
        input_metadata=metadata,
    )
    …
    # The rest of the workflow (fact extraction, vector store insert, etc.)

The processed_metadata dictionary flows into _add_to_vector_store, where each memory receives the session identifiers in its payload. When graph storage is enabled, the same filters are passed to _add_to_graph, which injects a default user_id="user" if the field is missing.
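The graph fallback described above can be pictured as a small defaulting step. The helper below is hypothetical (with_graph_default_user is not a Mem0 function) and only illustrates the behavior of filling in user_id="user" when the caller did not scope the call to a user.

```python
from typing import Any, Dict


def with_graph_default_user(filters: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of filters with user_id defaulted to "user" when missing.

    Hypothetical helper; it only illustrates the fallback described in the text.
    """
    graph_filters = dict(filters)  # shallow copy, so the caller's dict is untouched
    if graph_filters.get("user_id") is None:
        graph_filters["user_id"] = "user"
    return graph_filters


print(with_graph_default_user({"agent_id": "weather-bot"}))
# {'agent_id': 'weather-bot', 'user_id': 'user'}
print(with_graph_default_user({"user_id": "user-42"}))
# {'user_id': 'user-42'}
```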

Persisting Identifiers in the Vector Store

During the final persistence step, _create_memory embeds the session context directly into the vector store payload:


# mem0/memory/main.py

def _create_memory(self, data, existing_embeddings, metadata=None):
    …
    metadata = metadata or {}
    metadata["data"] = data
    metadata["hash"] = hashlib.md5(data.encode()).hexdigest()
    metadata["created_at"] = datetime.now(pytz.timezone("US/Pacific")).isoformat()
    self.vector_store.insert(vectors=[embeddings], ids=[memory_id], payloads=[metadata])

Because metadata already contains user_id, agent_id, and/or run_id from the earlier filtering step, every vector store entry is permanently session-scoped.
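The persistence step can be tried without a real backend. The sketch below is an illustration under stated assumptions: InMemoryStore is a toy stand-in for a vector store, create_memory is a free function rather than the library's method, the embedding is passed in directly, and UTC replaces the US/Pacific timestamp from the excerpt.

```python
import hashlib
import uuid
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


class InMemoryStore:
    """Toy stand-in for a vector store: keeps id, vector, and payload in a list."""

    def __init__(self) -> None:
        self.rows: List[Dict[str, Any]] = []

    def insert(self, vectors, ids, payloads) -> None:
        for vector, row_id, payload in zip(vectors, ids, payloads):
            self.rows.append({"id": row_id, "vector": vector, "payload": payload})


def create_memory(
    store: InMemoryStore,
    data: str,
    embedding: List[float],
    metadata: Optional[Dict[str, Any]] = None,
) -> str:
    """Store one memory; session identifiers arrive already set in metadata."""
    memory_id = str(uuid.uuid4())
    payload = dict(metadata or {})
    payload["data"] = data
    payload["hash"] = hashlib.md5(data.encode()).hexdigest()
    # Assumption: UTC for simplicity; the excerpt above uses US/Pacific.
    payload["created_at"] = datetime.now(timezone.utc).isoformat()
    store.insert(vectors=[embedding], ids=[memory_id], payloads=[payload])
    return memory_id


store = InMemoryStore()
create_memory(store, "Loves hiking", [0.1, 0.9], {"user_id": "user-42"})
print(store.rows[0]["payload"]["user_id"])  # user-42
```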

Querying with Session Isolation

Read operations that accept session identifiers, such as get_all and search, use the same _build_filters_and_metadata function to construct query filters. For example, the search method composes filters that match the exact session identifiers provided:


# mem0/memory/main.py (search excerpt)

_, effective_filters = _build_filters_and_metadata(
    user_id=user_id, agent_id=agent_id, run_id=run_id, input_filters=filters
)
…
memories = self.vector_store.search(query=query, vectors=embeddings,
                                   limit=limit, filters=effective_filters)

The underlying vector store implementation—whether FAISS, Qdrant, Pinecone, or another backend—receives these filters, ensuring that only memories belonging to the same user, agent, or run are returned.
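What a backend does with those filters differs per engine, since each one translates them into its own query language. As a rough mental model only, the sketch below applies exact-match filtering to payloads before ranking by cosine similarity. The function names and the row layout are assumptions made for this illustration, not any backend's API.

```python
import math
from typing import Any, Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def filtered_search(
    rows: List[Dict[str, Any]],
    query_vector: Sequence[float],
    filters: Dict[str, Any],
    limit: int = 5,
) -> List[Dict[str, Any]]:
    """Rank only the rows whose payload equals every key/value pair in filters."""
    candidates = [
        row for row in rows
        if all(row["payload"].get(key) == value for key, value in filters.items())
    ]
    candidates.sort(key=lambda row: cosine(row["vector"], query_vector), reverse=True)
    return candidates[:limit]


rows = [
    {"vector": [1.0, 0.0], "payload": {"user_id": "alice", "data": "Likes hiking"}},
    {"vector": [1.0, 0.0], "payload": {"user_id": "bob", "data": "Likes hiking too"}},
    {"vector": [0.0, 1.0], "payload": {"user_id": "alice", "data": "Vegetarian"}},
]
hits = filtered_search(rows, [1.0, 0.1], {"user_id": "alice"})
print([hit["payload"]["data"] for hit in hits])  # ['Likes hiking', 'Vegetarian']
```

Bob's row is an equally good vector match for the query, yet it never reaches the ranking step because it fails the filter first.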

Agent vs. User Memory Extraction

Mem0 distinguishes between user-centric factual memories and agent-centric procedural memories through the _should_use_agent_memory_extraction method:

def _should_use_agent_memory_extraction(self, messages, metadata):
    # Use agent extraction if an agent_id is present AND there are assistant messages

    has_agent_id = metadata.get("agent_id") is not None
    has_assistant_messages = any(msg.get("role") == "assistant" for msg in messages)
    return has_agent_id and has_assistant_messages

When this check returns True, the LLM receives prompts optimized for extracting agent state—such as system instructions and tool usage patterns—rather than generic user facts. This branching logic ensures that agent state remains logically separated from user conversational history, even when both are stored in the same vector database.
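The check is easy to exercise in isolation. The snippet below restates it as a free function (the original is a method) and runs it on three small inputs; the sample messages are invented for illustration.

```python
from typing import Any, Dict, List


def should_use_agent_memory_extraction(
    messages: List[Dict[str, str]], metadata: Dict[str, Any]
) -> bool:
    """Free-function copy of the check shown above, for experimentation."""
    has_agent_id = metadata.get("agent_id") is not None
    has_assistant_messages = any(msg.get("role") == "assistant" for msg in messages)
    return has_agent_id and has_assistant_messages


user_only = [{"role": "user", "content": "I prefer metric units."}]
with_assistant = user_only + [{"role": "assistant", "content": "Noted, using Celsius."}]

# agent_id present and an assistant message exists
print(should_use_agent_memory_extraction(with_assistant, {"agent_id": "weather-bot"}))  # True
# agent_id present but no assistant message
print(should_use_agent_memory_extraction(user_only, {"agent_id": "weather-bot"}))  # False
# assistant message exists but no agent_id
print(should_use_agent_memory_extraction(with_assistant, {"user_id": "user-42"}))  # False
```

Both conditions must hold, so a user-scoped chat that happens to contain assistant replies still goes through the user-fact path.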

Multi-Level Memory Storage Overview

Mem0's architecture organizes memory into four distinct levels, each respecting the same session scoping rules:

Short-term Memory

Storage: In-memory LLM processing (no persistence)
Scope: Raw messages during a single request, determined by the metadata passed to add

Long-term Vector Store

Storage: FAISS, Qdrant, Pinecone, or other backends (mem0/vector_stores/*)
Content: Embeddings plus payloads containing data, hash, timestamps, user_id, agent_id, run_id, actor_id, and custom metadata
Isolation: Enforced by filters built via _build_filters_and_metadata

Graph Store (Optional)

Storage: Neo4j, Neptune, or other graph databases (mem0/graphs/*)
Content: Structured relations between extracted entities
Scope: Same session filters apply, with a default user_id="user" when absent

Procedural Memory

Storage: Same vector store with memory_type="procedural_memory"
Content: Conversation summaries and agent behavior patterns
Requirement: Tied to an agent_id; created via _create_procedural_memory

Practical Examples

Adding User-Scoped Memories

from mem0 import Memory

mem = Memory()                       # uses default config

result = mem.add(
    messages="I love hiking in the mountains.",
    user_id="user-42",
    metadata={"source": "chat"},
)
print(result)   # → {"results": [{"id": "...", "memory": "...", "event": "ADD"}]}

Under the hood, _build_filters_and_metadata adds "user_id": "user-42" to the metadata payload, and _create_memory stores it in the vector store. A subsequent search(query="hiking", user_id="user-42") is restricted to memories stored under that user, so this memory is only ever returned to queries scoped to user-42.

Adding Agent Procedural Memories

mem.add(
    messages=[
        {"role": "assistant", "content": "Sure, here's the plan..."},
        {"role": "assistant", "content": "Step 2: ..."},
    ],
    agent_id="weather-bot",
    run_id="run-2024-03-07-001",
    memory_type="procedural_memory",
)

Because memory_type is set to "procedural_memory", the library routes this call to _create_procedural_memory, which generates an LLM summary and stores it with both agent_id and run_id attached. Later searches scoped with agent_id="weather-bot" can surface this procedural context alongside other memories stored for that agent.
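The routing decision can be summarized as a small dispatcher. This is a sketch under assumptions: choose_add_path is a hypothetical name, it returns a label instead of doing any work, and rejecting a procedural request that lacks an agent_id is a choice made for this illustration rather than confirmed library behavior.

```python
from typing import Optional

PROCEDURAL = "procedural_memory"


def choose_add_path(
    *, agent_id: Optional[str] = None, memory_type: Optional[str] = None
) -> str:
    """Hypothetical dispatcher: name the path an add() call would take."""
    if memory_type is not None and memory_type != PROCEDURAL:
        raise ValueError(f"Unsupported memory_type: {memory_type!r}")
    if memory_type == PROCEDURAL:
        if agent_id is None:
            # Assumption: this sketch rejects procedural memory without an agent.
            raise ValueError("Procedural memory requires an agent_id.")
        return "procedural"  # summarised by the LLM, stored with memory_type set
    return "factual"  # regular fact extraction


print(choose_add_path(agent_id="weather-bot", memory_type="procedural_memory"))  # procedural
print(choose_add_path(agent_id="weather-bot"))  # factual
```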

Querying with Mixed Identifiers


# Find memories belonging to a specific user AND run

results = mem.search(
    query="hiking",
    user_id="user-42",
    run_id="run-2024-03-07-001",
    limit=5,
)

The generated filter sent to the vector store contains:

{
  "user_id": "user-42",
  "run_id": "run-2024-03-07-001"
}

Only memories matching both constraints are returned, preventing cross-contamination between different conversation runs.

Summary

Centralized scoping: The _build_filters_and_metadata function in mem0/memory/main.py ensures every operation respects user_id, agent_id, and run_id.
Automatic validation: Mem0 raises Mem0ValidationError if no session identifiers are provided, guaranteeing memory isolation.
Storage agnostic: Whether the backend is FAISS, Qdrant, Pinecone, or a graph store such as Neo4j, it receives the same session identifiers, as payload metadata on writes and as filters on reads.
Agent-aware extraction: The _should_use_agent_memory_extraction method switches to agent-oriented extraction prompts when an agent_id and assistant messages are both present.
Zero cross-talk: Query filters strictly match stored metadata, ensuring users, agents, and runs remain isolated while sharing infrastructure.

Frequently Asked Questions

How does Mem0 prevent cross-talk between different users?

Mem0 prevents cross-talk by requiring at least one of user_id, agent_id, or run_id for every scoped memory operation. The _build_filters_and_metadata function automatically injects these identifiers into both the storage metadata and the query filters. When you call search(query="hiking", user_id="alice"), the vector store receives the filter {"user_id": "alice"}, so only memories stored under Alice's user_id are retrieved.

What is the difference between run_id and agent_id in Mem0?

The agent_id identifies a persistent AI agent or bot (e.g., "weather-bot"), while the run_id identifies a specific conversation instance or execution context (e.g., "session-2024-03-07"). Multiple runs can belong to the same agent, allowing you to query either across all runs of an agent (using agent_id only) or within a specific session (using both agent_id and run_id).

Can I query memories across multiple sessions?

By default, Mem0 queries are scoped to the specific identifiers you provide. To query across sessions, you would omit the restrictive identifiers. For example, calling search(query="hiking", user_id="user-42") without specifying a run_id returns all memories for that user across every run. However, you cannot query across different user_id values within a single call, as this would violate the architectural isolation guarantees.
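The effect of adding or omitting an identifier can be seen with a toy matcher over stored payloads. The match function and the sample payloads below are invented for this illustration and assume simple exact-match semantics for every identifier supplied.

```python
from typing import Any, Dict, List


def match(payloads: List[Dict[str, Any]], **scope: str) -> List[str]:
    """Return the data of every payload that equals all identifiers in scope."""
    return [
        payload["data"] for payload in payloads
        if all(payload.get(key) == value for key, value in scope.items())
    ]


payloads = [
    {"user_id": "user-42", "run_id": "run-1", "data": "Planning a hike"},
    {"user_id": "user-42", "run_id": "run-2", "data": "Bought new boots"},
    {"user_id": "user-7", "run_id": "run-3", "data": "Prefers cycling"},
]

# Both identifiers: a single run of a single user.
print(match(payloads, user_id="user-42", run_id="run-1"))  # ['Planning a hike']
# user_id only: every run that belongs to that user.
print(match(payloads, user_id="user-42"))  # ['Planning a hike', 'Bought new boots']
```

Each extra identifier narrows the result set; dropping run_id widens it to the whole user without ever reaching another user's rows.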

How does procedural memory differ from regular factual memory?

Procedural memory captures agent behavior patterns, system instructions, and conversation summaries, and requires an agent_id and the memory_type="procedural_memory" parameter. Factual memory stores user-specific facts extracted from conversations. The _should_use_agent_memory_extraction method determines which extraction path to use based on the presence of agent_id and assistant messages in the conversation history.
