How to Integrate MCP Memory Service with LangGraph for Persistent Agent State

You integrate MCP Memory Service with LangGraph by running the standalone HTTP server and wiring its REST endpoints into your agents as LangChain tools or explicit StateGraph nodes, replacing the ephemeral MemorySaver with durable, cross-graph persistent storage.

The doobidoo/mcp-memory-service repository provides a standalone memory store that exposes a REST API for creating, searching, and managing long-term memories. Routing memory operations through these HTTP endpoints, rather than through LangGraph's in-process MemorySaver, keeps agent state available after the graph process terminates.

Why LangGraph Needs External Memory Persistence

LangGraph's built-in MemorySaver keeps state only inside the process that runs the StateGraph, making it unsuitable for durable persistence or multi-agent collaboration. MCP Memory Service addresses these limitations by running as a standalone HTTP server backed by SQLite or another database.

Key differences between LangGraph MemorySaver and MCP Memory Service:

  • Persistence across runs: MemorySaver is ephemeral (process-bound), while MCP Memory Service writes to disk/DB for durability.
  • Sharing between StateGraphs: MemorySaver isolates memory per process; the service enables cross-graph sharing via network requests.
  • Scalability: MemorySaver is in-process only; the service is network-accessible for multi-host deployments.
  • Metadata scoping: MemorySaver offers limited tagging; the service supports rich metadata including tags, conversation_id, and memory_type.

Start the MCP Memory Service Server

Begin by installing the package and launching the HTTP server. According to the integration guide in docs/agents/langgraph.md, the server exposes a REST API on port 8000 by default.

pip install mcp-memory-service httpx
MCP_ALLOW_ANONYMOUS_ACCESS=true memory server --http

The memory server --http command (defined in run_server.py) starts the service. Set MCP_ALLOW_ANONYMOUS_ACCESS=true for local development to skip authentication.
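
Before wiring the service into a graph, it can help to confirm that something is actually listening on the port. The following is a minimal sketch using only the standard library; `wait_for_service` is a hypothetical helper name, and it only checks that an HTTP server answers at the base URL (any status code counts), not that a specific endpoint exists.

```python
import time
import urllib.error
import urllib.request

# Bypass any system-wide HTTP proxy: the service is assumed to run locally.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def wait_for_service(base_url: str, attempts: int = 10, delay: float = 0.5) -> bool:
    """Return True as soon as an HTTP server answers at base_url, else False."""
    for attempt in range(attempts):
        try:
            _opener.open(base_url, timeout=2).close()
            return True
        except urllib.error.HTTPError:
            # The server replied with a 4xx/5xx status, so it is up and listening.
            return True
        except OSError:
            # Connection refused, timeout, DNS failure: wait and try again.
            if attempt < attempts - 1:
                time.sleep(delay)
    return False
```

A startup script could call `wait_for_service("http://localhost:8000")` and abort early if it returns False, rather than failing later inside a tool call.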

Method 1: Memory Tools for ReAct Agents

The fastest way to add persistence is by wrapping MCP Memory Service calls as LangChain tools and passing them to LangGraph's prebuilt ReAct agent constructor.

Define search_memory and store_memory tools that call the service endpoints:

import httpx
from langchain_core.tools import tool

MEMORY_URL = "http://localhost:8000"

@tool
async def search_memory(query: str) -> str:
    """Search long-term memory for relevant context."""
    async with httpx.AsyncClient() as client:
        response = await client.post(
            f"{MEMORY_URL}/api/memories/search",
            json={"query": query, "limit": 5},
        )
        response.raise_for_status()  # surface HTTP errors instead of a KeyError below
        memories = response.json()["memories"]
        if not memories:
            return "No relevant memories found."
        return "\n".join(f"- {m['content']}" for m in memories)

@tool
async def store_memory(content: str, tags: list[str] | None = None) -> str:
    """Store a new memory for future retrieval."""
    async with httpx.AsyncClient() as client:
        response = await client.post(
            f"{MEMORY_URL}/api/memories",
            json={"content": content, "tags": tags or []},
        )
        response.raise_for_status()  # fail loudly if the service rejects the write
        result = response.json()
        return f"Stored memory: {result.get('content_hash', 'unknown')}"

Pass these tools to create_react_agent as shown in docs/agents/langgraph.md:

from langgraph.prebuilt import create_react_agent
from langchain_anthropic import ChatAnthropic

llm = ChatAnthropic(model="claude-sonnet-4-6")
agent = create_react_agent(
    llm,
    tools=[search_memory, store_memory],
    state_modifier="You have access to long-term memory. Search memory before answering. Store important findings."
)

The agent is now prompted to call search_memory before answering and store_memory to persist findings. Whether it does so on a given turn is up to the model, but everything it stores is written durably to the MCP service.
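
The formatting step inside search_memory can be pulled out into a pure function, which makes it testable without a running server. This is a sketch under the assumption that the search response is a JSON object with a "memories" list whose items carry a "content" field, as in the tool above; `format_memories` and its `max_chars` parameter are hypothetical names introduced for illustration.

```python
def format_memories(payload: dict, max_chars: int = 200) -> str:
    """Render a search response as a bulleted list the LLM can read."""
    memories = payload.get("memories", [])
    if not memories:
        return "No relevant memories found."

    lines = []
    for memory in memories:
        content = str(memory.get("content", "")).strip()
        if len(content) > max_chars:
            # Keep tool output short so it does not crowd the context window.
            content = content[:max_chars].rstrip() + "..."
        lines.append(f"- {content}")
    return "\n".join(lines)
```

Inside the tool, the last three lines would then reduce to `return format_memories(response.json())`.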

Method 2: Custom Memory Nodes in StateGraph

For fine-grained control over when and how memory is retrieved or stored, define explicit nodes in a custom StateGraph. This pattern, documented in docs/agents/langgraph.md, lets you inject memory into system prompts and tag entries per-agent.

Define your state schema and memory nodes:

import httpx
from typing import TypedDict, Annotated
from langgraph.graph import StateGraph, END
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_anthropic import ChatAnthropic

MEMORY_URL = "http://localhost:8000"

class AgentState(TypedDict):
    messages: Annotated[list, "message history"]
    memory_context: str
    agent_id: str

async def retrieve_memory_node(state: AgentState) -> dict:
    """Pull relevant memories before the LLM call."""
    last_message = state["messages"][-1]
    query = last_message.content if hasattr(last_message, "content") else str(last_message)

    async with httpx.AsyncClient() as client:
        response = await client.post(
            f"{MEMORY_URL}/api/memories/search",
            json={
                "query": query,
                "limit": 5,
                "tags": [f"agent:{state['agent_id']}"],  # per-agent scoping
            },
        )
        memories = response.json().get("memories", [])

    context = (
        "Relevant memory:\n" + "\n".join(f"- {m['content']}" for m in memories)
        if memories else ""
    )
    return {"memory_context": context}

async def llm_node(state: AgentState) -> dict:
    """Call the LLM with the injected memory context."""
    llm = ChatAnthropic(model="claude-sonnet-4-6")
    system_parts = ["You are a helpful assistant."]
    if state.get("memory_context"):
        system_parts.append(state["memory_context"])

    messages = [SystemMessage(content="\n\n".join(system_parts))] + state["messages"]
    response = await llm.ainvoke(messages)
    return {"messages": state["messages"] + [response]}

async def store_memory_node(state: AgentState) -> dict:
    """Persist the LLM's response for later reuse."""
    last_response = state["messages"][-1]
    content = last_response.content if hasattr(last_response, "content") else str(last_response)

    async with httpx.AsyncClient() as client:
        await client.post(
            f"{MEMORY_URL}/api/memories",
            json={
                "content": content[:500],  # truncate to the first 500 characters
                "tags": [f"agent:{state['agent_id']}", "llm-response"],
                "memory_type": "observation",
            },
            headers={"X-Agent-ID": state["agent_id"]},
        )
    return {}
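
The request bodies used by the two memory nodes follow one tagging convention, so they can be factored into small pure functions. This is an illustrative sketch: the helper names are hypothetical, and the field names (query, limit, tags, content, memory_type) are simply the ones used in the nodes above.

```python
def agent_tags(agent_id: str, *extra: str) -> list[str]:
    """Build the tag list used for per-agent scoping, e.g. 'agent:researcher'."""
    return [f"agent:{agent_id}", *extra]


def build_search_payload(query: str, agent_id: str, limit: int = 5) -> dict:
    """Body for POST /api/memories/search, restricted to one agent's memories."""
    return {"query": query, "limit": limit, "tags": agent_tags(agent_id)}


def build_store_payload(
    content: str,
    agent_id: str,
    memory_type: str = "observation",
    max_chars: int = 500,
) -> dict:
    """Body for POST /api/memories; content is truncated, not summarized."""
    return {
        "content": content[:max_chars],
        "tags": agent_tags(agent_id, "llm-response"),
        "memory_type": memory_type,
    }
```

Keeping the tag format in one place reduces the chance that a store node writes "agent:researcher" while a search node looks for a slightly different spelling.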

Compile the graph with explicit edges:

graph = StateGraph(AgentState)
graph.add_node("retrieve_memory", retrieve_memory_node)
graph.add_node("llm", llm_node)
graph.add_node("store_memory", store_memory_node)

graph.set_entry_point("retrieve_memory")
graph.add_edge("retrieve_memory", "llm")
graph.add_edge("llm", "store_memory")
graph.add_edge("store_memory", END)

agent = graph.compile()

This implementation stores memories tagged with the specific agent_id, enabling precise scoping and retrieval control.
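
To make the data flow through the three nodes concrete, here is a plain-asyncio imitation of the chain with stub nodes. It is not LangGraph's implementation: `run_chain` is a hypothetical stand-in that merges each node's returned dict into the state by overwriting keys, which mirrors how state keys without a reducer behave in a StateGraph.

```python
import asyncio


async def retrieve_memory(state: dict) -> dict:
    # Stub: pretend the service returned one memory for this agent.
    return {"memory_context": f"Relevant memory:\n- note for {state['agent_id']}"}


async def llm(state: dict) -> dict:
    # Stub: echo instead of calling a model.
    reply = f"(reply using {len(state['memory_context'])} chars of context)"
    return {"messages": state["messages"] + [reply]}


async def store_memory(state: dict) -> dict:
    # Side effect only; an empty update leaves the state untouched.
    return {}


async def run_chain(state: dict, nodes: list) -> dict:
    """Run nodes in order, merging each partial update into the state."""
    for node in nodes:
        update = await node(state)
        state = {**state, **update}
    return state


final_state = asyncio.run(
    run_chain(
        {"messages": ["What did we decide?"], "memory_context": "", "agent_id": "demo"},
        [retrieve_memory, llm, store_memory],
    )
)
```

With the compiled graph, the equivalent call is `await agent.ainvoke(...)` with the same three keys: messages, memory_context, and agent_id.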

Cross-Graph Memory Sharing

Because all graphs communicate with the same HTTP endpoint, you can share memories between different StateGraph instances by using shared tags. As documented in docs/agents/langgraph.md, one agent can write memories that another reads.

# Researcher graph stores memories tagged "agent:researcher"
researcher_result = await researcher_agent.ainvoke({
    "messages": [HumanMessage(content="Research API rate limits")],
    "memory_context": "",
    "agent_id": "researcher",
})

# Writer graph retrieves those memories via the shared tag
async with httpx.AsyncClient() as client:
    response = await client.post(
        f"{MEMORY_URL}/api/memories/search",
        json={"query": "API rate limits", "tags": ["agent:researcher"]},
    )
    shared_context = response.json()["memories"]

writer_result = await writer_agent.ainvoke({
    "messages": [HumanMessage(content="Write a summary of API limits")],
    "memory_context": "\n".join(m["content"] for m in shared_context),
    "agent_id": "writer",
})

This pattern enables multi-agent pipelines where specialized graphs contribute to a collective knowledge base.
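
For unit tests of a multi-agent pipeline, a small in-memory stand-in can replace the HTTP calls. The class below is a hypothetical test double, not the service's implementation: it assumes a record matches when it carries every requested tag and contains at least one query word, whereas the real service's matching and ranking will differ.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class InMemoryStore:
    """Test double imitating tag-scoped store/search (assumed semantics)."""

    records: list = field(default_factory=list)

    def store(self, content: str, tags: list) -> None:
        self.records.append({"content": content, "tags": list(tags)})

    def search(self, query: str, tags: Optional[list] = None) -> list:
        words = query.lower().split()
        wanted = set(tags or [])
        return [
            record
            for record in self.records
            # Assumption: all requested tags must be present on the record.
            if wanted <= set(record["tags"])
            and any(word in record["content"].lower() for word in words)
        ]


store = InMemoryStore()
# The "researcher" writes a finding under its own tag...
store.store("API rate limits are 100 requests per minute", ["agent:researcher"])
# ...and the "writer" reads it back by filtering on that same tag.
shared = store.search("rate limits", tags=["agent:researcher"])
writer_context = "\n".join(m["content"] for m in shared)
```

The resulting `writer_context` string plays the same role as the memory_context value passed to the writer graph above.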

Conversation Summaries with Deduplication Control

When storing rolling summaries that should not be deduplicated (e.g., incremental conversation turns), generate one conversation_id per conversation and attach it to every memory in that conversation. According to docs/agents/langgraph.md, this bypasses deduplication logic for that specific conversation stream.

import uuid

conversation_id = str(uuid.uuid4())

async def store_turn_summary(turn: int, summary: str):
    async with httpx.AsyncClient() as client:
        await client.post(
            f"{MEMORY_URL}/api/memories",
            json={
                "content": f"Turn {turn}: {summary}",
                "tags": ["conversation-summary"],
                "conversation_id": conversation_id,  # bypasses dedup for this convo
                "memory_type": "note",
            },
        )

Use this pattern to maintain full history chains without the service collapsing sequential entries into a single memory.
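
If several conversations run in the same process, a module-level conversation_id becomes awkward. One possible arrangement is to keep the ID and the turn counter together in a small object. This is a sketch: `ConversationSummaries` is a hypothetical helper, and the payload fields are the ones used in store_turn_summary above.

```python
import uuid


class ConversationSummaries:
    """Tracks one conversation's ID and turn counter and builds store payloads."""

    def __init__(self) -> None:
        self.conversation_id = str(uuid.uuid4())
        self.turn = 0

    def next_payload(self, summary: str) -> dict:
        """Return the request body for the next turn summary."""
        self.turn += 1
        return {
            "content": f"Turn {self.turn}: {summary}",
            "tags": ["conversation-summary"],
            # Every turn of this conversation reuses the same ID.
            "conversation_id": self.conversation_id,
            "memory_type": "note",
        }
```

Each payload can be posted to /api/memories exactly as in store_turn_summary; creating a new ConversationSummaries instance starts a fresh conversation stream.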

Key Implementation Files

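The files in the doobidoo/mcp-memory-service repository referenced in this article are:

  • docs/agents/langgraph.md: the LangGraph integration guide that documents the tool and StateGraph patterns shown above.
  • run_server.py: defines the memory server --http command that starts the service.
  • claude-hooks/utilities/memory-client.js: a unified client that can communicate via either HTTP or MCP.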

Summary

  • MCP Memory Service runs as a standalone HTTP server, providing durable storage that outlives individual LangGraph processes.
  • ReAct integration uses @tool-decorated functions that call the service's /api/memories/search and /api/memories endpoints.
  • StateGraph integration defines explicit nodes (retrieve_memory_node, store_memory_node) for precise control over memory lifecycle.
  • Cross-graph sharing works by querying with shared tags, enabling multi-agent collaboration.
  • Metadata scoping uses agent_id in tags and conversation_id to control retrieval and deduplication behavior.

Frequently Asked Questions

How does MCP Memory Service differ from LangGraph's built-in MemorySaver?

LangGraph's MemorySaver stores state only inside a single StateGraph process, making it ephemeral and isolated to that runtime. MCP Memory Service persists data to disk or database via a REST API, allowing memories to survive process restarts and be accessed by multiple independent graphs simultaneously.

Can multiple LangGraph agents share the same memory store?

Yes. Because all agents communicate with the same HTTP endpoint (typically http://localhost:8000), they can share memories by using common tags in their search and store requests. For example, a researcher agent tagging memories with "agent:researcher" allows a writer agent to retrieve those memories by including the same tag in its search payload.

How do I prevent memory deduplication for ongoing conversations?

Pass a unique conversation_id parameter when storing memories. According to the implementation in docs/agents/langgraph.md, memories sharing the same conversation_id bypass the service's deduplication logic, allowing you to store incremental summaries for each turn without overwriting previous entries in that specific conversation stream.

Is MCP Memory Service compatible with the Model Context Protocol (MCP)?

Yes. While the integration examples above use the REST API for simplicity, the service also exposes an MCP protocol interface. The claude-hooks/utilities/memory-client.js file provides a unified client that can communicate via either HTTP or MCP, though Python-based LangGraph agents typically use the HTTP method shown here for easier async integration with httpx.
