# How to Integrate MCP Memory Service with LangGraph for Persistent Agent State

> Integrate MCP Memory Service with LangGraph for persistent agent state. Wire REST endpoints into agents as LangChain tools or StateGraph nodes, replacing ephemeral memory with durable storage.

- Repository: [doobidoo/mcp-memory-service](https://github.com/doobidoo/mcp-memory-service)
- Tags: how-to-guide
- Published: 2026-02-28

---

**You integrate MCP Memory Service with LangGraph by running the standalone HTTP server and wiring its REST endpoints into your agents as LangChain tools or explicit StateGraph nodes, replacing the ephemeral MemorySaver with durable, cross-graph persistent storage.**

The `doobidoo/mcp-memory-service` repository provides a standalone memory store that exposes a REST API for creating, searching, and managing long-term memories. Routing memory operations through these HTTP endpoints, rather than through LangGraph's in-process `MemorySaver`, keeps state available after the graph process terminates.

## Why LangGraph Needs External Memory Persistence

LangGraph's built-in `MemorySaver` keeps state only in the memory of the process running the graph, making it unsuitable for durable persistence or multi-agent collaboration. MCP Memory Service addresses these limitations by running as a standalone HTTP server backed by SQLite or another database.

**Key differences between LangGraph MemorySaver and MCP Memory Service:**

- **Persistence across runs**: `MemorySaver` is ephemeral (process-bound), while MCP Memory Service writes to disk or a database for durability.
- **Sharing between StateGraphs**: `MemorySaver` isolates state per process; the service enables cross-graph sharing via network requests.
- **Scalability**: `MemorySaver` works in-process only; the service is network-accessible for multi-host deployments.
- **Metadata scoping**: `MemorySaver` offers limited tagging; the service supports rich metadata including `tags`, `conversation_id`, and `memory_type`.

## Start the MCP Memory Service Server

Begin by installing the package and launching the HTTP server. According to the integration guide in [`docs/agents/langgraph.md`](https://github.com/doobidoo/mcp-memory-service/blob/main/docs/agents/langgraph.md), the server exposes a REST API on port 8000 by default.

```bash
pip install mcp-memory-service httpx
MCP_ALLOW_ANONYMOUS_ACCESS=true memory server --http
```

The `memory server --http` command (defined in [`run_server.py`](https://github.com/doobidoo/mcp-memory-service/blob/main/run_server.py)) starts the service. Set `MCP_ALLOW_ANONYMOUS_ACCESS=true` for local development to skip authentication.
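Before starting any agents, it can be useful to confirm that something is actually listening on the service port. The sketch below only checks that a TCP connection succeeds; it does not call any service endpoint. `port_is_open` is a hypothetical helper, and the host and port defaults are the ones assumed throughout this guide.

```python
import socket


def port_is_open(host: str = "localhost", port: int = 8000, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and resolution failures.
        return False
```

Calling `port_is_open()` after launching the server gives a quick yes/no answer; a `False` result usually means the server has not finished starting or is bound to a different port.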

## Method 1: Memory Tools for ReAct Agents

The fastest way to add persistence is by wrapping MCP Memory Service calls as LangChain tools and passing them to LangGraph's prebuilt ReAct agent constructor.

Define `search_memory` and `store_memory` tools that call the service endpoints:

```python
import httpx
from typing import Optional
from langchain_core.tools import tool

MEMORY_URL = "http://localhost:8000"

@tool
async def search_memory(query: str) -> str:
    """Search long-term memory for relevant context."""
    async with httpx.AsyncClient() as client:
        response = await client.post(
            f"{MEMORY_URL}/api/memories/search",
            json={"query": query, "limit": 5},
        )
        response.raise_for_status()  # surface HTTP errors instead of parsing an error body
        memories = response.json().get("memories", [])
        if not memories:
            return "No relevant memories found."
        return "\n".join(f"- {m['content']}" for m in memories)

@tool
async def store_memory(content: str, tags: Optional[list[str]] = None) -> str:
    """Store a new memory for future retrieval."""
    async with httpx.AsyncClient() as client:
        response = await client.post(
            f"{MEMORY_URL}/api/memories",
            json={"content": content, "tags": tags or []},
        )
        response.raise_for_status()
        result = response.json()
        return f"Stored memory: {result.get('content_hash', 'unknown')}"
```
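The string a tool returns is what the LLM actually sees, so it can help to keep the formatting in one small function that tolerates missing fields and trims long entries. The following is a minimal sketch: `format_memories` and its `max_chars` parameter are hypothetical, and the payload shape (a `memories` list of objects with a `content` field) is taken from the snippets in this guide rather than from the service's schema.

```python
from typing import Any


def format_memories(payload: dict[str, Any], max_chars: int = 300) -> str:
    """Turn a search response payload into a bullet list for the LLM."""
    memories = payload.get("memories") or []
    lines = []
    for memory in memories:
        content = str(memory.get("content", "")).strip()
        if not content:
            continue  # skip entries without usable text
        if len(content) > max_chars:
            # Trim long memories so a single entry cannot dominate the prompt.
            content = content[:max_chars].rstrip() + "..."
        lines.append(f"- {content}")
    return "\n".join(lines) if lines else "No relevant memories found."
```

Inside a search tool, the final lines could then reduce to `return format_memories(response.json())`.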

Pass these tools to `create_react_agent` as shown in [`docs/agents/langgraph.md`](https://github.com/doobidoo/mcp-memory-service/blob/main/docs/agents/langgraph.md):

```python
from langgraph.prebuilt import create_react_agent
from langchain_anthropic import ChatAnthropic

llm = ChatAnthropic(model="claude-sonnet-4-6")
agent = create_react_agent(
    llm,
    tools=[search_memory, store_memory],
    # Recent LangGraph releases take the system prompt as `prompt`;
    # older releases call this argument `state_modifier`.
    prompt="You have access to long-term memory. Search memory before answering. Store important findings.",
)
```

With this prompt, the agent is instructed to call `search_memory` before answering and `store_memory` to persist findings; the model still decides when to invoke each tool. Everything it stores is written durably to the MCP Memory Service.

## Method 2: Custom Memory Nodes in StateGraph

For fine-grained control over when and how memory is retrieved or stored, define explicit nodes in a custom `StateGraph`. This pattern, documented in [`docs/agents/langgraph.md`](https://github.com/doobidoo/mcp-memory-service/blob/main/docs/agents/langgraph.md), lets you inject memory into system prompts and tag entries per-agent.

Define your state schema and memory nodes:

```python
import httpx
from typing import TypedDict, Annotated
from langgraph.graph import StateGraph, END
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_anthropic import ChatAnthropic

MEMORY_URL = "http://localhost:8000"

class AgentState(TypedDict):
    messages: Annotated[list, "message history"]
    memory_context: str
    agent_id: str

async def retrieve_memory_node(state: AgentState) -> dict:
    """Pull relevant memories before the LLM call."""
    last_message = state["messages"][-1]
    query = last_message.content if hasattr(last_message, "content") else str(last_message)

    async with httpx.AsyncClient() as client:
        response = await client.post(
            f"{MEMORY_URL}/api/memories/search",
            json={
                "query": query,
                "limit": 5,
                "tags": [f"agent:{state['agent_id']}"],  # per-agent scoping
            },
        )
        memories = response.json().get("memories", [])

    context = (
        "Relevant memory:\n" + "\n".join(f"- {m['content']}" for m in memories)
        if memories else ""
    )
    return {"memory_context": context}

async def llm_node(state: AgentState) -> dict:
    """Call the LLM with the injected memory context."""
    llm = ChatAnthropic(model="claude-sonnet-4-6")
    system_parts = ["You are a helpful assistant."]
    if state.get("memory_context"):
        system_parts.append(state["memory_context"])

    messages = [SystemMessage(content="\n\n".join(system_parts))] + state["messages"]
    response = await llm.ainvoke(messages)
    return {"messages": state["messages"] + [response]}

async def store_memory_node(state: AgentState) -> dict:
    """Persist the LLM's response for later reuse."""
    last_response = state["messages"][-1]
    content = last_response.content if hasattr(last_response, "content") else str(last_response)

    async with httpx.AsyncClient() as client:
        await client.post(
            f"{MEMORY_URL}/api/memories",
            json={
                "content": content[:500],  # truncate to the first 500 characters
                "tags": [f"agent:{state['agent_id']}", "llm-response"],
                "memory_type": "observation",
            },
            headers={"X-Agent-ID": state["agent_id"]},
        )
    return {}
```

Compile the graph with explicit edges:

```python
graph = StateGraph(AgentState)
graph.add_node("retrieve_memory", retrieve_memory_node)
graph.add_node("llm", llm_node)
graph.add_node("store_memory", store_memory_node)

graph.set_entry_point("retrieve_memory")
graph.add_edge("retrieve_memory", "llm")
graph.add_edge("llm", "store_memory")
graph.add_edge("store_memory", END)

agent = graph.compile()
```

Each stored memory carries an `agent:<agent_id>` tag, and `retrieve_memory_node` searches with the same tag, so retrieval is scoped to the agent that wrote the memory.
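To trace how state moves through the three nodes without a running server, an LLM, or LangGraph itself, the sketch below replaces each with a stub. Everything in it is hypothetical scaffolding: `FAKE_STORE`, the three stub nodes, and `run_pipeline`, which imitates the retrieve, llm, store order by merging each node's returned dict into the state. It is a simulation for reasoning about the data flow, not how LangGraph executes graphs internally.

```python
import asyncio

# In-memory stand-in for the memory service (assumption for illustration only).
FAKE_STORE: list[dict] = [
    {"content": "Rate limit is 100 requests/min", "tags": ["agent:researcher"]},
]


async def retrieve_stub(state: dict) -> dict:
    """Stand-in for retrieve_memory_node: filter the fake store by agent tag."""
    tag = f"agent:{state['agent_id']}"
    hits = [m["content"] for m in FAKE_STORE if tag in m["tags"]]
    context = ("Relevant memory:\n" + "\n".join(f"- {h}" for h in hits)) if hits else ""
    return {"memory_context": context}


async def llm_stub(state: dict) -> dict:
    """Stand-in for llm_node: reply differently depending on available context."""
    reply = "answer using memory" if state["memory_context"] else "answer without memory"
    return {"messages": state["messages"] + [reply]}


async def store_stub(state: dict) -> dict:
    """Stand-in for store_memory_node: append the last message to the fake store."""
    FAKE_STORE.append({
        "content": state["messages"][-1][:500],
        "tags": [f"agent:{state['agent_id']}", "llm-response"],
    })
    return {}  # an empty update leaves the state unchanged


async def run_pipeline(state: dict) -> dict:
    """Run the stubs in graph order, overwriting state keys with each partial update."""
    for node in (retrieve_stub, llm_stub, store_stub):
        state = {**state, **(await node(state))}
    return state


final = asyncio.run(run_pipeline(
    {"messages": ["What is the rate limit?"], "memory_context": "", "agent_id": "researcher"}
))
print(final["messages"][-1])
```

The point of the exercise is that each node returns only the keys it changes, and a node that returns `{}` (like the store step) has side effects outside the graph state only.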

## Cross-Graph Memory Sharing

Because all graphs communicate with the same HTTP endpoint, you can share memories between different StateGraph instances by agreeing on tags. As described in [`docs/agents/langgraph.md`](https://github.com/doobidoo/mcp-memory-service/blob/main/docs/agents/langgraph.md), one agent can write memories that another reads.

```python
# Run inside an async function. `researcher_agent` is assumed to be compiled
# from the Method 2 graph; `writer_agent` is assumed to be a graph whose
# retrieve step keeps the `memory_context` passed in, because the Method 2
# retrieve node overwrites that field.

# Researcher graph stores memories tagged "agent:researcher"
researcher_result = await researcher_agent.ainvoke({
    "messages": [HumanMessage(content="Research API rate limits")],
    "memory_context": "",
    "agent_id": "researcher",
})

# Writer graph retrieves those memories via the shared tag
async with httpx.AsyncClient() as client:
    response = await client.post(
        f"{MEMORY_URL}/api/memories/search",
        json={"query": "API rate limits", "tags": ["agent:researcher"]},
    )
    shared_context = response.json().get("memories", [])

writer_result = await writer_agent.ainvoke({
    "messages": [HumanMessage(content="Write a summary of API limits")],
    "memory_context": "\n".join(m["content"] for m in shared_context),
    "agent_id": "writer",
})
```

This pattern enables multi-agent pipelines where specialized graphs contribute to a collective knowledge base.
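Sharing only works if the writing graph and the reading graph produce exactly the same tag string. One way to avoid drift is to keep the convention in a single place. The helpers below are a sketch under that assumption: `agent_tag`, `store_payload`, and `search_payload` are hypothetical names, lowercasing the ID is a design choice made here, and the payload fields mirror the ones used in this guide.

```python
def agent_tag(agent_id: str) -> str:
    """Build the `agent:<id>` tag used in this guide to scope memories."""
    agent_id = agent_id.strip().lower()
    if not agent_id or ":" in agent_id:
        raise ValueError(f"invalid agent id: {agent_id!r}")
    return f"agent:{agent_id}"


def store_payload(content: str, agent_id: str, extra_tags: tuple[str, ...] = ()) -> dict:
    """JSON body for storing a memory written by `agent_id`."""
    return {"content": content, "tags": [agent_tag(agent_id), *extra_tags]}


def search_payload(query: str, source_agent_id: str, limit: int = 5) -> dict:
    """JSON body for searching memories written by `source_agent_id`."""
    return {"query": query, "limit": limit, "tags": [agent_tag(source_agent_id)]}
```

With these in place, the researcher would post `store_payload(text, "researcher")` and the writer would post `search_payload("API rate limits", "researcher")`, so both sides derive the tag from the same function.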

## Conversation Summaries with Deduplication Control

When storing rolling summaries that should not be deduplicated (e.g., incremental conversation turns), generate one `conversation_id` per conversation and attach it to every memory in that conversation. According to [`docs/agents/langgraph.md`](https://github.com/doobidoo/mcp-memory-service/blob/main/docs/agents/langgraph.md), this bypasses deduplication logic for that specific conversation stream.

```python
import uuid

conversation_id = str(uuid.uuid4())

async def store_turn_summary(turn: int, summary: str):
    async with httpx.AsyncClient() as client:
        await client.post(
            f"{MEMORY_URL}/api/memories",
            json={
                "content": f"Turn {turn}: {summary}",
                "tags": ["conversation-summary"],
                "conversation_id": conversation_id,  # bypasses dedup for this conversation
                "memory_type": "note",
            },
        )
```

Use this pattern to maintain full history chains without the service collapsing sequential entries into a single memory.
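When several conversations run in the same process, a module-level `conversation_id` is easy to reuse by accident. A small object that owns both the ID and the turn counter keeps them together. This is a sketch only: `ConversationLog` and `next_payload` are hypothetical names, and the payload fields are the ones shown in this guide.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class ConversationLog:
    """Tracks one conversation's ID and turn number for summary payloads."""
    conversation_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    turn: int = 0

    def next_payload(self, summary: str) -> dict:
        """Advance the turn counter and build the JSON body for this turn's summary."""
        self.turn += 1
        return {
            "content": f"Turn {self.turn}: {summary}",
            "tags": ["conversation-summary"],
            "conversation_id": self.conversation_id,  # same ID for every turn
            "memory_type": "note",
        }
```

Each new conversation creates its own `ConversationLog()`, and the dict returned by `next_payload` can be passed as the `json=` argument of the POST request.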

## Key Implementation Files

Reference these files in the `doobidoo/mcp-memory-service` repository when building your integration:

- [`docs/agents/langgraph.md`](https://github.com/doobidoo/mcp-memory-service/blob/main/docs/agents/langgraph.md) – Complete integration guide with all code snippets referenced above.
- [`run_server.py`](https://github.com/doobidoo/mcp-memory-service/blob/main/run_server.py) – Entry point for the HTTP server (`memory server --http`).
- [`claude-hooks/utilities/memory-client.js`](https://github.com/doobidoo/mcp-memory-service/blob/main/claude-hooks/utilities/memory-client.js) – Unified client supporting both HTTP and MCP protocol connections.
- [`pyproject.toml`](https://github.com/doobidoo/mcp-memory-service/blob/main/pyproject.toml) – Dependency declarations for `mcp-memory-service` and `httpx`.
- [`examples/setup/setup_multi_client_complete.py`](https://github.com/doobidoo/mcp-memory-service/blob/main/examples/setup/setup_multi_client_complete.py) – Demonstrates scaling with multiple concurrent clients.

## Summary

- **MCP Memory Service** runs as a standalone HTTP server, providing durable storage that outlives individual LangGraph processes.
- **ReAct integration** uses `@tool` decorated functions calling the service's `/api/memories/search` and `/api/memories` endpoints.
- **StateGraph integration** defines explicit nodes (`retrieve_memory_node`, `store_memory_node`) for precise control over memory lifecycle.
- **Cross-graph sharing** works by querying with shared tags, enabling multi-agent collaboration.
- **Metadata scoping** uses `agent_id` in tags and `conversation_id` to control retrieval and deduplication behavior.

## Frequently Asked Questions

### How does MCP Memory Service differ from LangGraph's built-in MemorySaver?

LangGraph's `MemorySaver` stores state only inside a single StateGraph process, making it ephemeral and isolated to that runtime. MCP Memory Service persists data to disk or database via a REST API, allowing memories to survive process restarts and be accessed by multiple independent graphs simultaneously.

### Can multiple LangGraph agents share the same memory store?

Yes. Because all agents communicate with the same HTTP endpoint (typically `http://localhost:8000`), they can share memories by using common tags in their search and store requests. For example, a researcher agent tagging memories with `"agent:researcher"` allows a writer agent to retrieve those memories by including the same tag in its search payload.

### How do I prevent memory deduplication for ongoing conversations?

Pass a `conversation_id` that is unique to the conversation when storing memories. According to [`docs/agents/langgraph.md`](https://github.com/doobidoo/mcp-memory-service/blob/main/docs/agents/langgraph.md), memories sharing the same `conversation_id` bypass the service's deduplication logic, so you can store an incremental summary for each turn without overwriting previous entries in that conversation stream.

### Is MCP Memory Service compatible with the Model Context Protocol (MCP)?

Yes. While the integration examples above use the REST API for simplicity, the service also exposes an MCP protocol interface. The [`claude-hooks/utilities/memory-client.js`](https://github.com/doobidoo/mcp-memory-service/blob/main/claude-hooks/utilities/memory-client.js) file provides a unified client that can communicate via either HTTP or MCP, though Python-based LangGraph agents typically use the HTTP method shown here for easier async integration with `httpx`.