# X-Agent-ID Scoping Strategy Performance Implications in MCP Memory Service

> Understand X-Agent-ID scoping strategy performance in MCP Memory Service. Discover how comma-separated tags impact retrieval speed and learn about its negligible write-time overhead.

- Repository: [doobidoo/mcp-memory-service](https://github.com/doobidoo/mcp-memory-service)
- Tags: performance
- Published: 2026-02-28

---

**The X-Agent-ID header introduces negligible write-time overhead but forces linear scan behavior during retrieval because agent tags are stored as comma-separated text without database indexing.**

The **X-Agent-ID** header in the `doobidoo/mcp-memory-service` repository provides automatic namespace isolation for multi-agent deployments. This scoping strategy appends an `agent:<id>` tag to every memory at ingestion time, creating logical partitions without separate database schemas. Understanding the performance characteristics of this approach is critical for deployments handling millions of memories across multiple agents.

## Write-Time Performance Impact

### Automatic Tag Injection and Deduplication

Header processing happens in-process within the `POST /api/memories` endpoint, before the call to `MemoryService.store_memory`. When the middleware detects an **X-Agent-ID** header, it appends the `agent:<id>` namespace tag (defined as `NAMESPACE_AGENT` in [`src/mcp_memory_service/models/tag_taxonomy.py`](https://github.com/doobidoo/mcp-memory-service/blob/main/src/mcp_memory_service/models/tag_taxonomy.py)) to the request's tag list. The operation is a single string concatenation plus an `in` check to prevent duplicates. The check is linear in the number of tags on the request, which is small in practice, so the cost is effectively **O(1)** with respect to dataset size and adds minimal memory overhead.

The deduplication logic lives in [`src/mcp_memory_service/web/api/memories.py`](https://github.com/doobidoo/mcp-memory-service/blob/main/src/mcp_memory_service/web/api/memories.py) (lines 58-65). Before appending, the code verifies that the `agent:` prefix does not already exist in the tag array, ensuring idempotent behavior even if clients manually include the agent identifier. The unit test suite validates this behavior in [`tests/web/api/test_memories_api.py`](https://github.com/doobidoo/mcp-memory-service/blob/main/tests/web/api/test_memories_api.py) (lines 36-52).
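The injection step described above can be sketched as follows. This is a minimal illustration rather than the repository's code: the helper name `inject_agent_tag` is hypothetical, the `NAMESPACE_AGENT` value of `"agent:"` is assumed from the tag format shown in this article, and the duplicate check here matches the exact tag.

```python
from typing import List, Optional

# Assumed value, inferred from the `agent:<id>` tag format; not copied from the repository.
NAMESPACE_AGENT = "agent:"


def inject_agent_tag(tags: List[str], agent_id: Optional[str]) -> List[str]:
    """Return a copy of `tags` with `agent:<id>` appended if it is not already present."""
    if not agent_id:
        # No X-Agent-ID header: leave the tags unchanged.
        return list(tags)
    agent_tag = f"{NAMESPACE_AGENT}{agent_id}"
    if agent_tag in tags:
        # Linear in the number of tags on this request, independent of dataset size.
        return list(tags)
    return [*tags, agent_tag]


print(inject_agent_tag(["api", "rate-limit"], "researcher"))
print(inject_agent_tag(["api", "agent:researcher"], "researcher"))
```

The second call shows the idempotent case: a client that already sent the agent tag gets the same list back, with no duplicate added.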

When storing a memory with automatic agent scoping:

```python
import requests

# The X-Agent-ID header causes the service to add `agent:researcher`
response = requests.post(
    "http://localhost:8000/api/memories",
    json={"content": "Findings about rate limits", "tags": ["api", "rate-limit"]},
    headers={"X-Agent-ID": "researcher"},
)
print(response.json()["memory"]["tags"])
# → ["api", "rate-limit", "agent:researcher"]
```

## Read Performance Characteristics

### GLOB Pattern Matching and Full Table Scans

Tags are persisted as a **comma-separated text field** across all storage backends (SQLite-Vec, Hybrid, Cloudflare). In [`src/mcp_memory_service/storage/sqlite_vec.py`](https://github.com/doobidoo/mcp-memory-service/blob/main/src/mcp_memory_service/storage/sqlite_vec.py) (lines 28-46), the `search_by_tag` function constructs a **GLOB** pattern query to locate specific agent memories:

```sql
(',' || REPLACE(tags, ' ', '') || ',') GLOB '*,agent:researcher,*'
```

This pattern matching enables substring searches within the concatenated tag string but prevents the database engine from utilizing traditional B-Tree indexes on individual tag values. Because the **tags** column contains comma-delimited strings rather than normalized values, the `WHERE` clause cannot leverage indexed lookups.

### Linear Scan Behavior

The query engine must perform a **full table scan** (or at minimum, a scan of all rows matching other indexed predicates like `deleted_at IS NULL`). Performance degrades linearly with the total row count in the memories table, regardless of the specific `agent:` value being queried. A query for `agent:researcher` scans the same number of rows as a query for `agent:analyst` or any other tag.
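The scan behavior can be reproduced with Python's built-in `sqlite3` module. The sketch below uses a simplified, hypothetical `memories` table (not the service's actual schema) and runs the same GLOB predicate, then asks SQLite for its query plan.

```python
import sqlite3

# Simplified stand-in schema; the real table in the service has more columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE memories (id INTEGER PRIMARY KEY, content TEXT, tags TEXT)")
conn.executemany(
    "INSERT INTO memories (content, tags) VALUES (?, ?)",
    [
        ("Findings about rate limits", "api, rate-limit, agent:researcher"),
        ("Quarterly numbers", "finance,agent:analyst"),
        ("Retry policy notes", "api,agent:researcher"),
    ],
)

# Same predicate shape as the GLOB expression shown earlier in this section.
query = (
    "SELECT content FROM memories "
    "WHERE (',' || REPLACE(tags, ' ', '') || ',') GLOB ?"
)
pattern = "*,agent:researcher,*"

matches = [row[0] for row in conn.execute(query, (pattern,))]
# The last column of each EXPLAIN QUERY PLAN row is the human-readable detail.
plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query, (pattern,))]

print(matches)
print(plan)
```

The plan detail should report a `SCAN` of the table rather than a `SEARCH` using an index (the exact wording varies between SQLite versions), because the predicate wraps the `tags` column in an expression that no ordinary index can serve.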

### Application-Level Filtering Benefits

While the database scan remains exhaustive, the `agent:` namespace logically partitions result sets. When retrieving only the researcher’s memories:

```python
import requests

# Explicitly filter on the agent tag
response = requests.post(
    "http://localhost:8000/api/search/by-tag",
    json={"tags": ["agent:researcher"]},
)
for mem in response.json()["memories"]:
    print(mem["content"])
```

The server still scans all rows, but the in-memory filtering step discards non-matching records before serialization. This reduces network transfer and deserialization overhead when an agent's memories represent a small fraction of the global dataset.

## Storage Overhead and Scalability Limits

### Byte-Level Overhead

Adding the `agent:` tag increases row size by only a few bytes per memory. Even at millions of rows, this overhead remains negligible compared to the storage requirements for content payloads and vector embeddings (typically 384-1536 dimensions of floating-point data).
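A rough back-of-the-envelope comparison makes the scale concrete. The calculation below assumes 4-byte (float32) embedding components and an example agent ID of `researcher`; both are illustrative assumptions, not measurements from the service.

```python
# Illustrative assumptions: float32 embeddings and a short agent ID.
BYTES_PER_FLOAT32 = 4
agent_tag = "agent:researcher"

# One extra comma separator plus the tag text itself.
tag_overhead_bytes = len(("," + agent_tag).encode("utf-8"))


def embedding_bytes(dimensions: int) -> int:
    """Raw size of one embedding vector, ignoring any storage-format overhead."""
    return dimensions * BYTES_PER_FLOAT32


for dims in (384, 1536):
    ratio = tag_overhead_bytes / embedding_bytes(dims)
    print(f"{dims} dims: {embedding_bytes(dims)} B embedding, "
          f"{tag_overhead_bytes} B tag overhead ({ratio:.2%})")
```

Under these assumptions the tag adds 17 bytes against 1,536 to 6,144 bytes of embedding data per row, roughly one percent or less before content payloads are even counted.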

### Indexing Strategies for Scale

For high-volume deployments, the current scan-based approach creates a bottleneck as memory counts grow. Implementing a **dedicated tag index**—such as a many-to-many relationship table or SQLite FTS5/JSON1 extensions—would convert the linear scan into an indexed lookup. This optimization would eliminate the primary performance constraint of the X-Agent-ID scoping strategy for large-scale installations.
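One way to picture the many-to-many option is a separate tag table with an index on the tag value. The sketch below is a hypothetical schema for illustration only; the table, index, and helper names (`memory_tags`, `idx_memory_tags_tag`, `store`) are assumptions and do not exist in the repository.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE memories (id INTEGER PRIMARY KEY, content TEXT);
    -- One row per (memory, tag) pair instead of a comma-separated string.
    CREATE TABLE memory_tags (
        memory_id INTEGER NOT NULL REFERENCES memories(id),
        tag TEXT NOT NULL
    );
    CREATE INDEX idx_memory_tags_tag ON memory_tags (tag, memory_id);
    """
)


def store(content, tags):
    """Insert a memory and one memory_tags row per tag (hypothetical helper)."""
    cur = conn.execute("INSERT INTO memories (content) VALUES (?)", (content,))
    conn.executemany(
        "INSERT INTO memory_tags (memory_id, tag) VALUES (?, ?)",
        [(cur.lastrowid, tag) for tag in tags],
    )


store("Findings about rate limits", ["api", "rate-limit", "agent:researcher"])
store("Quarterly numbers", ["finance", "agent:analyst"])
store("Retry policy notes", ["api", "agent:researcher"])

query = (
    "SELECT m.content FROM memory_tags AS t "
    "JOIN memories AS m ON m.id = t.memory_id "
    "WHERE t.tag = ?"
)
matches = sorted(row[0] for row in conn.execute(query, ("agent:researcher",)))
plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query, ("agent:researcher",))]

print(matches)
print(plan)
```

Here the equality predicate on `tag` can be answered from the index, so the plan should mention `idx_memory_tags_tag` instead of a full scan of the tag data. The trade-off is extra rows and an additional write per tag at ingestion time.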

## Summary

- **Write latency** remains essentially unchanged at O(1) complexity with minimal CPU and memory overhead.
- **Read latency** scales linearly with total memory count due to comma-separated tag storage and GLOB pattern matching without index support.
- **Storage overhead** is trivial, adding only a few bytes per row regardless of dataset size.
- **Scalability limits** require attention for high-scale deployments; dedicated tag indexing is recommended once the memory count is large enough that full table scans dominate query latency.

## Frequently Asked Questions

### Does X-Agent-ID slow down memory ingestion?

No. Header processing adds a single string concatenation and a duplicate check before calling `MemoryService.store_memory`. This effectively O(1) operation has negligible impact on ingestion throughput. The deduplication behavior itself is covered by the tests in [`tests/web/api/test_memories_api.py`](https://github.com/doobidoo/mcp-memory-service/blob/main/tests/web/api/test_memories_api.py) (lines 36-52).

### Why are agent-scoped queries slow on large datasets?

The **tags** column stores comma-separated values as plain text, forcing the database to execute GLOB pattern matching across all rows. Without a separate index table or FTS extension, SQLite cannot perform indexed lookups on individual tags, resulting in full table scans that scale linearly with the total number of memories stored.

### Can I improve query performance without changing the schema?

Partially. While you cannot eliminate the underlying scan without schema modifications, you can reduce application-level latency by ensuring all retrieval requests include explicit `tags=["agent:<id>"]` filters. This minimizes the result set transferred to the client when the agent's data represents a small subset of the total repository.

### What is the recommended approach for high-scale multi-agent deployments?

For production systems handling millions of memories, migrate to a dedicated tag indexing strategy. Implementing a many-to-many relationship table between memories and tags, or utilizing SQLite's JSON1/FTS5 extensions, would enable sub-linear query performance and eliminate the current scan bottleneck described in [`src/mcp_memory_service/storage/sqlite_vec.py`](https://github.com/doobidoo/mcp-memory-service/blob/main/src/mcp_memory_service/storage/sqlite_vec.py).