How to Set Up Headroom MCP Server for Retrieval Tools: Complete Setup Guide

Run pip install "headroom-ai[mcp]", register the server with headroom mcp install, and start it with headroom mcp serve to expose headroom_retrieve, headroom_compress, and headroom_stats to any MCP-compatible host.

The chopratejas/headroom repository provides a lightweight, stdio-based MCP server that surfaces retrieval tools to AI coding assistants such as Claude Code, Cursor, and Codex. In headroom/ccr/mcp_server.py, the HeadroomMCPServer class orchestrates three tools while maintaining a local CompressionStore and an optional proxy fallback. Setting up the Headroom MCP server for retrieval tools requires only the Python package, a one-time host registration, and launching the server process.

Install the Headroom MCP Package

Headroom ships the MCP server as an optional extra. You can install the lightweight toolset alone or alongside the full proxy stack.

  1. Install only the MCP tools:
pip install "headroom-ai[mcp]"
  2. Or install the proxy and MCP tools together:
pip install "headroom-ai[proxy]"

The headroom-ai[mcp] extra includes everything required to run headroom mcp serve and register the server with your host, as documented in wiki/mcp.md.

Register the Server with Claude Code

Registration is a one-time step that writes the MCP server configuration to your host's config file.

headroom mcp install

According to headroom/cli/mcp.py, the install sub-command writes a JSON entry into Claude Code's ~/.claude/mcp.json that points to headroom mcp serve. The registrar tests, such as tests/test_mcp_registry/test_codex_registrar.py for the Codex host, verify the configuration format each registrar writes.
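To make the registration step more concrete, here is a minimal sketch of the kind of entry a host config could hold and how it might be merged without disturbing other servers. The mcpServers layout and the helper names build_server_entry and merge_entry are assumptions for illustration; the exact structure that headroom mcp install writes is defined in headroom/cli/mcp.py and may differ.

```python
import json


def build_server_entry(name: str = "headroom") -> dict:
    """Return a hypothetical host entry that launches `headroom mcp serve`."""
    return {
        "mcpServers": {
            name: {
                "command": "headroom",
                "args": ["mcp", "serve"],
            }
        }
    }


def merge_entry(existing: dict, entry: dict) -> dict:
    """Add the new server to an existing config, keeping other servers intact."""
    merged = dict(existing)
    servers = dict(merged.get("mcpServers", {}))
    servers.update(entry["mcpServers"])
    merged["mcpServers"] = servers
    return merged


if __name__ == "__main__":
    # Assumed pre-existing config with one unrelated server.
    current = {"mcpServers": {"other-tool": {"command": "other", "args": []}}}
    print(json.dumps(merge_entry(current, build_server_entry()), indent=2))
```

Merging rather than overwriting matters because a host config usually lists several MCP servers, and a registration step should leave the others untouched.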

After registration, Claude Code automatically launches the server on startup and routes headroom_retrieve calls to it.

Start the Headroom MCP Server

Normally, the AI host starts the server automatically. For debugging or manual use, run it directly:

headroom mcp serve --debug

In headroom/cli/mcp.py (lines 330–352), the serve sub-command creates a HeadroomMCPServer via the create_ccr_mcp_server factory and calls run_stdio(). This opens a stdio transport and begins listening for JSON-RPC tool calls.
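For readers who want to see what travels over the stdio transport, the sketch below builds the JSON-RPC 2.0 message that an MCP host sends for a tool call. The tools/call method and the name/arguments parameters come from the MCP specification; the helper name make_tool_call is hypothetical, and a real session would first complete the initialize handshake, which is omitted here.

```python
import json


def make_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a single line of JSON-RPC 2.0."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    # The stdio transport is newline-delimited, so the payload must stay on one line.
    return json.dumps(message)


if __name__ == "__main__":
    print(make_tool_call(1, "headroom_retrieve", {"hash": "a1b2c3d4e5f6"}))
```

In practice the host or the MCP SDK produces these messages for you; the sketch is only meant to show the shape of a request that reaches the server.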

You can also verify your setup without launching a full session:

headroom mcp status

This command reports SDK availability, Claude config path, proxy URL, and proxy health, implemented in headroom/cli/mcp.py around lines 164–168.
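If you want to approximate two of these checks from your own script, the following sketch tests whether the MCP SDK is importable and whether something is listening at the proxy address. It is not the implementation behind headroom mcp status: the function names are hypothetical, and a plain TCP connect only shows that the port is open, not that the proxy is healthy.

```python
import importlib.util
import socket
from urllib.parse import urlparse


def sdk_available(module_name: str = "mcp") -> bool:
    """Return True if the named top-level module can be imported."""
    return importlib.util.find_spec(module_name) is not None


def proxy_reachable(
    proxy_url: str = "http://127.0.0.1:8787", timeout: float = 0.5
) -> bool:
    """Return True if a TCP connection to the proxy host and port succeeds."""
    parsed = urlparse(proxy_url)
    host = parsed.hostname or "127.0.0.1"
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    print("MCP SDK importable:", sdk_available())
    print("Proxy port open:", proxy_reachable())
```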

Add a Proxy for Fallback Retrieval

Running the MCP server standalone keeps compressed content in a local CompressionStore. If you also run the Headroom proxy, the server falls back to the proxy when a requested hash is missing locally.

Start the proxy in a separate terminal:

headroom proxy

By default, the proxy listens at http://127.0.0.1:8787. The MCP server's _retrieve_content method queries the local store first; on a miss, it delegates to _retrieve_via_proxy, which calls the proxy's /v1/retrieve endpoint, provided check_proxy=True, httpx is available, and the proxy is reachable. This dual-mode design lets the same toolset work with or without the full proxy stack.
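The local-first, proxy-second order can be sketched in a few lines. This is an illustration of the pattern rather than Headroom's code: LocalStore is a hypothetical stand-in for CompressionStore, and fetch_from_proxy is an injected callable standing in for the HTTP call to the proxy, so the example runs without a network.

```python
from typing import Callable, Optional


class LocalStore:
    """Hypothetical stand-in for the local store: hash -> original content."""

    def __init__(self) -> None:
        self._items: dict[str, str] = {}

    def put(self, hash_key: str, original: str) -> None:
        self._items[hash_key] = original

    def get(self, hash_key: str) -> Optional[str]:
        return self._items.get(hash_key)


def retrieve_content(
    hash_key: str,
    store: LocalStore,
    fetch_from_proxy: Optional[Callable[[str], Optional[str]]] = None,
    check_proxy: bool = True,
) -> tuple[Optional[str], str]:
    """Return (content, source), where source is 'local', 'proxy', or 'missing'."""
    content = store.get(hash_key)
    if content is not None:
        return content, "local"
    # Only consult the proxy when enabled and a fetcher was supplied.
    if check_proxy and fetch_from_proxy is not None:
        content = fetch_from_proxy(hash_key)
        if content is not None:
            return content, "proxy"
    return None, "missing"


if __name__ == "__main__":
    store = LocalStore()
    store.put("abc", "original text")
    fake_proxy = {"def": "text held by the proxy"}.get
    for key in ("abc", "def", "zzz"):
        print(key, retrieve_content(key, store, fake_proxy))
```

Passing the proxy lookup in as a callable keeps the fallback optional, which mirrors how the server can run with or without the proxy.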

How Retrieval Works Under the Hood

The headroom_retrieve tool handles lookups through a two-tier pipeline defined in headroom/ccr/mcp_server.py.

Local lookup first. The server lazily creates a CompressionStore (in _get_local_store) that holds original-to-compressed pairs with a default MCP_SESSION_TTL = 3600 seconds. When a host sends a headroom_retrieve call with a hash, _handle_retrieve (lines 443–468) checks this store first.

Proxy fallback. If the hash is absent and proxy checks are enabled, the server queries the Headroom proxy. The _retrieve_content and _retrieve_via_proxy methods handle the network round-trip, returning either the original content or a 404 to the client.

Shared statistics. Every compression and retrieval appends a JSON line to ${HEADROOM_WORKSPACE_DIR}/session_stats.jsonl through _append_shared_event. Because all MCP instances—including sub-agent processes—write to the same file, headroom_stats can aggregate cross-process metrics.
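The shared-file approach to statistics can be illustrated with a small append-and-aggregate sketch. The record fields (ts, pid, event) and the function names are assumptions for illustration; the actual schema written by _append_shared_event is not shown in this article and may differ.

```python
import json
import os
import tempfile
import time
from collections import Counter
from pathlib import Path


def append_event(stats_path: Path, event: str, **fields) -> None:
    """Append one JSON object per line so several processes can share the file."""
    record = {"ts": time.time(), "pid": os.getpid(), "event": event, **fields}
    stats_path.parent.mkdir(parents=True, exist_ok=True)
    with stats_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


def aggregate(stats_path: Path) -> Counter:
    """Count events by type, skipping blank or malformed lines."""
    counts: Counter = Counter()
    if not stats_path.exists():
        return counts
    with stats_path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            try:
                counts[json.loads(line)["event"]] += 1
            except (json.JSONDecodeError, KeyError, TypeError):
                continue  # tolerate partially written or unexpected records
    return counts


if __name__ == "__main__":
    # Use a temporary directory so the example never touches a real workspace.
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "session_stats.jsonl"
        append_event(path, "compress", tokens_saved=120)
        append_event(path, "retrieve", hash="a1b2c3d4e5f6")
        print(dict(aggregate(path)))
```

A line-oriented file suits this use case because each writer only appends, and a reader can ignore a damaged line without losing the rest. The sketch adds no file locking, which a production implementation might need.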

Programmatically Create the Server

For custom scripts or tests, instantiate the server directly without the CLI:

import asyncio

from headroom.ccr.mcp_server import create_ccr_mcp_server

async def main() -> None:
    # Use the default proxy URL or pass your own
    mcp_server = create_ccr_mcp_server(proxy_url="http://127.0.0.1:8787")
    # Run the stdio loop; await only works inside a coroutine
    await mcp_server.run_stdio()

asyncio.run(main())

The create_ccr_mcp_server factory in headroom/ccr/mcp_server.py (lines 886–898) returns a fully configured HeadroomMCPServer ready for run_stdio().

Invoke Retrieval from an MCP Client

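Any MCP-compatible client can call the retrieval tool once the server is registered or launchable. The following sketch uses the MCP Python SDK's stdio client and assumes the headroom executable is on your PATH:

import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def retrieve(hash_key: str) -> None:
    # Spawn `headroom mcp serve` as a child process and talk to it over stdio
    params = StdioServerParameters(command="headroom", args=["mcp", "serve"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()  # MCP handshake must precede tool calls
            result = await session.call_tool("headroom_retrieve", {"hash": hash_key})
            # Tool results arrive as a list of content items; print the text ones
            for item in result.content:
                if item.type == "text":
                    print(item.text)

asyncio.run(retrieve("a1b2c3d4e5f6"))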

The MCP SDK pipes stdin/stdout automatically, so no network code is required. The server receives the request through _handle_retrieve and returns the content sourced from either the local store or the proxy.

Summary

  • Install the server with pip install "headroom-ai[mcp]".
  • Register it once via headroom mcp install to write the host configuration.
  • Launch it with headroom mcp serve or let Claude Code start it automatically.
  • Retrieve content by hash through headroom_retrieve, which checks the local CompressionStore first.
  • Fall back to a running headroom proxy at http://127.0.0.1:8787 when the hash is not stored locally.
  • Monitor usage through the shared ${HEADROOM_WORKSPACE_DIR}/session_stats.jsonl file consumed by headroom_stats.

Frequently Asked Questions

What ports or network exposure does the MCP server require?

None. The HeadroomMCPServer is a stdio-only process. It communicates over stdin/stdout with the host, and any proxy communication happens over localhost HTTP to the optional headroom proxy. No inbound ports are opened by the MCP server itself.

Can I use Headroom retrieval without installing the proxy?

Yes. The MCP server stores compressed content in a local CompressionStore (managed in headroom/cache/compression_store.py) and serves retrieval requests directly. The proxy fallback in _retrieve_via_proxy is optional and only triggers when check_proxy=True, httpx is installed, and a proxy is reachable.

Where does the MCP server store session data?

It writes session events to ${HEADROOM_WORKSPACE_DIR}/session_stats.jsonl. All server instances append to this file, enabling headroom_stats to aggregate telemetry across parent and sub-agent processes. Compressed originals are kept in the local CompressionStore with a TTL defined by MCP_SESSION_TTL (3600 seconds by default).

How do I uninstall the MCP server from Claude Code?

Run headroom mcp uninstall. This reverses the registration step by removing the Headroom entry from the host's MCP configuration file, as implemented in headroom/cli/mcp.py.
