
How to Set Up Headroom as an MCP Server with headroomcompress and headroomstats

June 5, 2026 chopratejas/headroom

To run Headroom in MCP mode, install the Python package, build the Rust binaries with cargo build --release --bins, then start the server with headroom server --mode mcp --compress-bin <path> --stats-bin <path>.

Headroom is an open-source project (chopratejas/headroom) that can run as a Model-Cache-Provider (MCP) server, delegating text compression and cache statistics to two optimized Rust binaries. Setting up the server involves three steps: installing the Python package that provides the HTTP API, compiling the helper binaries, and launching the unified CLI. This guide walks through each step using the source files and commands found in the repository.

Install the Python Package

The Python side of Headroom provides the headroom CLI entry-point and the HTTP server that orchestrates requests. Install the package in a virtual environment using editable mode:

```
python -m venv .venv
source .venv/bin/activate
pip install -e .
```

The entry point (headroom.server) is defined in pyproject.toml. Installing the package registers the headroom command, which is then available whenever the virtual environment is active.
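Before moving on, it can help to confirm that the console script is actually visible. The following is a minimal sketch that only checks whether a command named headroom can be found on PATH; the function name find_cli is made up for illustration and is not part of the Headroom package.

```python
import shutil
from typing import Optional


def find_cli(name: str = "headroom") -> Optional[str]:
    """Return the absolute path of `name` on PATH, or None if it is not found."""
    return shutil.which(name)


path = find_cli()
print(path or "headroom not found; is the virtual environment active?")
```

If the lookup fails, re-activate the virtual environment and rerun pip install -e . before continuing.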

Build the Rust Helper Binaries

Headroom ships two Rust binary targets that compile into standalone command-line programs. The build configuration is declared in Cargo.toml at the repository root.

headroomcompress — Source logic lives in src/bin/compress.rs. Build it with:
```
cargo build --release --bin headroomcompress
```
headroomstats — Source logic lives in src/bin/stats.rs. Build it with:
```
cargo build --release --bin headroomstats
```

Both executables are written to target/release/. Add this directory to your $PATH, or provide absolute paths when starting the server in the next step.
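A quick way to confirm the build produced both executables is to check the output directory from Python. This is a sketch under the assumption of a Unix-like system, where the files carry no .exe suffix; the helper name missing_binaries is hypothetical.

```python
import os
from pathlib import Path

# Binary names taken from the build commands above.
BINARIES = ("headroomcompress", "headroomstats")


def missing_binaries(release_dir, names=BINARIES):
    """Return the names that are absent or not executable in `release_dir`."""
    release = Path(release_dir)
    return [
        name
        for name in names
        if not ((release / name).is_file() and os.access(release / name, os.X_OK))
    ]


# An empty list means both helpers are ready to use.
print(missing_binaries("target/release"))
```

Run it from the repository root, since the path is relative.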

Launch the Headroom MCP Server

Start the server in MCP mode by passing the paths to both binaries. The critical flags are:

--mode mcp — Enables MCP server behavior.
--compress-bin — Path to the headroomcompress executable.
--stats-bin — Path to the headroomstats executable.
--port — Optional listening port; defaults to 8000.

```
headroom server \
  --mode mcp \
  --compress-bin ./target/release/headroomcompress \
  --stats-bin ./target/release/headroomstats \
  --port 8000
```

The server binds to http://localhost:8000. Internally, it uses headroom/utils.py to spawn helper processes and headroom/transforms/kompress_compressor.py to decide when to invoke external compression.
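If the server is launched from a script rather than by hand, the argument list can be assembled programmatically. The sketch below uses only the flags listed above and resolves the binary paths to absolute form so the command does not depend on the working directory; server_command is an illustrative name, not a Headroom API.

```python
import shlex
from pathlib import Path


def server_command(compress_bin, stats_bin, port=8000):
    """Build the argv list for `headroom server` in MCP mode."""
    return [
        "headroom", "server",
        "--mode", "mcp",
        "--compress-bin", str(Path(compress_bin).resolve()),
        "--stats-bin", str(Path(stats_bin).resolve()),
        "--port", str(port),
    ]


cmd = server_command("target/release/headroomcompress", "target/release/headroomstats")
print(shlex.join(cmd))  # shell-safe rendering of the command, for logging
```

The returned list can be passed directly to subprocess.Popen, which avoids shell quoting problems with paths that contain spaces.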

Use the headroomcompress Tool

The compressor accepts plain text from standard input or from a file and returns a JSON object with two fields: compressed and compression_ratio.

Command-Line Usage

Pipe text directly into the binary:

```
echo "This is a very long log that can be shortened…" | \
  headroomcompress --format json
```
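The same pipe can be driven from Python. This is a minimal sketch, assuming the binary is on PATH and prints a single JSON object to standard output as described above; compress_text and its argv parameter are illustrative, and the key check reflects only the two fields named in this guide.

```python
import json
import subprocess


def compress_text(text, argv=("headroomcompress", "--format", "json"), timeout=30):
    """Pipe `text` to the compressor on stdin and parse its stdout as JSON."""
    proc = subprocess.run(
        list(argv),
        input=text,
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    result = json.loads(proc.stdout)
    # Fail early if the output lacks the fields this guide describes.
    missing = {"compressed", "compression_ratio"} - result.keys()
    if missing:
        raise ValueError(f"unexpected compressor output, missing: {sorted(missing)}")
    return result
```

Calling compress_text("some long log text") returns the parsed dictionary. The argv parameter makes it easy to point at an absolute path such as ./target/release/headroomcompress.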

HTTP API Usage

You can also call the MCP endpoint directly:

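```
import requests

payload = {
    "content": "Your long text …",
    "context": "optional-context"
}
# A timeout keeps the call from hanging if the server is unreachable
r = requests.post("http://localhost:8000/compress", json=payload, timeout=10)
r.raise_for_status()  # surface HTTP errors before reading the body
result = r.json()
print(result["compressed"])
```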

When a request hits /compress, the server forwards the payload to headroomcompress and returns the parsed JSON result.
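For environments where installing requests is not an option, the same call can be made with the standard library. The sketch below assumes the endpoint and payload shape shown above; the function names are hypothetical, and splitting request construction from sending is only a convenience for inspecting the request without a running server.

```python
import json
import urllib.request


def build_compress_request(content, context=None, base_url="http://localhost:8000"):
    """Build a POST request for /compress without sending it."""
    payload = {"content": content}
    if context is not None:
        payload["context"] = context
    return urllib.request.Request(
        f"{base_url}/compress",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def compress_via_http(content, context=None, timeout=10):
    """Send the request and return the decoded JSON body.

    urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    """
    req = build_compress_request(content, context)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

With the server running, compress_via_http("Your long text")["compressed"] should yield the compressed string.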

Use the headroomstats Tool

The statistics binary aggregates cache-hit data, token usage, and compression ratios. It can run standalone or serve data through the MCP HTTP API.

Command-Line Usage

Dump statistics to a JSON file:

```
headroomstats --dump-json > stats.json
```

HTTP API Usage

Query the live endpoint:

```
curl http://localhost:8000/stats | jq .
```

The returned JSON includes fields such as total_requests, cache_hits, and average_compression_ratio.
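These fields are enough to derive a cache hit rate. The sketch below assumes that cache_hits counts a subset of total_requests, which this guide does not state explicitly; the sample values are invented purely for illustration.

```python
def cache_hit_rate(stats):
    """Return cache_hits / total_requests, or 0.0 when no requests were recorded."""
    total = stats["total_requests"]
    return stats["cache_hits"] / total if total else 0.0


# Invented sample data shaped like the fields named above.
sample = {"total_requests": 200, "cache_hits": 50, "average_compression_ratio": 0.4}
print(f"hit rate: {cache_hit_rate(sample):.1%}")  # hit rate: 25.0%
```

The same function works on the dictionary loaded from stats.json or from the /stats response.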

Complete MCP Workflow Example

Run the entire pipeline in one sequence:

```
# Build helpers
cargo build --release --bins

# Start MCP server in the background
headroom server \
  --mode mcp \
  --compress-bin ./target/release/headroomcompress \
  --stats-bin ./target/release/headroomstats &

# Give the server a moment to start (crude wait; adjust as needed)
sleep 2

# Compress text via HTTP
python - <<'PY'
import requests
txt = "A very repetitive log …"
r = requests.post("http://localhost:8000/compress", json={"content": txt})
print("Compressed:", r.json()["compressed"])
PY

# Retrieve statistics
curl http://localhost:8000/stats | jq .
```

The server keeps the cache warm, reuses previously compressed fragments, and exposes live statistics for monitoring.
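When the server is started in the background, the first request can race its startup. A more robust option than a fixed delay is to poll the port until it accepts connections. This is a generic TCP readiness check, not something Headroom provides; the name wait_for_port and its defaults are assumptions.

```python
import socket
import time


def wait_for_port(host="localhost", port=8000, timeout=10.0, interval=0.2):
    """Poll until a TCP connection to (host, port) succeeds or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True  # something is listening
        except OSError:
            time.sleep(interval)  # not up yet; retry shortly
    return False
```

A script can call wait_for_port() after launching the server and abort with a clear message if it returns False. Note that an open port only shows the process is listening, not that every endpoint is ready.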

Key Source Files in headroom

Understanding the repository layout makes debugging and customization easier:

Cargo.toml — Declares the headroomcompress and headroomstats binaries.
src/bin/compress.rs — Implements the core compression logic invoked by the server.
src/bin/stats.rs — Implements cache-statistics reporting.
headroom/utils.py — Contains helper functions for spawning binaries and parsing their output.
headroom/transforms/kompress_compressor.py — Core Python wrapper that decides when to call the external compressor.
pyproject.toml — Registers the headroom console script and package metadata.

Summary

Install the Python package with pip install -e . to get the headroom CLI.
Build the Rust tools using cargo build --release --bins to produce headroomcompress and headroomstats.
Launch the server with headroom server --mode mcp plus --compress-bin and --stats-bin flags.
Send compression requests to POST /compress and statistics requests to GET /stats on localhost:8000.
The server relies on src/bin/compress.rs, src/bin/stats.rs, and headroom/utils.py to process all MCP workloads.

Frequently Asked Questions

What does MCP mode mean in Headroom?

In the chopratejas/headroom codebase, MCP stands for Model-Cache-Provider. When you pass --mode mcp, the server starts an HTTP API that offloads compression and statistics work to the compiled Rust helper binaries instead of running everything inside the Python process.

Where do the helper binaries get compiled?

Running cargo build --release --bins places headroomcompress and headroomstats inside the target/release/ directory, which is Cargo's default output location for release builds. The binary names themselves come from the declarations in Cargo.toml.

Can I use headroomcompress without the HTTP server?

Yes. headroomcompress is a fully standalone CLI tool. You can pipe text into it directly or pass files on the command line, and it will emit JSON results without requiring the Python server to be running.

What information does headroomstats return?

According to the source implementation in src/bin/stats.rs, the tool returns aggregates such as total_requests, cache_hits, and average_compression_ratio. You can consume this data via the /stats endpoint or by running headroomstats --dump-json locally.
