How to Set Up Headroom as an MCP Server with headroomcompress and headroomstats
To run Headroom in MCP mode, install the Python package, build the Rust binaries with cargo build --release --bins, then start the server with headroom server --mode mcp --compress-bin <path> --stats-bin <path>.
Headroom (chopratejas/headroom) is an open-source project that functions as a Model-Cache-Provider (MCP) server, delegating text compression and cache statistics to two optimized Rust binaries. Setting up the server involves three steps: installing the Python package that provides the HTTP API, compiling the helper tools, and launching the unified CLI. This guide walks through each step using the source files and commands found in the repository.
Install the Python Package
The Python side of Headroom provides the headroom CLI entry-point and the HTTP server that orchestrates requests. Install the package in a virtual environment using editable mode:
python -m venv .venv
source .venv/bin/activate
pip install -e .
The entry point (headroom.server) is declared in pyproject.toml, which registers the headroom command inside the active virtual environment.
Build the Rust Helper Binaries
Headroom ships two Rust binary targets that compile into standalone command-line programs. Both are declared in Cargo.toml at the repository root.
- headroomcompress: Source logic lives in src/bin/compress.rs. Build it with: cargo build --release --bin headroomcompress
- headroomstats: Source logic lives in src/bin/stats.rs. Build it with: cargo build --release --bin headroomstats
Both executables are written to target/release/. Add this directory to your $PATH, or provide absolute paths when starting the server in the next step.
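Before starting the server, it can help to confirm that both executables were actually produced. The following is a minimal sketch, not part of the repository: the function name find_helper_binaries is hypothetical, and it assumes the binaries keep the names used in this guide and sit in Cargo's release output directory (on Windows the files would carry an .exe suffix, which this sketch does not handle).

```python
import os
from pathlib import Path

# Binary names as given in this guide.
HELPER_NAMES = ("headroomcompress", "headroomstats")


def find_helper_binaries(release_dir="target/release"):
    """Map each helper name to its absolute path, or None if it is missing."""
    found = {}
    for name in HELPER_NAMES:
        candidate = Path(release_dir) / name
        # A usable helper must be a regular file with the execute bit set.
        if candidate.is_file() and os.access(candidate, os.X_OK):
            found[name] = str(candidate.resolve())
        else:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, path in find_helper_binaries().items():
        print(f"{name}: {path or 'not found, run cargo build --release --bins'}")
```

The absolute paths it returns can be passed straight to the server flags described in the next step.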
Launch the Headroom MCP Server
Start the server in MCP mode by passing the paths to both binaries. The critical flags are:
- --mode mcp: Enables MCP server behavior.
- --compress-bin: Path to the headroomcompress executable.
- --stats-bin: Path to the headroomstats executable.
- --port: Optional listening port; defaults to 8000.
headroom server \
--mode mcp \
--compress-bin ./target/release/headroomcompress \
--stats-bin ./target/release/headroomstats \
--port 8000
The server binds to http://localhost:8000. Internally, it uses headroom/utils.py to spawn helper processes and headroom/transforms/kompress_compressor.py to decide when to invoke external compression.
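To make the "spawn a helper process" step concrete, here is a rough sketch of the general pattern: run an external program, feed it text on standard input, and parse its standard output as JSON. This is an illustration only and is not the code in headroom/utils.py; the function name run_helper is hypothetical, and the demo uses a Python one-liner as a stand-in for a real helper binary.

```python
import json
import subprocess
import sys


def run_helper(argv, stdin_text, timeout=10.0):
    """Run a helper process, feed it text on stdin, and parse its stdout as JSON."""
    completed = subprocess.run(
        argv,
        input=stdin_text,
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    # Treat a non-zero exit code as a failure and include stderr in the error.
    if completed.returncode != 0:
        raise RuntimeError(
            f"{argv[0]} exited with {completed.returncode}: {completed.stderr.strip()}"
        )
    return json.loads(completed.stdout)


if __name__ == "__main__":
    # Stand-in helper: reports the length of its input as JSON.
    stand_in = [
        sys.executable,
        "-c",
        "import sys, json; print(json.dumps({'length': len(sys.stdin.read())}))",
    ]
    print(run_helper(stand_in, "hello"))
```

With the real binary, argv would presumably be something like the headroomcompress path followed by --format json, the invocation shown in the command-line usage section.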
Use the headroomcompress Tool
The compressor accepts plain text from standard input or from a file and returns a JSON object containing the fields compressed and compression_ratio.
Command-Line Usage
Pipe text directly into the binary:
echo "This is a very long log that can be shortened…" | \
headroomcompress --format json
HTTP API Usage
You can also call the MCP endpoint directly:
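import requests

# Request body: the text to compress plus an optional context string.
payload = {
    "content": "Your long text …",
    "context": "optional-context",
}
r = requests.post("http://localhost:8000/compress", json=payload, timeout=30)
r.raise_for_status()  # surface HTTP errors instead of parsing an error body
result = r.json()
print(result["compressed"])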
When a request hits /compress, the server forwards the payload to headroomcompress and returns the parsed JSON result.
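Since the response is parsed JSON, a client may want to check its shape before using it. The sketch below assumes the HTTP response carries the same compressed and compression_ratio fields that the binary emits; the function name parse_compress_result and the sample values are made up for illustration, and the guide does not state how the ratio is defined.

```python
def parse_compress_result(result):
    """Validate a /compress response body and return (compressed, ratio)."""
    missing = [key for key in ("compressed", "compression_ratio") if key not in result]
    if missing:
        raise ValueError(f"response is missing fields: {', '.join(missing)}")
    return result["compressed"], float(result["compression_ratio"])


if __name__ == "__main__":
    # Hypothetical response body, for illustration only.
    sample = {"compressed": "short text", "compression_ratio": 0.5}
    text, ratio = parse_compress_result(sample)
    print(text, ratio)
```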
Use the headroomstats Tool
The statistics binary aggregates cache-hit data, token usage, and compression ratios. It can run standalone or serve data through the MCP HTTP API.
Command-Line Usage
Dump statistics to a JSON file:
headroomstats --dump-json > stats.json
HTTP API Usage
Query the live endpoint:
curl http://localhost:8000/stats | jq .
The returned JSON includes fields such as total_requests, cache_hits, and average_compression_ratio.
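These fields can be combined into a derived metric such as the cache hit rate. The following sketch assumes cache_hits and total_requests are plain counts; the function name summarize_stats and the sample numbers are hypothetical and do not come from the repository.

```python
def summarize_stats(stats):
    """Derive a cache hit rate from the counters in a stats payload."""
    total = int(stats.get("total_requests", 0))
    hits = int(stats.get("cache_hits", 0))
    # Guard against division by zero when no requests have been served yet.
    hit_rate = hits / total if total else 0.0
    return {
        "total_requests": total,
        "cache_hits": hits,
        "cache_hit_rate": hit_rate,
        "average_compression_ratio": stats.get("average_compression_ratio"),
    }


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    sample = {"total_requests": 200, "cache_hits": 50, "average_compression_ratio": 0.4}
    print(summarize_stats(sample))
```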
Complete MCP Workflow Example
Run the entire pipeline in one sequence:
# Build helpers
cargo build --release --bins
# Start MCP server in the background
headroom server \
--mode mcp \
--compress-bin ./target/release/headroomcompress \
--stats-bin ./target/release/headroomstats &
# Compress text via HTTP
python - <<'PY'
import requests
txt = "A very repetitive log …"
r = requests.post("http://localhost:8000/compress", json={"content": txt})
print("Compressed:", r.json()["compressed"])
PY
# Retrieve statistics
curl http://localhost:8000/stats | jq .
The server keeps the cache warm, reuses previously compressed fragments, and exposes live statistics for monitoring.
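Because the sequence starts the server in the background, the first HTTP request can arrive before the server is listening. One common way to handle this is to poll the port until it accepts connections. The sketch below is a generic readiness check, not something taken from the Headroom repository; the function name wait_for_port is hypothetical, and the default port of 8000 is the one stated in this guide.

```python
import socket
import time


def wait_for_port(host="localhost", port=8000, timeout=10.0, interval=0.2):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet: wait briefly and retry.
            time.sleep(interval)
    return False
```

Calling wait_for_port() before the first requests.post would let a script fail early with a clear message if the server never comes up.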
Key Source Files in headroom
Understanding the repository layout makes debugging and customization easier:
- Cargo.toml: Declares the headroomcompress and headroomstats binaries.
- src/bin/compress.rs: Implements the core compression logic invoked by the server.
- src/bin/stats.rs: Implements cache-statistics reporting.
- headroom/utils.py: Contains helper functions for spawning binaries and parsing their output.
- headroom/transforms/kompress_compressor.py: Core Python wrapper that decides when to call the external compressor.
- pyproject.toml: Registers the headroom console script and package metadata.
Summary
- Install the Python package with pip install -e . to get the headroom CLI.
- Build the Rust tools using cargo build --release --bins to produce headroomcompress and headroomstats.
- Launch the server with headroom server --mode mcp plus the --compress-bin and --stats-bin flags.
- Send compression requests to POST /compress and statistics requests to GET /stats on localhost:8000.
- The server relies on src/bin/compress.rs, src/bin/stats.rs, and headroom/utils.py to process all MCP workloads.
Frequently Asked Questions
What does MCP mode mean in Headroom?
In the chopratejas/headroom codebase, MCP stands for Model-Cache-Provider. When you pass --mode mcp, the server starts an HTTP API that offloads compression and statistics work to the compiled Rust helper binaries instead of running everything inside the Python process.
Where do the helper binaries get compiled?
Running cargo build --release --bins places headroomcompress and headroomstats inside the target/release/ directory, which is Cargo's default output location for release builds. The binary declarations in Cargo.toml determine the names of the executables.
Can I use headroomcompress without the HTTP server?
Yes. headroomcompress is a fully standalone CLI tool. You can pipe text into it directly or pass files on the command line, and it will emit JSON results without requiring the Python server to be running.
What information does headroomstats return?
According to the source implementation in src/bin/stats.rs, the tool returns aggregates such as total_requests, cache_hits, and average_compression_ratio. You can consume this data via the /stats endpoint or by running headroomstats --dump-json locally.