
How to Configure Per-Request Overrides for Headroom Compression Settings

June 8, 2026 chopratejas/headroom

You can configure per-request overrides for Headroom compression settings by passing a CompressionPolicy object through the Python SDK or by sending the HEADROOM_COMPRESSION_PROFILE HTTP header, which takes precedence over all global defaults.

The chopratejas/headroom proxy compresses tool output before forwarding it to an LLM using policies defined in headroom/config.py. While PROFILE_PRESETS and DEFAULT_TOOL_PROFILES establish system-wide baselines, the source code supports temporary policy injection for individual requests. This article walks through the override mechanism as implemented in the transformer layer and proxy handlers, with copy-paste examples for the SDK and raw HTTP.

How Headroom's Compression Override System Works

Headroom’s compression is governed by the CompressionPolicy object, which determines how aggressively tool output is crushed. The proxy resolves the active policy through a strict hierarchy documented in wiki/configuration.md.

Policy Resolution Order

The configuration cascade is fixed: environment variables are evaluated first, then CLI flags, then SDK-level HeadroomClient configuration, and finally per-request header overrides. Because later stages always win, a per-request header or SDK argument overrides every upstream default.
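
The "later stages win" rule can be illustrated with a minimal sketch. This is not Headroom's resolver: the stage names, the `resolve_setting` helper, and the `bash_profile` key are hypothetical and only mirror the order described above.

```python
from typing import Any, Mapping, Optional

# Stage names follow the order given in the text, lowest precedence first.
# They are illustrative labels, not identifiers from the Headroom source.
STAGES = ("env", "cli", "sdk_client", "request")


def resolve_setting(key: str, layers: Mapping[str, Mapping[str, Any]]) -> Optional[Any]:
    """Return the value for `key` from the highest-precedence stage that sets it."""
    value = None
    for stage in STAGES:
        stage_values = layers.get(stage, {})
        if key in stage_values:
            value = stage_values[key]  # a later stage overwrites an earlier one
    return value


layers = {
    "env": {"bash_profile": "conservative"},
    "sdk_client": {"bash_profile": "moderate"},
    "request": {"bash_profile": "aggressive"},
}
print(resolve_setting("bash_profile", layers))  # aggressive
```

Dropping the `"request"` entry from `layers` makes the same call return `"moderate"`, which is the fallback behavior the cascade implies.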

Where Overrides Are Applied in smart_crusher.py

When a request arrives at the proxy, the headers are parsed and a RequestLog is created. The proxy looks for the HEADROOM_COMPRESSION_PROFILE header (or the equivalent SDK field) and builds a temporary CompressionPolicy that shadows the global settings. As noted in the comments in headroom/transforms/smart_crusher.py around lines 233 and 483, SmartCrusher retrieves this policy from the request context and passes it to its apply() method.

Inside apply(), the transformer selects the appropriate CompressionProfile for each tool, such as "conservative", "moderate", or "aggressive", and computes the adaptive K parameter that drives the keeper-vs-culler decision.

Configuring Per-Request Overrides via the Python SDK

The headroom client lets you attach a custom policy to a single send() call without altering global configuration. The CompressionPolicy object accepts a tool_profiles dictionary that mirrors the shape of DEFAULT_TOOL_PROFILES in headroom/config.py.

from headroom import HeadroomClient, CompressionPolicy

# Build a client with default (moderate) settings
client = HeadroomClient()

# Create a per-request policy that forces aggressive compression for the Bash tool
policy = CompressionPolicy(
    tool_profiles={"Bash": {"bias": 0.7, "min_k": 3}}
)

# Attach the policy to a single request
response = client.send(
    messages=[{"role": "user", "content": "run a long bash script …"}],
    compression_policy=policy,
    stream=False,
)
print(response["content"])

Because the compression_policy argument is scoped to this request, the rest of your traffic continues using the production presets defined in PROFILE_PRESETS.

Sending Per-Request Overrides via HTTP Headers

If you call the proxy directly, per-request overrides travel through the HEADROOM_COMPRESSION_PROFILE header. Proxy handlers in headroom/proxy/handlers/*.py extract this header and inject the parsed policy into the transform chain.

Using a Named Profile

The simplest form sends a preset name:

curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Authorization: Bearer $HEADROOM_TOKEN" \
  -H "Content-Type: application/json" \
  -H "HEADROOM_COMPRESSION_PROFILE: aggressive" \
  -d '{
        "model": "gpt-4o-mini",
        "messages": [{"role":"user","content":"Explain the Rust ownership model"}]
      }'

Using a Custom JSON Payload

For multi-tool precision, send a JSON object that matches the CompressionPolicy schema. HTTP header values cannot contain line breaks, so serialize the object onto a single line:

HEADROOM_COMPRESSION_PROFILE: {"tool_profiles": {"Bash": {"bias": 0.6, "min_k": 5}, "WebFetch": {"bias": 0.8, "min_k": 2}}}

This header value is parsed by the proxy handlers in headroom/proxy/handlers/*.py, injected into the request context, and consumed by SmartCrusher during transformation.
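
When the JSON header value is built programmatically, it helps to serialize it rather than write it by hand. The sketch below uses only the Python standard library and assumes the endpoint and model from the curl example; the Authorization header is omitted for brevity, and the request is constructed but not sent.

```python
import json
import urllib.request

# Override payload shaped like the JSON header example in this section.
override = {
    "tool_profiles": {
        "Bash": {"bias": 0.6, "min_k": 5},
        "WebFetch": {"bias": 0.8, "min_k": 2},
    }
}

# Without an `indent` argument, json.dumps emits one line, which is what a header value needs.
header_value = json.dumps(override, separators=(",", ":"))

body = json.dumps({
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Explain the Rust ownership model"}],
}).encode("utf-8")

request = urllib.request.Request(
    "http://localhost:8000/v1/chat/completions",  # URL taken from the curl example
    data=body,
    method="POST",
    headers={
        "Content-Type": "application/json",
        "HEADROOM_COMPRESSION_PROFILE": header_value,
    },
)

# To actually send it against a running proxy:
# with urllib.request.urlopen(request) as resp:
#     print(resp.read().decode("utf-8"))
```

Compact separators keep the header short; the default separators would work as well, since the output is still a single line.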

Global Defaults and Fallback Behavior

Per-request overrides exist so you can experiment without destabilizing production traffic. If you need to shift the baseline for every request, set an environment variable or SDK-level default instead. Any per-request header or compression_policy argument still wins over that baseline, because request-level settings are the last stage of the hierarchy documented in wiki/configuration.md.

# Set an environment variable that applies to all requests unless overridden
export HEADROOM_BASH_PROFILE=conservative

The request-level policy is also captured in the RequestLog, so you can replay the exact compression behavior later for debugging or auditing.

Summary

Per-request overrides are implemented via the CompressionPolicy object and the HEADROOM_COMPRESSION_PROFILE header.
The resolution hierarchy in wiki/configuration.md guarantees that request-level settings beat environment variables, CLI flags, and SDK defaults.
The SmartCrusher.apply() method in headroom/transforms/smart_crusher.py reads the policy from context (around lines 233 and 483) and drives adaptive crushing.
You can target individual tools like Bash or WebFetch with distinct bias and min_k values without affecting other traffic.
Every override is recorded alongside the request log, ensuring reproducible behavior.

Frequently Asked Questions

Can I override compression for only one tool in a multi-tool request?

Yes. Pass a CompressionPolicy whose tool_profiles dictionary contains only the tool you want to customize. Any tool omitted from the override map continues using the global defaults or presets defined in headroom/config.py.
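
A rough sketch of that fallback, under stated assumptions: `ASSUMED_DEFAULTS` is a hypothetical stand-in for DEFAULT_TOOL_PROFILES with made-up values, `effective_profiles` is not a Headroom function, and it is assumed that an override replaces a tool's whole profile rather than merging field by field.

```python
from typing import Any, Dict

ToolProfiles = Dict[str, Dict[str, Any]]

# Hypothetical stand-in for DEFAULT_TOOL_PROFILES; the real values live in headroom/config.py.
ASSUMED_DEFAULTS: ToolProfiles = {
    "Bash": {"bias": 0.5, "min_k": 5},
    "WebFetch": {"bias": 0.5, "min_k": 5},
}


def effective_profiles(defaults: ToolProfiles, overrides: ToolProfiles) -> ToolProfiles:
    """Overlay per-request tool profiles on the defaults without mutating either input."""
    merged = {tool: dict(profile) for tool, profile in defaults.items()}
    for tool, profile in overrides.items():
        merged[tool] = dict(profile)  # assumption: an override replaces the whole profile
    return merged


request_overrides = {"Bash": {"bias": 0.7, "min_k": 3}}
result = effective_profiles(ASSUMED_DEFAULTS, request_overrides)
print(result["Bash"])      # {'bias': 0.7, 'min_k': 3}
print(result["WebFetch"])  # {'bias': 0.5, 'min_k': 5}
```

Only Bash changes; WebFetch keeps its baseline, and the defaults dictionary itself is left untouched for the next request.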

Does a per-request header override the SDK `compression_policy` argument?

They sit at the same level: both are per-request overrides. In the hierarchy from wiki/configuration.md, SDK-level HeadroomClient configuration is a global default stage, whereas the compression_policy argument to client.send() and the HEADROOM_COMPRESSION_PROFILE header are request-scoped. The proxy handlers in headroom/proxy/handlers/*.py inject whichever is present into the same request context, and either one overrides the global defaults.

What happens if I pass an invalid `HEADROOM_COMPRESSION_PROFILE` value?

The proxy validates the header payload against the CompressionProfile dataclass shape defined in headroom/config.py. Invalid JSON or unrecognized fields typically cause the request to be rejected early, preventing silent fallback to an unintended compression level.
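
The kind of early rejection described here could look like the sketch below. It is an illustration, not the proxy's validator: `parse_profile_header`, the preset set, and the allowed field names are assumptions drawn from the examples in this article.

```python
import json
from typing import Any, Dict

# Preset names and profile fields as they appear in this article's examples (assumed complete).
KNOWN_PRESETS = {"conservative", "moderate", "aggressive"}
ALLOWED_FIELDS = {"bias", "min_k"}


def parse_profile_header(raw: str) -> Dict[str, Any]:
    """Accept a preset name or a JSON policy; raise ValueError on anything else."""
    value = raw.strip()
    if value in KNOWN_PRESETS:
        return {"preset": value}
    try:
        payload = json.loads(value)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not a preset name or valid JSON: {exc.msg}") from exc
    if not isinstance(payload, dict) or set(payload) != {"tool_profiles"}:
        raise ValueError("expected an object with a single 'tool_profiles' key")
    profiles = payload["tool_profiles"]
    if not isinstance(profiles, dict):
        raise ValueError("'tool_profiles' must be an object")
    for tool, profile in profiles.items():
        if not isinstance(profile, dict):
            raise ValueError(f"profile for {tool!r} must be an object")
        unknown = set(profile) - ALLOWED_FIELDS
        if unknown:
            raise ValueError(f"unrecognized fields for {tool!r}: {sorted(unknown)}")
    return payload


print(parse_profile_header("aggressive"))
print(parse_profile_header('{"tool_profiles": {"Bash": {"bias": 0.6, "min_k": 5}}}'))
```

A misspelled preset such as `agressive` is neither a known name nor valid JSON, so it raises `ValueError` instead of silently falling back to another profile.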

Is there a performance penalty for using per-request overrides?

Not a meaningful one. The temporary CompressionPolicy is built once during request initialization and passed by reference through the transform chain. The overhead of parsing a small JSON header is negligible compared to the actual compression work performed by SmartCrusher.
