# How to Configure Per-Request Overrides for Headroom Compression Settings

> Configure per-request overrides for Headroom compression settings via Python SDK or HEADROOM_COMPRESSION_PROFILE header. Gain granular control over compression for specific requests.

- Repository: [Tejas Chopra/headroom](https://github.com/chopratejas/headroom)
- Tags: how-to-guide
- Published: 2026-06-08

---

**You can configure per-request overrides for Headroom compression settings by passing a `CompressionPolicy` object through the Python SDK or by sending the `HEADROOM_COMPRESSION_PROFILE` HTTP header. Either override takes precedence over all global defaults.**

The `chopratejas/headroom` proxy compresses tool output before forwarding it to an LLM using policies defined in [`headroom/config.py`](https://github.com/chopratejas/headroom/blob/main/headroom/config.py). While **`PROFILE_PRESETS`** and **`DEFAULT_TOOL_PROFILES`** establish system-wide baselines, the source code supports temporary policy injection for individual requests. This article walks through the override mechanism as implemented in the transformer layer and proxy handlers, with copy-paste examples for the SDK and raw HTTP.

## How Headroom's Compression Override System Works

Headroom’s compression is governed by the **`CompressionPolicy`** object, which determines how aggressively tool output is crushed. The proxy resolves the active policy through a strict hierarchy documented in [`wiki/configuration.md`](https://github.com/chopratejas/headroom/blob/main/wiki/configuration.md).

### Policy Resolution Order

The configuration cascade is fixed: environment variables are evaluated first, then CLI flags, then SDK-level `HeadroomClient` configuration, and finally per-request header overrides. Because later stages always win, a per-request header or SDK argument overrides every upstream default.
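The cascade can be pictured as a chain of dictionary merges in which each later layer overwrites the keys it sets. The sketch below is illustrative only: `resolve_settings`, the layer names, and the values are hypothetical and are not taken from Headroom's source.

```python
from typing import Any, Mapping, Optional


def resolve_settings(*layers: Optional[Mapping[str, Any]]) -> dict:
    """Merge configuration layers in order; later layers win on conflicts."""
    resolved: dict = {}
    for layer in layers:
        if layer:  # skip layers that were not supplied
            resolved.update(layer)
    return resolved


# Hypothetical values for each stage of the cascade, lowest precedence first.
env_vars = {"profile": "conservative"}
cli_flags = {"profile": "moderate"}
sdk_config = {"min_k": 4}
request_header = {"profile": "aggressive"}

effective = resolve_settings(env_vars, cli_flags, sdk_config, request_header)
print(effective)  # {'profile': 'aggressive', 'min_k': 4}
```

The request-level `profile` replaces the values set by the environment and CLI layers, while `min_k` survives from the SDK layer because no later layer sets it.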

### Where Overrides Are Applied in smart_crusher.py

When a request arrives at the proxy, the headers are parsed and a **`RequestLog`** is created. The proxy looks for the `HEADROOM_COMPRESSION_PROFILE` header (or the equivalent SDK field) and builds a temporary **`CompressionPolicy`** that shadows the global settings. As noted in the comments within **[`headroom/transforms/smart_crusher.py`](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/smart_crusher.py)** around lines 233 and 483, this policy is retrieved from the request context and passed to the `apply()` method of the **`SmartCrusher`** transformer.

Inside `apply()`, the transformer selects the appropriate **`CompressionProfile`** for each tool—such as "conservative", "moderate", or "aggressive"—and computes the adaptive *K* parameter that drives the keeper-vs-culler decision.

## Configuring Per-Request Overrides via the Python SDK

The `headroom` client lets you attach a custom policy to a single `send()` call without altering global configuration. The `CompressionPolicy` object accepts a `tool_profiles` dictionary that mirrors the shape of `DEFAULT_TOOL_PROFILES` in [`headroom/config.py`](https://github.com/chopratejas/headroom/blob/main/headroom/config.py).

```python
from headroom import HeadroomClient, CompressionPolicy

# Build a client with default (moderate) settings
client = HeadroomClient()

# Create a per-request policy that forces aggressive compression for the Bash tool
policy = CompressionPolicy(
    tool_profiles={"Bash": {"bias": 0.7, "min_k": 3}}
)

# Attach the policy to a single request
response = client.send(
    messages=[{"role": "user", "content": "run a long bash script …"}],
    compression_policy=policy,
    stream=False,
)
print(response["content"])
```

Because the `compression_policy` argument is scoped to this request, the rest of your traffic continues using the production presets defined in `PROFILE_PRESETS`.

## Sending Per-Request Overrides via HTTP Headers

If you call the proxy directly, per-request overrides travel through the `HEADROOM_COMPRESSION_PROFILE` header. Proxy handlers in **`headroom/proxy/handlers/*.py`** extract this header and inject the parsed policy into the transform chain.

### Using a Named Profile

The simplest form sends a preset name:

```bash
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Authorization: Bearer $HEADROOM_TOKEN" \
  -H "Content-Type: application/json" \
  -H "HEADROOM_COMPRESSION_PROFILE: aggressive" \
  -d '{
        "model": "gpt-4o-mini",
        "messages": [{"role":"user","content":"Explain the Rust ownership model"}]
      }'
```

### Using a Custom JSON Payload

For multi-tool precision, send a JSON object that matches the `CompressionPolicy` schema. HTTP header values cannot span multiple lines, so serialize the JSON onto a single line:

```http
HEADROOM_COMPRESSION_PROFILE: {"tool_profiles": {"Bash": {"bias": 0.6, "min_k": 5}, "WebFetch": {"bias": 0.8, "min_k": 2}}}
```

This header value is parsed by the proxy handlers in `headroom/proxy/handlers/*.py`, injected into the request context, and consumed by `SmartCrusher` during transformation.
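When the policy is assembled in code, one way to produce a safe header value is to serialize it with `json.dumps` and compact separators, which guarantees a single line with no embedded newlines. This is a minimal sketch using only the standard library; the payload shape follows the schema described in this guide, and the `headers` dictionary is just an illustration of what you would hand to your HTTP client.

```python
import json

# Policy payload using the tool names and fields from the example above.
policy = {
    "tool_profiles": {
        "Bash": {"bias": 0.6, "min_k": 5},
        "WebFetch": {"bias": 0.8, "min_k": 2},
    }
}

# separators=(",", ":") drops optional whitespace, so the value stays on one line.
header_value = json.dumps(policy, separators=(",", ":"))

headers = {
    "Content-Type": "application/json",
    "HEADROOM_COMPRESSION_PROFILE": header_value,
}
print(header_value)
```

If a reverse proxy sits in front of Headroom, check how it treats header names that contain underscores. nginx, for example, ignores such headers unless `underscores_in_headers on` is configured.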

## Global Defaults and Fallback Behavior

Per-request overrides exist so you can experiment without destabilizing production traffic. If you need to shift the baseline for every request, set an environment variable or SDK-level default instead. The configuration hierarchy documented in [`wiki/configuration.md`](https://github.com/chopratejas/headroom/blob/main/wiki/configuration.md) is strict: environment variables → CLI flags → SDK client configuration → per-request headers, with later stages always winning.

```bash
# Set an environment variable that applies to all requests unless overridden
export HEADROOM_BASH_PROFILE=conservative
```

The request-level policy is also captured in the **`RequestLog`**, so you can replay the exact compression behavior later for debugging or auditing.

## Summary

- **Per-request overrides** are implemented via the `CompressionPolicy` object and the `HEADROOM_COMPRESSION_PROFILE` header.
- The resolution hierarchy in [`wiki/configuration.md`](https://github.com/chopratejas/headroom/blob/main/wiki/configuration.md) guarantees that request-level settings beat environment variables, CLI flags, and SDK defaults.
- The `SmartCrusher.apply()` method in [`headroom/transforms/smart_crusher.py`](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/smart_crusher.py) reads the policy from the request context (around lines 233 and 483) and drives adaptive crushing.
- You can target individual tools like `Bash` or `WebFetch` with distinct `bias` and `min_k` values without affecting other traffic.
- Every override is recorded alongside the request log, ensuring reproducible behavior.

## Frequently Asked Questions

### Can I override compression for only one tool in a multi-tool request?

Yes. Pass a `CompressionPolicy` whose `tool_profiles` dictionary contains only the tool you want to customize. Any tool omitted from the override map continues using the global defaults or presets defined in [`headroom/config.py`](https://github.com/chopratejas/headroom/blob/main/headroom/config.py).
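The fallback behavior can be sketched as a lookup that prefers the override map and otherwise keeps the default entry. Everything here is hypothetical: `EXAMPLE_DEFAULTS` stands in for the real values in `headroom/config.py`, `effective_profiles` is not a Headroom function, and the sketch assumes an override replaces a tool's whole entry rather than merging field by field.

```python
from typing import Dict

ToolProfile = Dict[str, float]

# Stand-in defaults for illustration; the real values live in headroom/config.py.
EXAMPLE_DEFAULTS: Dict[str, ToolProfile] = {
    "Bash": {"bias": 0.5, "min_k": 5},
    "WebFetch": {"bias": 0.5, "min_k": 5},
}


def effective_profiles(overrides: Dict[str, ToolProfile]) -> Dict[str, ToolProfile]:
    """Return per-tool profiles, using the override entry where one exists."""
    # Copy the defaults so the shared baseline is never mutated by a request.
    merged = {tool: dict(profile) for tool, profile in EXAMPLE_DEFAULTS.items()}
    for tool, profile in overrides.items():
        merged[tool] = dict(profile)  # assumption: whole-entry replacement
    return merged


# Override only Bash; WebFetch keeps its default entry.
result = effective_profiles({"Bash": {"bias": 0.7, "min_k": 3}})
print(result["Bash"])
print(result["WebFetch"])
```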

### Does a per-request header override the SDK `compression_policy` argument?

Not as such: both mechanisms are request-scoped overrides. The hierarchy in [`wiki/configuration.md`](https://github.com/chopratejas/headroom/blob/main/wiki/configuration.md) treats SDK-level `HeadroomClient` configuration as a global default stage, whereas the `compression_policy` argument to `client.send()` and the `HEADROOM_COMPRESSION_PROFILE` header each apply to a single request. The proxy handlers in `headroom/proxy/handlers/*.py` inject whichever is present into the same request context, and either one overrides the global defaults. Precedence between the two when both are set on the same request is not covered here, so set only one per request.

### What happens if I pass an invalid `HEADROOM_COMPRESSION_PROFILE` value?

The proxy validates the header payload against the `CompressionPolicy` schema and the dataclass shapes defined in [`headroom/config.py`](https://github.com/chopratejas/headroom/blob/main/headroom/config.py). Invalid JSON or unrecognized fields typically cause the request to be rejected early, which prevents a silent fallback to an unintended compression level.
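To make the fail-early idea concrete, here is a minimal sketch of a validator that accepts either a preset name or a JSON policy and raises on anything else. It is not Headroom's implementation: `parse_profile_header`, `KNOWN_PRESETS`, and `ALLOWED_FIELDS` are hypothetical names, and the accepted presets and fields are simply the ones that appear in this guide.

```python
import json
from typing import Union

# Preset names mentioned in this guide; the authoritative set is PROFILE_PRESETS.
KNOWN_PRESETS = {"conservative", "moderate", "aggressive"}
# Per-tool fields used in this guide's examples (assumed, not exhaustive).
ALLOWED_FIELDS = {"bias", "min_k"}


def parse_profile_header(value: str) -> Union[str, dict]:
    """Return a preset name or a checked policy dict; raise ValueError otherwise."""
    text = value.strip()
    if text in KNOWN_PRESETS:
        return text
    try:
        payload = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not a known preset or valid JSON: {exc}") from exc
    if not isinstance(payload, dict) or set(payload) != {"tool_profiles"}:
        raise ValueError("expected a JSON object with a single 'tool_profiles' key")
    tool_profiles = payload["tool_profiles"]
    if not isinstance(tool_profiles, dict):
        raise ValueError("'tool_profiles' must be a JSON object")
    for tool, profile in tool_profiles.items():
        if not isinstance(profile, dict):
            raise ValueError(f"profile for {tool!r} must be a JSON object")
        unknown = set(profile) - ALLOWED_FIELDS
        if unknown:
            raise ValueError(f"unrecognized fields for {tool!r}: {sorted(unknown)}")
    return payload


print(parse_profile_header("aggressive"))
print(parse_profile_header('{"tool_profiles": {"Bash": {"bias": 0.6, "min_k": 5}}}'))
```

Raising instead of falling back to a default keeps a typo in the header from quietly changing how a request is compressed.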

### Is there a performance penalty for using per-request overrides?

No meaningful one. The temporary `CompressionPolicy` is built once during request initialization and then passed by reference through the transform chain. The overhead of parsing a small JSON header is negligible compared to the actual compression work performed by `SmartCrusher`.