How to Configure the Compression Policy for Different Content Types in Headroom

Headroom uses a per-auth-mode CompressionPolicy to control compression aggressiveness, while the ContentRouter automatically detects content types and dispatches to specialized compressors that respect the active policy settings.

Headroom is an open-source LLM context window optimizer that applies intelligent compression to reduce token counts without losing semantic meaning. Configuring the compression policy for different content types allows you to balance token reduction against fidelity for code, logs, search results, and plain text. The configuration system combines auth-mode policies with runtime environment variables to determine how each compressor behaves.

Understanding the CompressionPolicy Data Class

The canonical definition lives in [headroom/transforms/compression_policy.py](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/compression_policy.py). This dataclass contains the global knobs that affect how aggressively all compressors operate:

  • live_zone_only: When True, transforms may not modify bytes outside the post-cache-marker live zone. Subscription mode typically sets this to True for safety.
  • cache_aligner_enabled: Gates the CacheAligner transform. Set to False in Subscription mode to prevent rewrites that could destabilize the cache.
  • volatile_token_threshold: Token-count threshold below which content is considered cache-stable. PAYG defaults to 128, while Subscription uses 32.
  • max_lossy_ratio: Upper bound on the fraction of the original tokens that lossy compression may drop. PAYG allows up to 0.45; Subscription caps it at 0.25.
  • toin_read_only: When True, the TOIN learning component only serves cached patterns without writing new observations.
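To make the max_lossy_ratio bound concrete, the short sketch below works through the arithmetic for a 10,000-token context. The helper name max_droppable_tokens is hypothetical and rounding down is an assumption; Headroom's own compressors may compute the bound differently.

```python
import math


def max_droppable_tokens(original_tokens: int, max_lossy_ratio: float) -> int:
    """Upper bound on tokens a lossy step may drop (hypothetical helper, rounds down)."""
    if not 0.0 <= max_lossy_ratio <= 1.0:
        raise ValueError("max_lossy_ratio must be between 0 and 1")
    return math.floor(original_tokens * max_lossy_ratio)


original = 10_000
# Ratios are the PAYG and Subscription values quoted in the list above.
for mode, ratio in (("PAYG", 0.45), ("Subscription", 0.25)):
    dropped = max_droppable_tokens(original, ratio)
    print(f"{mode}: may drop up to {dropped} tokens, keeping at least {original - dropped}")
```

Read this way, the ratio is a ceiling on loss rather than a target: a compressor may drop fewer tokens than the bound permits.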

The policy is resolved per request via resolve_policy(auth_mode). When the environment variable HEADROOM_PROXY_AUTH_MODE_POLICY_ENFORCEMENT is set to off, a PAYG-equivalent policy is always returned regardless of the detected auth mode.

from headroom.transforms.compression_policy import resolve_policy
from headroom.proxy.auth_mode import AuthMode

# Resolve the policy for a specific authentication mode
policy = resolve_policy(AuthMode.PAYG)
print(policy.max_lossy_ratio)  # 0.45
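The resolution rule described above can be sketched without importing Headroom at all. The block below is an illustration under stated assumptions, not the library's implementation: AuthMode, PolicySketch, and resolve_policy_sketch are stand-ins, the thresholds and ratios come from the field list, and PAYG's boolean fields plus Subscription's toin_read_only value are guesses chosen only to make the example complete.

```python
import os
from dataclasses import dataclass
from enum import Enum


class AuthMode(Enum):
    """Stand-in for headroom.proxy.auth_mode.AuthMode (member values are assumed)."""
    PAYG = "payg"
    SUBSCRIPTION = "subscription"


@dataclass(frozen=True)
class PolicySketch:
    """Illustrative mirror of the five CompressionPolicy fields."""
    live_zone_only: bool
    cache_aligner_enabled: bool
    volatile_token_threshold: int
    max_lossy_ratio: float
    toin_read_only: bool


# Thresholds and ratios are taken from the text. PAYG's booleans and
# Subscription's toin_read_only are assumptions made for illustration.
PAYG_POLICY = PolicySketch(
    live_zone_only=False,
    cache_aligner_enabled=True,
    volatile_token_threshold=128,
    max_lossy_ratio=0.45,
    toin_read_only=False,
)
SUBSCRIPTION_POLICY = PolicySketch(
    live_zone_only=True,
    cache_aligner_enabled=False,
    volatile_token_threshold=32,
    max_lossy_ratio=0.25,
    toin_read_only=True,
)


def resolve_policy_sketch(auth_mode: AuthMode) -> PolicySketch:
    """Pick a policy for auth_mode, returning PAYG when enforcement is off."""
    enforcement = os.environ.get("HEADROOM_PROXY_AUTH_MODE_POLICY_ENFORCEMENT", "")
    if enforcement.strip().lower() == "off":
        return PAYG_POLICY
    if auth_mode is AuthMode.SUBSCRIPTION:
        return SUBSCRIPTION_POLICY
    return PAYG_POLICY
```

The point of the sketch is the ordering: the enforcement switch is checked before the auth mode, so setting it to off makes the detected mode irrelevant.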

Content-Type Detection and Routing

Headroom does not use MIME types or file extensions to determine content types. Instead, [headroom/transforms/content_router.py](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/content_router.py) inspects the shape of the incoming payload and selects the appropriate compressor from [headroom/transforms/compression_units.py](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/compression_units.py):

| Content Type | Compressor Used | Policy Fields Consulted |
| --- | --- | --- |
| Code (Python, Go, Rust, etc.) | CodeCompressor (AST-based) | live_zone_only, cache_aligner_enabled |
| Logs | LogCompressor | max_lossy_ratio |
| Search output / grep | SearchCompressor | volatile_token_threshold |
| Plain text | SmartCrusher (lossless JSON) + optional ML (KompressCompressor if [ml] extra installed) | max_lossy_ratio |

The ContentRouter reads the CompressionPolicy returned from resolve_policy() and passes relevant flags to each transform. For example, setting cache_aligner_enabled=False (the default in Subscription mode) disables the CacheAligner step for all content types, which prevents rewrites that could destabilize cached contexts.
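Shape-based detection of this kind can be approximated with a few line-level heuristics. The sketch below is not the logic in content_router.py: the regular expressions, the thresholds, and the function name detect_content_type are all hypothetical, and only the compressor names in the mapping are taken from the table above.

```python
import re

# Hypothetical patterns; the real router's detection rules may differ entirely.
LOG_LINE = re.compile(
    r"^\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}.*\b(DEBUG|INFO|WARN|WARNING|ERROR)\b"
)
GREP_LINE = re.compile(r"^[^\s:]+:\d+:")  # path:line:match, as printed by grep -n
CODE_LINE = re.compile(r"^\s*(def |class |import |from \S+ import |func |fn |package )")

# Compressor names as listed in the routing table.
COMPRESSOR_FOR = {
    "code": "CodeCompressor",
    "log": "LogCompressor",
    "search": "SearchCompressor",
    "text": "SmartCrusher",
}


def detect_content_type(text: str) -> str:
    """Classify text as code, log, search, or text from the shape of its lines."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return "text"

    def share(pattern: re.Pattern) -> float:
        # Fraction of non-empty lines that match the pattern.
        return sum(1 for line in lines if pattern.search(line)) / len(lines)

    # Logs are checked before search output because ISO timestamps
    # such as 2024-05-01T12:00:00 also look like path:line: prefixes.
    if share(LOG_LINE) >= 0.6:
        return "log"
    if share(GREP_LINE) >= 0.6:
        return "search"
    if share(CODE_LINE) >= 0.2:
        return "code"
    return "text"


sample = "src/app.py:42:    return x\nsrc/util.py:7:x = 1"
print(COMPRESSOR_FOR[detect_content_type(sample)])
```

Heuristics like these misclassify edge cases, which is one reason to keep a conservative plain-text fallback rather than guessing at a specialized compressor.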

Enabling and Disabling Compressors via Environment Variables

While the policy controls aggressiveness, you toggle individual compressors at runtime using environment variables or CLI flags documented in [wiki/proxy.md](https://github.com/chopratejas/headroom/blob/main/wiki/proxy.md):

| Flag / Environment Variable | Effect |
| --- | --- |
| --code-aware / HEADROOM_CODE_AWARE_ENABLED=true | Enable AST-based code compression (default true). |
| --no-code-aware | Explicitly disable code compression for pure-text workloads. |
| --enable-search-compression / HEADROOM_SEARCH_COMPRESSION_ENABLED=true | Enable the SearchCompressor for grep and search results. |
| --enable-log-compression / HEADROOM_LOG_COMPRESSION_ENABLED=true | Enable the LogCompressor for log file compression. |
| --no-compress-first | Skip the "try deeper compression first" heuristic. |
| HEADROOM_PROXY_AUTH_MODE_POLICY_ENFORCEMENT=off | Force PAYG-equivalent policy for every request, overriding auth-mode-specific settings. |

To run a proxy that compresses only code and logs while leaving search output untouched:

HEADROOM_CODE_AWARE_ENABLED=true \
HEADROOM_LOG_COMPRESSION_ENABLED=true \
HEADROOM_SEARCH_COMPRESSION_ENABLED=false \
HEADROOM_PROXY_AUTH_MODE_POLICY_ENFORCEMENT=enabled \
headroom-proxy --mode token
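Before launching the proxy, it can help to check which compressors a given environment would switch on. The sketch below reads the three toggle variables from the table; the helper names, the accepted truthy spellings, and the default of False for the log and search toggles are assumptions (only the code-aware default of true is stated above), so treat it as a pre-flight check rather than a copy of the proxy's own parsing.

```python
import os
from typing import Mapping, Optional

# Assumed set of spellings that count as "enabled".
TRUTHY = {"1", "true", "yes", "on"}


def env_flag(name: str, default: bool, environ: Optional[Mapping[str, str]] = None) -> bool:
    """Read a boolean toggle, falling back to default when unset or blank."""
    env = os.environ if environ is None else environ
    raw = env.get(name)
    if raw is None or not raw.strip():
        return default
    return raw.strip().lower() in TRUTHY


def enabled_compressors(environ: Optional[Mapping[str, str]] = None) -> dict:
    """Report which compressors the given environment would enable."""
    return {
        "code": env_flag("HEADROOM_CODE_AWARE_ENABLED", True, environ),
        # Defaults for the next two toggles are assumed, not documented here.
        "log": env_flag("HEADROOM_LOG_COMPRESSION_ENABLED", False, environ),
        "search": env_flag("HEADROOM_SEARCH_COMPRESSION_ENABLED", False, environ),
    }


# Same settings as the shell example above.
example_env = {
    "HEADROOM_CODE_AWARE_ENABLED": "true",
    "HEADROOM_LOG_COMPRESSION_ENABLED": "true",
    "HEADROOM_SEARCH_COMPRESSION_ENABLED": "false",
}
print(enabled_compressors(example_env))
```

Passing an explicit mapping instead of reading os.environ keeps the check easy to test and avoids mutating the process environment.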

Customizing Compression Policies for Individual Requests

For advanced use cases, you can override the default policy for a single request by constructing a custom CompressionPolicy and injecting it into the ContentRouter. Because the policy fields apply to every compressor the router dispatches to, using different aggressiveness levels for code and for logs means creating a separate router (and policy) for each:

from headroom.transforms.compression_policy import CompressionPolicy
from headroom.transforms.content_router import ContentRouter

# Build a custom policy that is more aggressive than the Subscription defaults
custom_policy = CompressionPolicy(
    live_zone_only=False,            # Allow edits outside the live zone (more aggressive)
    cache_aligner_enabled=True,
    volatile_token_threshold=200,    # Raise the cache-stability threshold
    max_lossy_ratio=0.30,            # Let lossy compressors drop up to 30% of tokens
    toin_read_only=False,
)

# Initialize the router with the custom policy
router = ContentRouter(policy=custom_policy)

# Apply compression; request_payload is the request body you want to compress
compressed = router.apply(request_payload)
print(f"Compression ratio: {compressed['compression_ratio']}")

If you are using the TypeScript SDK, the same logic is exposed via the HTTP endpoint POST /v1/compress. The proxy reads the policy from the request's auth mode (or from environment-controlled defaults) and applies the appropriate compressors automatically.

Summary

  • Policy Resolution: The CompressionPolicy dataclass in headroom/transforms/compression_policy.py defines per-auth-mode constraints that all compressors must respect.
  • Content Routing: The ContentRouter in headroom/transforms/content_router.py inspects payload structure to dispatch to CodeCompressor, LogCompressor, SearchCompressor, or SmartCrusher as appropriate.
  • Runtime Control: Use environment variables like HEADROOM_CODE_AWARE_ENABLED or CLI flags like --enable-log-compression to toggle specific compressors without changing code.
  • Request-Level Override: Instantiate a custom CompressionPolicy and pass it directly to ContentRouter for dynamic, per-request customization.

Frequently Asked Questions

What is the difference between PAYG and Subscription compression policies?

PAYG (Pay-As-You-Go) policies allow a higher max_lossy_ratio (up to 0.45) and a higher volatile_token_threshold (128 tokens), prioritizing maximum token reduction. Subscription policies are more conservative: they cap max_lossy_ratio at 0.25, lower volatile_token_threshold to 32 tokens, and set live_zone_only to True to protect cached contexts from modification.

How do I completely disable code-aware compression for a specific workload?

Set the environment variable HEADROOM_CODE_AWARE_ENABLED=false or pass the --no-code-aware flag when starting the proxy. This forces the pipeline to treat all content as plain text, bypassing the AST-based CodeCompressor entirely.

Can I apply different compression ratios for logs versus code in the same request?

The max_lossy_ratio in CompressionPolicy applies globally to all lossy compressors in a single request. To achieve different ratios for different content types, you must split the request into separate calls with different policy instances, or modify the source code in headroom/transforms/content_router.py to accept per-content-type thresholds.
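The split-request approach can be planned with a small helper that groups content by type and derives one policy per group. Everything in the sketch below is illustrative: PolicySketch is a stand-in for CompressionPolicy, the per-type ratios are arbitrary example values, and the segments are assumed to be labeled by the caller already. Each resulting (policy, texts) pair would then go through its own compression call.

```python
from dataclasses import dataclass, replace
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class PolicySketch:
    """Stand-in for CompressionPolicy with the same five fields."""
    live_zone_only: bool
    cache_aligner_enabled: bool
    volatile_token_threshold: int
    max_lossy_ratio: float
    toin_read_only: bool


# Example ratios only: keep code close to lossless, compress logs harder.
RATIO_BY_TYPE: Dict[str, float] = {"code": 0.10, "log": 0.40}


def plan_calls(
    segments: Iterable[Tuple[str, str]], base: PolicySketch
) -> List[Tuple[PolicySketch, List[str]]]:
    """Group (content_type, text) pairs and attach a per-type policy to each group."""
    groups: Dict[str, List[str]] = {}
    for content_type, text in segments:
        groups.setdefault(content_type, []).append(text)
    plan = []
    for content_type, texts in groups.items():
        # Unknown content types keep the base policy's ratio.
        ratio = RATIO_BY_TYPE.get(content_type, base.max_lossy_ratio)
        plan.append((replace(base, max_lossy_ratio=ratio), texts))
    return plan


base_policy = PolicySketch(
    live_zone_only=True,
    cache_aligner_enabled=False,
    volatile_token_threshold=32,
    max_lossy_ratio=0.25,
    toin_read_only=True,
)
segments = [("code", "def f(): ..."), ("log", "12:00 ERROR x"), ("code", "class A: ...")]
for policy, texts in plan_calls(segments, base_policy):
    print(policy.max_lossy_ratio, len(texts))
```

Using dataclasses.replace keeps every other field identical across the derived policies, so only the loss bound varies between calls.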

Where does Headroom detect the content type of incoming messages?

Content type detection happens in [headroom/transforms/content_router.py](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/content_router.py), which analyzes the shape and structure of the payload rather than relying on MIME types or file extensions. The router then selects the appropriate compressor from [headroom/transforms/compression_units.py](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/compression_units.py) based on detected patterns.
