How to Configure the Compression Policy for Different Content Types in Headroom
Headroom uses a per-auth-mode CompressionPolicy to control compression aggressiveness, while the ContentRouter automatically detects content types and dispatches to specialized compressors that respect the active policy settings.
Headroom is an open-source LLM context window optimizer that applies intelligent compression to reduce token counts without losing semantic meaning. Configuring the compression policy for different content types allows you to balance token reduction against fidelity for code, logs, search results, and plain text. The configuration system combines auth-mode policies with runtime environment variables to determine how each compressor behaves.
Understanding the CompressionPolicy Data Class
The canonical definition lives in [headroom/transforms/compression_policy.py](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/compression_policy.py). This dataclass contains the global knobs that affect how aggressively all compressors operate:
- live_zone_only: When True, transforms may not modify bytes outside the post-cache-marker live zone. Subscription mode typically sets this to True for safety.
- cache_aligner_enabled: Gates the CacheAligner transform. Disabled (False) in Subscription mode to prevent cache-instability-related rewrites.
- volatile_token_threshold: Token-count threshold below which content is considered cache-stable. PAYG defaults to 128, while Subscription uses 32.
- max_lossy_ratio: Upper bound on how much lossy compression may drop tokens (as a fraction of original). PAYG allows up to 0.45, Subscription caps at 0.25.
- toin_read_only: When True, the TOIN learning component only serves cached patterns without writing new observations.
The policy is resolved per request via resolve_policy(auth_mode). When the environment variable HEADROOM_PROXY_AUTH_MODE_POLICY_ENFORCEMENT is set to off, a PAYG-equivalent policy is always returned regardless of the detected auth mode.
```python
from headroom.transforms.compression_policy import resolve_policy
from headroom.proxy.auth_mode import AuthMode

# Resolve policy for a specific authentication mode
policy = resolve_policy(AuthMode.PAYG)
print(policy.max_lossy_ratio)  # 0.45
```
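To make the resolution rule concrete, here is a minimal, self-contained sketch that does not import Headroom. It mirrors the five fields and the PAYG and Subscription values quoted above. The names `resolve_policy_sketch`, `PAYG_POLICY`, `SUBSCRIPTION_POLICY`, and the `AuthMode.SUBSCRIPTION` member are illustrative assumptions, and any field value marked "assumed" is not stated in this article; check compression_policy.py for the real definitions.

```python
import os
from dataclasses import dataclass
from enum import Enum


class AuthMode(Enum):
    # Stand-in for headroom.proxy.auth_mode.AuthMode; SUBSCRIPTION is an assumed member name.
    PAYG = "payg"
    SUBSCRIPTION = "subscription"


@dataclass(frozen=True)
class CompressionPolicy:
    # Stand-in with the five fields described in this article.
    live_zone_only: bool
    cache_aligner_enabled: bool
    volatile_token_threshold: int
    max_lossy_ratio: float
    toin_read_only: bool


PAYG_POLICY = CompressionPolicy(
    live_zone_only=False,        # assumed
    cache_aligner_enabled=True,  # assumed
    volatile_token_threshold=128,
    max_lossy_ratio=0.45,
    toin_read_only=False,        # assumed
)
SUBSCRIPTION_POLICY = CompressionPolicy(
    live_zone_only=True,
    cache_aligner_enabled=False,
    volatile_token_threshold=32,
    max_lossy_ratio=0.25,
    toin_read_only=True,         # assumed
)

ENFORCEMENT_VAR = "HEADROOM_PROXY_AUTH_MODE_POLICY_ENFORCEMENT"


def resolve_policy_sketch(auth_mode, env=None):
    """Return the policy for auth_mode, or the PAYG policy when enforcement is off."""
    env = os.environ if env is None else env
    if env.get(ENFORCEMENT_VAR, "").strip().lower() == "off":
        return PAYG_POLICY
    if auth_mode is AuthMode.SUBSCRIPTION:
        return SUBSCRIPTION_POLICY
    return PAYG_POLICY


if __name__ == "__main__":
    # Pass an explicit env mapping so the result does not depend on the shell.
    print(resolve_policy_sketch(AuthMode.SUBSCRIPTION, env={}).max_lossy_ratio)
    print(resolve_policy_sketch(AuthMode.SUBSCRIPTION, env={ENFORCEMENT_VAR: "off"}).max_lossy_ratio)
```

Passing the environment as a mapping keeps the sketch easy to test; the real function may read the variable differently.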
Content-Type Detection and Routing
Headroom does not use MIME types or file extensions to determine content types. Instead, [headroom/transforms/content_router.py](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/content_router.py) inspects the shape of the incoming payload and selects the appropriate compressor from [headroom/transforms/compression_units.py](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/compression_units.py):
| Content Type | Compressor Used | Policy Fields Consulted |
|---|---|---|
| Code (Python, Go, Rust, etc.) | CodeCompressor (AST-based) | live_zone_only, cache_aligner_enabled |
| Logs | LogCompressor | max_lossy_ratio |
| Search output / grep | SearchCompressor | volatile_token_threshold |
| Plain text | SmartCrusher (lossless JSON) + optional ML (KompressCompressor if [ml] extra installed) | max_lossy_ratio |
The ContentRouter reads the CompressionPolicy returned from resolve_policy() and passes relevant flags to each transform. For example, setting cache_aligner_enabled=False (the default in Subscription mode) disables the CacheAligner step for all content types, which prevents rewrites that could destabilize cached contexts.
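Shape-based detection can be illustrated with a toy classifier. The sketch below is not Headroom's detection logic: the regular expressions, the thresholds, and the function names `detect_content_type` and `pick_compressor` are all hypothetical, chosen only to show how a router might map payload shape to the compressor names in the table above.

```python
import re

# Hypothetical shape heuristics; the real ContentRouter may use different signals.
LOG_LINE = re.compile(r"\d{2}:\d{2}:\d{2}.*\b(DEBUG|INFO|WARN|WARNING|ERROR)\b")
GREP_LINE = re.compile(r"^[^\s:]+:\d+:")  # path:line: prefix, as printed by grep -n
CODE_LINE = re.compile(r"^\s*(def |class |import |from \S+ import |func |fn |package )")

COMPRESSOR_BY_TYPE = {
    "code": "CodeCompressor",
    "log": "LogCompressor",
    "search": "SearchCompressor",
    "text": "SmartCrusher",
}


def detect_content_type(text):
    """Classify text as 'log', 'search', 'code', or 'text' from its line shapes."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return "text"

    def share(pattern):
        # Fraction of non-empty lines that match the pattern.
        return sum(1 for line in lines if pattern.search(line)) / len(lines)

    if share(LOG_LINE) >= 0.6:
        return "log"
    if share(GREP_LINE) >= 0.6:
        return "search"
    if share(CODE_LINE) >= 0.2:
        return "code"
    return "text"


def pick_compressor(text):
    """Return the name of the compressor a router could dispatch to."""
    return COMPRESSOR_BY_TYPE[detect_content_type(text)]


if __name__ == "__main__":
    print(pick_compressor("2024-01-01 12:00:00 INFO started\n2024-01-01 12:00:01 ERROR failed"))
    print(pick_compressor("src/app.py:42:def main():\nsrc/util.py:7:import os"))
```

Logs are checked before grep output because a bare timestamp such as 12:00:00 would otherwise look like a path:line: prefix.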
Enabling and Disabling Compressors via Environment Variables
While the policy controls aggressiveness, you toggle individual compressors at runtime using environment variables or CLI flags documented in [wiki/proxy.md](https://github.com/chopratejas/headroom/blob/main/wiki/proxy.md):
| Flag / Environment Variable | Effect |
|---|---|
| --code-aware / HEADROOM_CODE_AWARE_ENABLED=true | Enable AST-based code compression (default true). |
| --no-code-aware | Explicitly disable code compression for pure-text workloads. |
| --enable-search-compression / HEADROOM_SEARCH_COMPRESSION_ENABLED=true | Enable the SearchCompressor for grep and search results. |
| --enable-log-compression / HEADROOM_LOG_COMPRESSION_ENABLED=true | Enable the LogCompressor for log file compression. |
| --no-compress-first | Skip the "try deeper compression first" heuristic. |
| HEADROOM_PROXY_AUTH_MODE_POLICY_ENFORCEMENT=off | Force PAYG-equivalent policy for every request, overriding auth-mode-specific settings. |
To run a proxy that compresses only code and logs while leaving search output untouched:
```shell
HEADROOM_CODE_AWARE_ENABLED=true \
HEADROOM_LOG_COMPRESSION_ENABLED=true \
HEADROOM_SEARCH_COMPRESSION_ENABLED=false \
HEADROOM_PROXY_AUTH_MODE_POLICY_ENFORCEMENT=enabled \
headroom-proxy --mode token
```
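For intuition about how such toggles resolve, the sketch below parses the three compressor variables from an environment mapping. It is an assumption-laden illustration, not the proxy's parser: the helper names `env_flag` and `enabled_compressors`, the set of accepted truthy strings, and the "off by default" choice for log and search compression are all hypothetical (only the code-aware default of true is stated in the table above).

```python
import os

# Assumed set of truthy spellings; the proxy may accept a different set.
TRUTHY = {"1", "true", "yes", "on"}


def env_flag(name, default, env=None):
    """Read a boolean toggle from env, falling back to default when unset or blank."""
    env = os.environ if env is None else env
    raw = env.get(name)
    if raw is None or raw.strip() == "":
        return default
    return raw.strip().lower() in TRUTHY


def enabled_compressors(env=None):
    """Return which compressors the given environment switches on."""
    return {
        "code": env_flag("HEADROOM_CODE_AWARE_ENABLED", True, env),
        "log": env_flag("HEADROOM_LOG_COMPRESSION_ENABLED", False, env),       # default assumed
        "search": env_flag("HEADROOM_SEARCH_COMPRESSION_ENABLED", False, env),  # default assumed
    }


if __name__ == "__main__":
    # Same settings as the shell command above: code and logs on, search off.
    example_env = {
        "HEADROOM_CODE_AWARE_ENABLED": "true",
        "HEADROOM_LOG_COMPRESSION_ENABLED": "true",
        "HEADROOM_SEARCH_COMPRESSION_ENABLED": "false",
    }
    print(enabled_compressors(example_env))
```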
Customizing Compression Policies for Individual Requests
For advanced use cases, you can override the default policy for a single request by constructing a custom CompressionPolicy and injecting it into the ContentRouter. The custom policy applies to every compressor the router dispatches to for that request, so it changes overall aggressiveness rather than setting separate levels for code and logs:
```python
from headroom.transforms.compression_policy import CompressionPolicy
from headroom.transforms.content_router import ContentRouter

# Build a custom, more aggressive policy. Every field applies to all
# compressors in the request, not to a single content type.
custom_policy = CompressionPolicy(
    live_zone_only=False,          # Allow edits outside the live zone (more aggressive)
    cache_aligner_enabled=True,    # Keep the CacheAligner transform on
    volatile_token_threshold=200,  # Raise the cache-stability threshold
    max_lossy_ratio=0.30,          # Lossy compressors may drop up to 30% of tokens
    toin_read_only=False,          # Let TOIN record new observations
)

# Initialize the router with the custom policy
router = ContentRouter(policy=custom_policy)

# Apply compression to the request payload (request_payload is supplied by the caller)
compressed = router.apply(request_payload)
print(f"Compression ratio: {compressed['compression_ratio']}")
```
If you are using the TypeScript SDK, the same logic is exposed via the HTTP endpoint POST /v1/compress. The proxy reads the policy from the request's auth mode (or from environment-controlled defaults) and applies the appropriate compressors automatically.
Summary
- Policy Resolution: The CompressionPolicy dataclass in headroom/transforms/compression_policy.py defines per-auth-mode constraints that all compressors must respect.
- Content Routing: The ContentRouter in headroom/transforms/content_router.py inspects payload structure to dispatch to CodeCompressor, LogCompressor, SearchCompressor, or SmartCrusher as appropriate.
- Runtime Control: Use environment variables like HEADROOM_CODE_AWARE_ENABLED or CLI flags like --enable-log-compression to toggle specific compressors without changing code.
- Request-Level Override: Instantiate a custom CompressionPolicy and pass it directly to ContentRouter for dynamic, per-request customization.
Frequently Asked Questions
What is the difference between PAYG and Subscription compression policies?
PAYG (Pay-As-You-Go) policies allow a higher max_lossy_ratio (up to 0.45) and a higher volatile_token_threshold (128 tokens), prioritizing maximum token reduction. Subscription policies are more conservative: they cap max_lossy_ratio at 0.25, lower volatile_token_threshold to 32 tokens, and set live_zone_only to True to protect cached contexts from modification.
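As a worked example of what the two caps mean in tokens, the sketch below applies the max_lossy_ratio definition given earlier (an upper bound on dropped tokens as a fraction of the original). The helper name `max_droppable_tokens` is hypothetical, and rounding down is an assumption about how a fractional bound would be enforced.

```python
def max_droppable_tokens(original_tokens, max_lossy_ratio):
    """Upper bound on tokens a lossy compressor may drop, rounded down (assumed)."""
    if original_tokens < 0 or not 0.0 <= max_lossy_ratio <= 1.0:
        raise ValueError("original_tokens must be >= 0 and ratio within [0, 1]")
    return int(original_tokens * max_lossy_ratio)


if __name__ == "__main__":
    # For a 1,000-token payload: PAYG may drop up to 450 tokens, Subscription up to 250.
    print(max_droppable_tokens(1000, 0.45))
    print(max_droppable_tokens(1000, 0.25))
```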
How do I completely disable code-aware compression for a specific workload?
Set the environment variable HEADROOM_CODE_AWARE_ENABLED=false or pass the --no-code-aware flag when starting the proxy. This forces the pipeline to treat all content as plain text, bypassing the AST-based CodeCompressor entirely.
Can I apply different compression ratios for logs versus code in the same request?
The max_lossy_ratio in CompressionPolicy applies globally to all lossy compressors in a single request. To achieve different ratios for different content types, you must split the request into separate calls with different policy instances, or modify the source code in headroom/transforms/content_router.py to accept per-content-type thresholds.
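The "separate calls with different policy instances" approach can be sketched with `dataclasses.replace`, which copies a dataclass instance while overriding selected fields. The `CompressionPolicy` class below is a local stand-in with the fields described in this article, and `PER_TYPE_RATIO`, `BASE_POLICY`, and `policy_for` are hypothetical names; the base values are illustrative, with the ones not stated in this article marked as assumed.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CompressionPolicy:
    # Local stand-in; import the real class from headroom in actual use.
    live_zone_only: bool
    cache_aligner_enabled: bool
    volatile_token_threshold: int
    max_lossy_ratio: float
    toin_read_only: bool


BASE_POLICY = CompressionPolicy(
    live_zone_only=True,
    cache_aligner_enabled=False,
    volatile_token_threshold=32,
    max_lossy_ratio=0.25,
    toin_read_only=True,  # assumed
)

# Hypothetical per-content-type targets chosen by the caller.
PER_TYPE_RATIO = {"code": 0.10, "log": 0.25}


def policy_for(content_type, base=BASE_POLICY):
    """Derive a policy for one content type without exceeding the base cap."""
    wanted = PER_TYPE_RATIO.get(content_type, base.max_lossy_ratio)
    return replace(base, max_lossy_ratio=min(wanted, base.max_lossy_ratio))


if __name__ == "__main__":
    # One policy instance per call, as the answer above suggests.
    for kind in ("code", "log", "text"):
        print(kind, policy_for(kind).max_lossy_ratio)
```

Clamping to the base ratio keeps a per-type override from loosening the limit that the auth mode would otherwise impose.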
Where does Headroom detect the content type of incoming messages?
Content type detection happens in [headroom/transforms/content_router.py](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/content_router.py), which analyzes the shape and structure of the payload rather than relying on MIME types or file extensions. The router then selects the appropriate compressor from [headroom/transforms/compression_units.py](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/compression_units.py) based on detected patterns.