How to Debug Compression Issues with Headroom's Observability and Logging

Enable DEBUG logging on the headroom logger, ensure every transform receives a CompressionObserver, and cross-reference structured log messages with per-strategy Prometheus counters to isolate missing or failed compression steps.

Headroom is an open-source LLM proxy and compression toolkit that instruments every compression step with structured logs and mandatory observer callbacks. To debug compression issues with Headroom's observability and logging, you need to correlate Python logging output from transforms like SmartCrusher with metrics recorded by the CompressionObserver interface. The source code in chopratejas/headroom makes this straightforward by emitting detailed token counts and enforcing observer injection at the transform level.

How Headroom's Compression Observability Works

Headroom instruments every compression step through two complementary mechanisms: a Python logging.Logger named "headroom" (or "headroom.proxy" for the proxy) and a lightweight CompressionObserver interface. After the ContentRouter selects a transform and the Rust-backed compressor in headroom/_core finishes processing, the transform invokes the observer and emits a structured log message simultaneously. This dual emission means you can verify behavior in logs and independently confirm it in metrics.

The Mandatory CompressionObserver Protocol

The CompressionObserver protocol is defined in headroom/transforms/observability.py at lines 39-78. It expects a single callback per real compression event, and the codebase explicitly forbids a no-op fallback (see the comment in observability.py at lines 18-22). The default implementation in headroom/proxy/prometheus_metrics.py increments a per-strategy counter via record_compression(), which instantly surfaces regressions such as a missing SmartCrusher event.
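
To make the contract concrete, here is a minimal sketch of an observer with this shape. Only the record_compression() name comes from the text above. Its parameters (strategy, tokens_before, tokens_after) and the CompressionObserverSketch and CountingObserver classes are assumptions for illustration and may not match the actual signature in observability.py.

```python
from collections import Counter
from typing import Protocol


class CompressionObserverSketch(Protocol):
    """Hypothetical stand-in for the real protocol; the parameters are assumed."""

    def record_compression(
        self, strategy: str, tokens_before: int, tokens_after: int
    ) -> None: ...


class CountingObserver:
    """In-memory observer that counts events per strategy, for local debugging."""

    def __init__(self):
        self.events = Counter()        # number of compression events per strategy
        self.tokens_saved = Counter()  # cumulative tokens removed per strategy

    def record_compression(self, strategy, tokens_before, tokens_after):
        self.events[strategy] += 1
        self.tokens_saved[strategy] += tokens_before - tokens_after


observer = CountingObserver()
observer.record_compression("smart_crusher", tokens_before=1200, tokens_after=400)
observer.record_compression("smart_crusher", tokens_before=300, tokens_after=250)
print(observer.events["smart_crusher"], observer.tokens_saved["smart_crusher"])
```

An in-memory observer like this can be handy in unit tests, where a strategy with a count of zero points at the same wiring problem a flat Prometheus counter would.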

Per-Transform Structured Logging

Each transform emits detailed DEBUG/INFO messages through a module-level logging.Logger. For example, headroom/transforms/smart_crusher.py at line 46 creates logger = logging.getLogger(__name__), and headroom/transforms/log_compressor.py at line 45 does the same. These loggers report which transform ran, the token counts before and after, and any errors encountered during the Rust-backed compression pass.
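
Because these loggers are created with logging.getLogger(__name__), their names sit under the "headroom" namespace, so standard Python logger inheritance applies. The sketch below uses only the standard library and assumes the module name resolves to "headroom.transforms.smart_crusher", which follows from the file path but is not confirmed here.

```python
import logging

# Setting the level on the parent "headroom" logger is enough: child loggers
# inherit the effective level unless they set one of their own.
logging.getLogger("headroom").setLevel(logging.DEBUG)

child = logging.getLogger("headroom.transforms.smart_crusher")
proxy = logging.getLogger("headroom.proxy")

print(child.getEffectiveLevel() == logging.DEBUG)
print(proxy.isEnabledFor(logging.DEBUG))
```

A handler (for example, one installed by logging.basicConfig) is still needed for the records to be written anywhere.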

Debug Compression Issues Using Headroom Logs and Metrics

  1. Enable full logging. Set the root logger or the "headroom" logger to DEBUG. The default logging bootstrap lives in headroom/proxy/server.py at line 183.

  2. Verify that each transform logs. Look for messages like "Compressing with smart_crusher" that display the original and compressed token counts. Each transform creates its own logger via logger = logging.getLogger(__name__), as seen at the top of headroom/transforms/smart_crusher.py at line 46.

  3. Check Prometheus counters. Query the headroom_compression_total metric to see per-strategy counts. The observer implementation lives in headroom/proxy/prometheus_metrics.py, where record_compression increments the counter.

  4. Use the SDK simulate helper. Run client.chat.completions.simulate(...) and inspect plan.tokens_before, plan.tokens_after, and plan.transforms. The simulation code is exercised in the troubleshooting guide at wiki/troubleshooting.md.

  5. Pull stored metrics. Call client.get_metrics(...) to retrieve a time-series of token savings and compression events. The metric API is defined in headroom/client.py.

  6. Correlate logs with observer data. Match log timestamps with Prometheus counter increments. A log entry with no matching increment means the observer was not wired, which commonly happens when a custom transform forgets to pass observer (the CompressionObserver contract is documented in headroom/transforms/observability.py at lines 44-57). In the extreme case, where Prometheus shows zero compression events but the logs show that a compressor executed, the transform was likely instantiated without an observer (see the note in headroom/transforms/smart_crusher.py at lines 175-180). Passing observer=get_otel_metrics() or the default Prometheus observer fixes this.
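
The correlation in step 6 can be sketched with the standard library alone. The handler below counts log records that mention a strategy name and compares them with per-strategy observer counts. StrategyLogCounter, unwired_strategies, the "log_compressor" message text, and the observer_counts dictionary are all hypothetical; in practice the counts would come from the headroom_compression_total metric.

```python
import logging
from collections import Counter


class StrategyLogCounter(logging.Handler):
    """Counts log records whose message mentions a known strategy name."""

    def __init__(self, strategies):
        super().__init__(level=logging.DEBUG)
        self.strategies = tuple(strategies)
        self.seen = Counter()

    def emit(self, record):
        message = record.getMessage()
        for name in self.strategies:
            if name in message:
                self.seen[name] += 1


def unwired_strategies(log_counts, observer_counts):
    """Strategies that logged work but recorded no observer events."""
    return sorted(
        s for s, n in log_counts.items()
        if n > 0 and observer_counts.get(s, 0) == 0
    )


handler = StrategyLogCounter(["smart_crusher", "log_compressor"])
logger = logging.getLogger("headroom")
logger.setLevel(logging.DEBUG)
logger.addHandler(handler)

# Simulated activity: both strategies log, but only one reaches the observer.
logging.getLogger("headroom.transforms.smart_crusher").debug("Compressing with smart_crusher")
logging.getLogger("headroom.transforms.log_compressor").debug("Compressing with log_compressor")
observer_counts = {"smart_crusher": 1}

print(unwired_strategies(handler.seen, observer_counts))
```

Any strategy this check reports is a candidate for a transform that was constructed without an observer.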

Essential Code Examples for Headroom Observability

Enable Debug Logging

import logging

logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s %(name)s %(levelname)s %(message)s",
)

# Or target only Headroom logs:
logging.getLogger("headroom").setLevel(logging.DEBUG)

Attach an Observer to a Smart Crusher Instance

from headroom.transforms.smart_crusher import SmartCrusher, SmartCrusherConfig
from headroom.observability import get_otel_metrics  # or Prometheus observer

crusher = SmartCrusher(
    config=SmartCrusherConfig(),
    observer=get_otel_metrics(),  # ensures metrics are emitted
)

Simulate a Compression Pass

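# Assumes `client` is an already-configured Headroom SDK client.
plan = client.chat.completions.simulate(
    model="gpt-4o",
    messages=[{"role": "user", "content": "...."}],
)

# Token counts before and after the simulated compression pass.
print(f"Tokens: {plan.tokens_before} -> {plan.tokens_after}")
print("Transforms applied:", plan.transforms)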

Retrieve Recent Metrics

from datetime import datetime, timedelta

metrics = client.get_metrics(
    start_time=datetime.utcnow() - timedelta(hours=1),
    limit=20,
)

for m in metrics:
    print(f"{m.timestamp}: {m.tokens_input_before} -> {m.tokens_input_after}")
    print("  Compression events:", m.compression_events)
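
Once the rows are retrieved, a single savings figure is often easier to track than raw counts. The sketch below computes the fraction of input tokens removed. MetricRow is a hypothetical stand-in for the objects returned by client.get_metrics() and reuses only the tokens_input_before and tokens_input_after field names shown above. savings_ratio is likewise an illustrative helper, not part of the SDK.

```python
from dataclasses import dataclass


@dataclass
class MetricRow:
    """Hypothetical stand-in for one record returned by client.get_metrics()."""
    tokens_input_before: int
    tokens_input_after: int


def savings_ratio(rows):
    """Fraction of input tokens removed across all rows (0.0 if there is no input)."""
    before = sum(r.tokens_input_before for r in rows)
    after = sum(r.tokens_input_after for r in rows)
    return 0.0 if before == 0 else (before - after) / before


rows = [MetricRow(1000, 400), MetricRow(500, 500)]
print(f"{savings_ratio(rows):.1%}")
```

A ratio near zero over a window where compression was expected is a cue to go back to the logs and observer counts for that period.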

Summary

  • Debug compression issues with Headroom's observability and logging by correlating structured DEBUG logs from the "headroom" logger with per-strategy Prometheus counters emitted through the CompressionObserver protocol.
  • Always verify that transforms like SmartCrusher receive a valid observer argument, because the codebase forbids no-op fallbacks and missing observers are the most common source of silent metric regressions.
  • Use the client.chat.completions.simulate() helper to inspect token deltas and transform plans locally before deploying changes.
  • Query historical data via client.get_metrics() to identify trends in token savings and compression event frequency.

Frequently Asked Questions

What does the CompressionObserver protocol do in Headroom?

The CompressionObserver protocol in headroom/transforms/observability.py defines a single callback, record_compression(), that receives one notification per real compression event. The default implementation increments a Prometheus counter, giving operators a real-time signal of which strategies are active and whether any step has been bypassed.

Why are my Prometheus counters missing even though compression logs appear?

This mismatch typically means the transform was instantiated without an observer. The source code in headroom/transforms/smart_crusher.py at lines 175-180 expects an observer at initialization, and headroom/transforms/observability.py at lines 18-22 explicitly forbids no-op fallbacks. Pass observer=get_otel_metrics() or the default Prometheus observer when building the transform.

How do I enable DEBUG logging only for Headroom modules?

Call logging.getLogger("headroom").setLevel(logging.DEBUG) instead of changing the global logging configuration. This targets the "headroom" logger and the "headroom.proxy" logger used by the proxy server, limiting noise while surfacing per-transform token counts and compression details.

Can I debug compression behavior without running the full proxy?

Yes, use the SDK's client.chat.completions.simulate(...) method to run a compression pass locally and inspect plan.tokens_before, plan.tokens_after, and plan.transforms. This approach is documented in the troubleshooting guide at wiki/troubleshooting.md and allows you to validate observer wiring and token deltas before deployment.
