How to Debug Compression Issues with Headroom's Observability and Logging
Enable DEBUG logging on the headroom logger, ensure every transform receives a CompressionObserver, and cross-reference structured log messages with per-strategy Prometheus counters to isolate missing or failed compression steps.
Headroom is an open-source LLM proxy and compression toolkit that instruments every compression step with structured logs and mandatory observer callbacks. To debug compression issues with Headroom's observability and logging, you need to correlate Python logging output from transforms like SmartCrusher with metrics recorded by the CompressionObserver interface. The source code in chopratejas/headroom makes this straightforward by emitting detailed token counts and enforcing observer injection at the transform level.
How Headroom's Compression Observability Works
Headroom instruments every compression step through two complementary mechanisms: a Python logging.Logger named "headroom" (or "headroom.proxy" for the proxy) and a lightweight CompressionObserver interface. After the ContentRouter selects a transform and the Rust-backed compressor in headroom/_core finishes processing, the transform invokes the observer and emits a structured log message simultaneously. This dual emission means you can verify behavior in logs and independently confirm it in metrics.
The Mandatory CompressionObserver Protocol
The CompressionObserver protocol is defined in headroom/transforms/observability.py at lines 39-78. It expects a single callback per real compression event, and the codebase explicitly forbids a no-op fallback (see the comment in observability.py at lines 18-22). The default implementation in headroom/proxy/prometheus_metrics.py increments a per-strategy counter via record_compression(), which instantly surfaces regressions such as a missing SmartCrusher event.
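To make the contract concrete, here is a minimal, self-contained sketch of an observer protocol and a transform that refuses to run without one. The record_compression name comes from the text above; its parameters (strategy, tokens_before, tokens_after), the CountingObserver class, and ToyTransform are hypothetical stand-ins, not Headroom's actual signatures.

```python
from collections import Counter
from typing import Optional, Protocol


class CompressionObserver(Protocol):
    """Illustrative protocol; the parameter list is an assumption."""

    def record_compression(self, strategy: str, tokens_before: int, tokens_after: int) -> None: ...


class CountingObserver:
    """Hypothetical in-memory observer that counts events per strategy."""

    def __init__(self) -> None:
        self.events = Counter()

    def record_compression(self, strategy: str, tokens_before: int, tokens_after: int) -> None:
        self.events[strategy] += 1


class ToyTransform:
    """Hypothetical transform that fails loudly instead of using a no-op observer."""

    def __init__(self, observer: Optional[CompressionObserver]) -> None:
        if observer is None:
            raise ValueError("observer is required; no no-op fallback")
        self._observer = observer

    def apply(self, text: str) -> str:
        compressed = " ".join(text.split())  # stand-in for real compression
        # Crude whitespace "token" counts, for illustration only.
        self._observer.record_compression("toy", len(text.split()), len(compressed.split()))
        return compressed


observer = CountingObserver()
ToyTransform(observer).apply("a   b    c")
print(observer.events["toy"])
```

Raising at construction time is one way to honor a "no no-op fallback" rule: a missing observer becomes an immediate error rather than a silent gap in the counters.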
Per-Transform Structured Logging
Each transform emits detailed DEBUG/INFO messages through a module-level logging.Logger. For example, headroom/transforms/smart_crusher.py at line 46 creates logger = logging.getLogger(__name__), and headroom/transforms/log_compressor.py at line 45 does the same. These loggers report which transform ran, the token counts before and after, and any errors encountered during the Rust-backed compression pass.
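Because these loggers are created with logging.getLogger(__name__), their names follow the module path and the standard logging hierarchy applies. The sketch below uses only the standard library; the logger name headroom.transforms.smart_crusher is inferred from the file path above, and the message text is made up for illustration.

```python
import io
import logging

# Attach a handler to the parent "headroom" logger and capture output in memory.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))

parent = logging.getLogger("headroom")
parent.setLevel(logging.DEBUG)
parent.addHandler(handler)

# A module-level logger, as getLogger(__name__) would produce inside the package.
child = logging.getLogger("headroom.transforms.smart_crusher")

# The child has no level of its own, so it inherits DEBUG from "headroom",
# and its records propagate up to the parent's handler.
child.debug("tokens_before=%d tokens_after=%d", 1200, 480)  # made-up message

print(stream.getvalue().strip())
```

The same mechanism is why a single setLevel call on "headroom" is enough to surface DEBUG output from every transform module beneath it.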
Debug Compression Issues Using Headroom Logs and Metrics
- Enable full logging. Set the root logger or the "headroom" logger to DEBUG. The default logging bootstrap lives in headroom/proxy/server.py at line 183.
- Verify that each transform logs. Look for messages like "Compressing with smart_crusher" that display the original and compressed token counts. Each transform creates its own logger via logger = logging.getLogger(__name__), as seen at the top of headroom/transforms/smart_crusher.py at line 46.
- Check Prometheus counters. Query the headroom_compression_total metric to see per-strategy counts. The observer implementation lives in headroom/proxy/prometheus_metrics.py, where record_compression increments the counter.
- Use the SDK simulate helper. Run client.chat.completions.simulate(...) and inspect plan.tokens_before, plan.tokens_after, and plan.transforms. The simulation code is exercised in the troubleshooting guide at wiki/troubleshooting.md.
- Pull stored metrics. Call client.get_metrics(...) to retrieve a time-series of token savings and compression events. The metric API is defined in headroom/client.py.
- Correlate logs with observer data. Match log timestamps with Prometheus counter increments. A missing increment despite a log entry means the observer was not wired, which commonly happens when a custom transform forgets to pass observer. The CompressionObserver contract is documented in headroom/transforms/observability.py at lines 44-57.
If you see zero compression events in Prometheus but the logs show that a compressor executed, the transform was likely instantiated without an observer (see the note in headroom/transforms/smart_crusher.py at lines 175-180). Passing observer=get_otel_metrics() or the default Prometheus observer fixes this.
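The log-versus-counter correlation described above can be sketched as a small script that compares per-strategy log counts against a Prometheus scrape. The metric name headroom_compression_total comes from the steps above; the strategy label name, the sample scrape text, and the log-count dictionary are assumptions for illustration, so adjust them to match what your deployment actually exposes.

```python
import re
from collections import Counter

# Matches lines such as: headroom_compression_total{strategy="smart_crusher"} 3.0
# The label name "strategy" is an assumption.
SAMPLE_LINE = re.compile(
    r'^headroom_compression_total\{[^}]*strategy="(?P<strategy>[^"]+)"[^}]*\}\s+(?P<value>\S+)'
)


def parse_compression_counts(exposition_text: str) -> Counter:
    """Sum counter samples per strategy from Prometheus text-format output."""
    counts = Counter()
    for line in exposition_text.splitlines():
        match = SAMPLE_LINE.match(line.strip())
        if match:
            counts[match.group("strategy")] += float(match.group("value"))
    return counts


def find_unobserved(log_counts: dict, metric_counts: Counter) -> list:
    """Return strategies that logged more compressions than the counter recorded."""
    return sorted(s for s, n in log_counts.items() if metric_counts.get(s, 0) < n)


scrape = '''
# TYPE headroom_compression_total counter
headroom_compression_total{strategy="log_compressor"} 4.0
'''
log_counts = {"smart_crusher": 2, "log_compressor": 4}  # tallied by hand from DEBUG logs
missing = find_unobserved(log_counts, parse_compression_counts(scrape))
print(missing)
```

Any strategy the script reports has log entries without matching counter increments, which points to a transform constructed without an observer.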
Essential Code Examples for Headroom Observability
Enable Debug Logging
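import logging
# Option 1: DEBUG for everything (noisy; includes third-party libraries).
logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s %(name)s %(levelname)s %(message)s",
)
# Option 2: target only Headroom logs. A handler must still be attached
# (for example by calling basicConfig with level=logging.INFO above),
# otherwise DEBUG records have nowhere to go and are dropped.
logging.getLogger("headroom").setLevel(logging.DEBUG)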
Attach an Observer to a Smart Crusher Instance
from headroom.transforms.smart_crusher import SmartCrusher, SmartCrusherConfig
from headroom.observability import get_otel_metrics # or Prometheus observer
crusher = SmartCrusher(
config=SmartCrusherConfig(),
observer=get_otel_metrics() # ensures metrics are emitted
)
Simulate a Compression Pass
# Assumes `client` is an already-configured Headroom SDK client.
plan = client.chat.completions.simulate(
model="gpt-4o",
messages=[{"role": "user", "content": "...."}],
)
print(f"Tokens: {plan.tokens_before} → {plan.tokens_after}")
print("Transforms applied:", plan.transforms)
Retrieve Recent Metrics
from datetime import datetime, timedelta
# Assumes `client` is an already-configured Headroom SDK client.
metrics = client.get_metrics(
start_time=datetime.utcnow() - timedelta(hours=1),
limit=20,
)
for m in metrics:
print(f"{m.timestamp}: {m.tokens_input_before} → {m.tokens_input_after}")
print(" Compression events:", m.compression_events)
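Once records are retrieved, a small helper can turn them into a single savings figure. The sketch below uses a hypothetical MetricRecord dataclass that mirrors only the tokens_input_before and tokens_input_after fields used above; the real objects returned by client.get_metrics() may carry different or additional attributes.

```python
from dataclasses import dataclass


@dataclass
class MetricRecord:
    """Hypothetical stand-in for one item returned by client.get_metrics()."""
    tokens_input_before: int
    tokens_input_after: int


def savings_ratio(records) -> float:
    """Fraction of input tokens removed across all records (0.0 if no tokens)."""
    before = sum(r.tokens_input_before for r in records)
    after = sum(r.tokens_input_after for r in records)
    return 0.0 if before == 0 else (before - after) / before


sample = [MetricRecord(1000, 400), MetricRecord(500, 500)]
print(f"{savings_ratio(sample):.1%}")
```

A record whose before and after counts are equal, like the second sample, indicates a request where no compression took effect, which is worth checking against the logs for that timestamp.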
Summary
- Debug compression issues with Headroom's observability and logging by correlating structured DEBUG logs from the "headroom" logger with per-strategy Prometheus counters emitted through the CompressionObserver protocol.
- Always verify that transforms like SmartCrusher receive a valid observer argument, because the codebase forbids no-op fallbacks and missing observers are the most common source of silent metric regressions.
- Use the client.chat.completions.simulate() helper to inspect token deltas and transform plans locally before deploying changes.
- Query historical data via client.get_metrics() to identify trends in token savings and compression event frequency.
Frequently Asked Questions
What does the CompressionObserver protocol do in Headroom?
The CompressionObserver protocol in headroom/transforms/observability.py defines a single callback, record_compression(), that receives one notification per real compression event. The default implementation increments a Prometheus counter, giving operators a real-time signal of which strategies are active and whether any step has been bypassed.
Why are my Prometheus counters missing even though compression logs appear?
This mismatch typically means the transform was instantiated without an observer. The source code in headroom/transforms/smart_crusher.py at lines 175-180 expects an observer at initialization, and headroom/transforms/observability.py at lines 18-22 explicitly forbids no-op fallbacks. Pass observer=get_otel_metrics() or the default Prometheus observer when building the transform.
How do I enable DEBUG logging only for Headroom modules?
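Call logging.getLogger("headroom").setLevel(logging.DEBUG) instead of raising the root logger's level. Because Python loggers are hierarchical, the setting also applies to child loggers such as "headroom.proxy" (used by the proxy server) and the per-transform module loggers, unless one of them sets its own level. A handler must still be configured, for example via logging.basicConfig(), or the DEBUG records will not be displayed. This limits noise from other libraries while surfacing per-transform token counts and compression details.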
Can I debug compression behavior without running the full proxy?
Yes, use the SDK's client.chat.completions.simulate(...) method to run a compression pass locally and inspect plan.tokens_before, plan.tokens_after, and plan.transforms. This approach is documented in the troubleshooting guide at wiki/troubleshooting.md and allows you to validate observer wiring and token deltas before deployment.