How to Debug Compression Issues in Headroom and View Saved Tokens
Enable CCRConfig with ccr_enabled=True and ccr_inject_marker=True, then use get_compression_store().get_stats() to inspect token savings and retrieve(hash) to verify cached entries in the Python CompressionStore.
Headroom's compression pipeline optimizes token usage for LLM interactions through a multi-layered architecture involving content routing, strategy-specific compression, and reversible caching. When compression fails to deliver expected savings or CCR markers like <<ccr:abc123…>> cannot be retrieved, developers need systematic debugging approaches. This guide walks through diagnosing compression issues in the chopratejas/headroom repository using actual source file paths and runtime inspection techniques.
Understanding Headroom's Compression Pipeline
Transform Selection via ContentRouter
The process begins in headroom/transforms/content_router.py where the ContentRouter class analyzes raw output and routes it to the appropriate compressor. The routing decision is recorded in a RouterCompressionResult containing the original text, compressed text, chosen CompressionStrategy, and a detailed routing_log.
Compression Execution and Result Types
Each compressor returns specific dataclasses containing metrics. The SmartCrusher (for JSON arrays) and other compressors return objects like CrushResult or CompressionResult that include original_tokens, compressed_tokens, and compression_ratio. For JSON data, SmartCrusher in headroom/transforms/smart_crusher.py also emits CCR markers when configured.
The CCR Store Architecture
The CompressionStore in headroom/cache/compression_store.py provides in-process caching for dropped or substituted data. It tracks original and compressed token counts, retrieval statistics, and TTL (default 5 minutes) per entry. The store serves as the backbone for Headroom's "Compress-Cache-Retrieve" (CCR) pattern.
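To make the TTL behavior concrete, here is a minimal sketch of a compress-cache-retrieve store with per-entry expiry. It is a toy model, not Headroom's `CompressionStore`: the `ToyEntry` and `ToyTTLStore` names, the `put` method, and the injectable clock are all hypothetical. Only the tracked fields (original and compressed token counts, retrieval count) and the 5-minute default TTL come from the description above.

```python
import time
from dataclasses import dataclass


@dataclass
class ToyEntry:
    """One cached original plus the bookkeeping described in the article."""
    original: str
    original_tokens: int
    compressed_tokens: int
    created_at: float
    retrieval_count: int = 0


class ToyTTLStore:
    """Illustrative stand-in; NOT Headroom's CompressionStore."""

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self._clock = clock  # injectable so expiry can be exercised without sleeping
        self._entries = {}

    def put(self, key, original, original_tokens, compressed_tokens):
        self._entries[key] = ToyEntry(
            original, original_tokens, compressed_tokens, self._clock()
        )

    def retrieve(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        # An entry older than the TTL is dropped and reported as missing.
        if self._clock() - entry.created_at > self.ttl_seconds:
            del self._entries[key]
            return None
        entry.retrieval_count += 1
        return entry


# Drive the store with a fake clock to simulate a long-running session.
now = [0.0]
toy_store = ToyTTLStore(ttl_seconds=300.0, clock=lambda: now[0])
toy_store.put("abc123", "[...full JSON...]", original_tokens=1200, compressed_tokens=300)

fresh = toy_store.retrieve("abc123")    # inside the TTL window: entry is returned
now[0] = 301.0                          # just over five minutes later
expired = toy_store.retrieve("abc123")  # past the TTL: None
```

In this model the second lookup fails even though the hash was valid when the marker was emitted, which is the same symptom a long-running session would show.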
Why Debugging Often Fails
Several architectural behaviors commonly mask compression issues:
- Silent Fallbacks: The ContentRouter may fall back to less aggressive compressors (e.g., Kompress after SmartCrusher) when compression ratios are too low, bypassing the CCR store.
- Missing Markers: CCR markers only appear when CCRConfig has ccr_enabled=True and ccr_inject_marker=True enabled.
- TTL Eviction: Entries disappear after 5 minutes or via LRU logic, causing "404" retrieval errors in long-running sessions.
How to View Saved Tokens and Cache Statistics
Access the singleton store and inspect aggregate or entry-specific metrics:
- Obtain the store instance:

      from headroom.cache.compression_store import get_compression_store
      store = get_compression_store()

- Inspect aggregate statistics:

      stats = store.get_stats()
      print(f"Original tokens cached: {stats['total_original_tokens']:,}")
      print(f"Compressed tokens cached: {stats['total_compressed_tokens']:,}")
      print(f"Cache entries: {stats['entry_count']}")

- Inspect a specific entry using the CCR hash from <<ccr:HASH …>> markers:

      entry = store.retrieve('abc123')
      if entry:
          print(f"Original content size (tokens): {entry.original_tokens}")
          print(f"Compressed content size (tokens): {entry.compressed_tokens}")
          print(f"Retrieval count: {entry.retrieval_count}")

- List recent retrieval events to confirm the LLM requested dropped data:

      recent = store.get_retrieval_events(limit=5)
      for ev in recent:
          print(f"[{ev.retrieval_type}] hash={ev.hash[:8]} tokens={ev.items_retrieved}/{ev.total_items}")

- Clear the store (test environments) to eliminate stale TTLs:

      store.clear()
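The raw counters are easier to read as a saved-token count and a percentage. The sketch below assumes only that the stats dictionary carries the `total_original_tokens` and `total_compressed_tokens` keys used in the steps above. The `summarize_savings` helper is hypothetical, and the sample numbers are made up for illustration rather than measured.

```python
def summarize_savings(stats):
    """Turn aggregate token counters into a saved-token count and percentage."""
    original = stats.get("total_original_tokens", 0)
    compressed = stats.get("total_compressed_tokens", 0)
    saved = original - compressed
    # Guard against an empty store, where both counters are zero.
    pct = (saved / original * 100.0) if original else 0.0
    return {"saved_tokens": saved, "saved_pct": round(pct, 1)}


# Made-up sample shaped like the dictionary printed in the steps above.
sample_stats = {
    "total_original_tokens": 12000,
    "total_compressed_tokens": 3000,
    "entry_count": 4,
}
summary = summarize_savings(sample_stats)
print(summary)  # {'saved_tokens': 9000, 'saved_pct': 75.0}
```

Passing the result of `store.get_stats()` instead of `sample_stats` should work as long as the real dictionary exposes those two keys.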
Step-by-Step Debugging Workflow
Follow this sequence to isolate compression failures:
- Enable DEBUG logging to capture routing decisions in ContentRouter.compress via _log_router_debug.
- Verify CCRConfig settings ensure ccr_enabled=True and ccr_inject_marker=True.
- Locate CCR markers like <<ccr:HASH …>> in the compressed output from SmartCrusher._mirror_ccr_to_python_store.
- Retrieve entries using CompressionStore.retrieve(hash) to inspect token counts.
- Check aggregate stats via store.get_stats(); zero values indicate passthrough fallback due to min_compression_ratio_for_ccr not being met.
- Inspect eviction logs in store._eviction_heap or check store._clean_expired if entries are missing.
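For the marker-location step, compressed output may contain more than one marker, so it can help to collect every hash before querying the store. This is a minimal sketch: the `find_ccr_hashes` helper is hypothetical, the regex assumes hashes are lowercase hex as in the `abc123` example, and the text inside the sample markers is invented, since the full marker layout is not specified here.

```python
import re

# Matches the opening of a marker such as <<ccr:abc123 ...>> and captures the hash.
CCR_MARKER = re.compile(r"<<ccr:([0-9a-f]+)")


def find_ccr_hashes(text):
    """Return each distinct CCR hash in the text, in order of first appearance."""
    seen = []
    for ccr_hash in CCR_MARKER.findall(text):
        if ccr_hash not in seen:
            seen.append(ccr_hash)
    return seen


# Invented sample output; only the "<<ccr:HASH" prefix mirrors the article.
sample = (
    "kept 2 of 50 rows <<ccr:abc123 rows dropped>> "
    "then <<ccr:9f00de more>> and again <<ccr:abc123 rows dropped>>"
)
hashes = find_ccr_hashes(sample)
print(hashes)  # ['abc123', '9f00de']
```

Each returned hash can then be passed to `store.retrieve(...)`; an empty list suggests that no marker was injected at all.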
Complete Debugging Code Example
import logging
import re

from headroom.transforms.content_router import ContentRouter
from headroom.cache.compression_store import get_compression_store

# 1️⃣ Enable detailed debug output
logging.basicConfig(level=logging.DEBUG)

router = ContentRouter()
# Fetch the store up front so the aggregate stats in step 3 work even when no marker is found
store = get_compression_store()

# Example payload that triggers SmartCrusher (JSON array)
payload = """
[
  {"id": 1, "name": "Alice", "role": "admin"},
  {"id": 2, "name": "Bob", "role": "user"},
  {"id": 3, "name": "Carol", "role": "user"}
]
"""

# Run the router – it will log the chosen strategy and any CCR markers
result = router.compress(payload, context="list users")
print("Chosen strategy:", result.strategy_used.value)
print("Compressed output:", result.compressed)

# 2️⃣ Extract CCR hash from the output (if any)
match = re.search(r"<<ccr:([0-9a-f]+)", result.compressed)
if match:
    ccr_hash = match.group(1)
    entry = store.retrieve(ccr_hash)
    if entry:
        print("\n--- CCR Entry ---")
        print("Original tokens:", entry.original_tokens)
        print("Compressed tokens:", entry.compressed_tokens)
        print("Retrieval count:", entry.retrieval_count)
    else:
        print("CCR entry not found (might have expired).")
else:
    print("No CCR marker emitted – compression probably fell back to passthrough.")

# 3️⃣ View aggregate stats
stats = store.get_stats()
print("\n--- Store Stats ---")
print(f"Total original tokens cached: {stats['total_original_tokens']:,}")
print(f"Total compressed tokens cached: {stats['total_compressed_tokens']:,}")
print(f"Cache entries: {stats['entry_count']}")
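Calling `logging.basicConfig(level=logging.DEBUG)` turns on debug output for every library in the process, which can bury the routing messages. A narrower option, sketched below with the standard `logging` module only, is to raise the level on a single package logger. It assumes Headroom's modules create their loggers with `logging.getLogger(__name__)`, so that their names start with `headroom`; that naming is an assumption, not something confirmed here, and `enable_package_debug` is a hypothetical helper.

```python
import logging


def enable_package_debug(package="headroom"):
    """Send DEBUG records from one package's loggers to stderr, leaving others untouched."""
    logger = logging.getLogger(package)
    logger.setLevel(logging.DEBUG)
    # Attach a handler only once so repeated calls do not duplicate output.
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    # Stop records from also being printed by any root-level handler.
    logger.propagate = False
    return logger


pkg_logger = enable_package_debug()

# Child loggers inherit the DEBUG level from the package logger.
child = logging.getLogger("headroom.transforms.content_router")
child.debug("routing debug messages would appear here")
```

If the package uses a different logger name, pass it as the `package` argument.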
Key Source Files for Reference
- headroom/transforms/content_router.py – Orchestrates detection and strategy selection
- headroom/transforms/smart_crusher.py – Rust-backed JSON compression
- headroom/cache/compression_store.py – In-process cache with TTL and retrieval logs
- headroom/utils.py – Helper functions for marker creation and hash computation
- wiki/text-compression.md – Architecture overview
- wiki/troubleshooting.md – Common failure diagnostics
Summary
- Enable CCR markers and DEBUG logging to trace routing decisions in ContentRouter.
- Use get_compression_store().get_stats() to verify aggregate token savings across all cached entries.
- Retrieve specific entries by hash to confirm data persistence and view compression_ratio.
- Check for TTL expiration and fallback strategies when entries are missing from the store.
- Reference headroom/transforms/content_router.py, headroom/transforms/smart_crusher.py, and headroom/cache/compression_store.py for implementation details.
Frequently Asked Questions
Why are my CCR markers not appearing in the output?
Markers require explicit configuration in CCRConfig with both ccr_enabled=True and ccr_inject_marker=True. If the ContentRouter falls back to a passthrough strategy due to low compression ratios, no CCR entry is created and markers are omitted.
How do I check if compression actually saved tokens?
Call store.get_stats() on the singleton CompressionStore instance returned by get_compression_store(). The dictionary contains total_original_tokens and total_compressed_tokens showing the aggregate savings across all cached entries.
Why does retrieval return None for a valid CCR hash?
The CompressionStore uses TTL (5-minute default) and LRU eviction, so an entry can expire or be evicted between creation and retrieval, especially in long-running sessions. Check store._eviction_heap or enable detailed retrieval logging to confirm expiration events.
Can I clear the compression cache during testing?
Yes. Call store.clear() on the CompressionStore instance to reset all entries, eviction heaps, and statistics. This is useful when debugging to eliminate stale TTL concerns.