How ContentRouter Routes Different Content Types to Optimal Compressors in Headroom
Headroom's ContentRouter detects the input payload's content type, maps it to a CompressionStrategy, and dispatches it to a purpose-built compressor such as SmartCrusher for JSON or Kompress for plain text, with automatic splitting and reassembly for mixed payloads.
In the chopratejas/headroom repository, the ContentRouter serves as the central dispatcher for the entire compression pipeline. It inspects every incoming payload to determine whether it is plain text, source code, JSON, log data, or a mixture of formats, then sends each segment to the compressor best suited for that structure. The routing logic lives in headroom/transforms/content_router.py and is driven by a configurable strategy map with built-in fallback chains.
Content-Type Detection in ContentRouter
The routing process begins with classification. The ContentRouter first evaluates whether an input contains multiple distinct formats, then runs a dedicated detection pass for homogeneous payloads.
Mixed-Content Testing
The router calls is_mixed_content to check whether the input contains multiple distinct formats. If this test returns True, the router immediately branches to the mixed-content handler instead of attempting single-type classification. You can find this guard logic around line 524 in headroom/transforms/content_router.py.
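To make the idea of a mixed-content guard concrete, here is a minimal sketch that counts distinct format signals in a payload. The function name looks_mixed, the three regexes, and the min_kinds threshold are all hypothetical illustrations; the heuristics that is_mixed_content actually uses are not shown in this article.

```python
import re

# Hypothetical format signals, not Headroom's actual rules.
FENCE_RE = re.compile(r"^`{3}", re.MULTILINE)      # Markdown code fence at line start
JSON_RE = re.compile(r"^\s*[\[{]", re.MULTILINE)   # line that opens a JSON container
PROSE_RE = re.compile(r"^[A-Za-z][^{}\[\]]*[.!?]\s*$", re.MULTILINE)  # sentence-like line


def looks_mixed(text: str, min_kinds: int = 2) -> bool:
    """Return True when at least `min_kinds` distinct format signals are present."""
    signals = (FENCE_RE.search(text), JSON_RE.search(text), PROSE_RE.search(text))
    return sum(1 for s in signals if s) >= min_kinds


fence = "`" * 3  # built at runtime to keep literal fences out of this example
mixed = f"{fence}python\nx = 1\n{fence}\n\nSome text here.\n\n" + '{"a": 1}\n'
print(looks_mixed(mixed))        # fence + prose + JSON signals
print(looks_mixed("[1, 2, 3]"))  # a single JSON signal only
```

A threshold-based check like this is cheap enough to run before any per-type detection, which matches the ordering described above.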
Rust-Based Detection and Regex Fallback
For inputs that are not mixed, the router runs _detect_content at lines 10-30 in headroom/transforms/content_router.py. This method calls the Rust-based headroom._core.detect_content_type for fast classification, and when the detector yields plain text, it applies a regex-based fallback defined in headroom/transforms/content_detector.py. The result is a ContentType enum value that feeds the next stage.
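The two-stage detection described above can be sketched as a fast first pass followed by a regex fallback that only runs when the first pass reports plain text. Everything in this block is an assumption made for illustration: fast_detect stands in for the native detector (it does not call headroom._core), and the ContentType members and regex patterns are hypothetical.

```python
import json
import re
from enum import Enum


class ContentType(Enum):  # hypothetical subset, not the real enum
    PLAIN_TEXT = "plain_text"
    JSON = "json"
    SOURCE_CODE = "source_code"
    LOG = "log"


def fast_detect(text: str) -> ContentType:
    """Stand-in for the native detector: recognises only valid JSON containers."""
    stripped = text.strip()
    if stripped[:1] in ("{", "["):
        try:
            json.loads(stripped)
            return ContentType.JSON
        except ValueError:  # json.JSONDecodeError subclasses ValueError
            pass
    return ContentType.PLAIN_TEXT


# Hypothetical fallback patterns.
LOG_RE = re.compile(r"^\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}", re.MULTILINE)
CODE_RE = re.compile(r"^\s*(def |class |import |from \S+ import )", re.MULTILINE)


def detect(text: str) -> ContentType:
    detected = fast_detect(text)
    if detected is not ContentType.PLAIN_TEXT:
        return detected
    # The regex fallback refines only the "plain text" verdict.
    if LOG_RE.search(text):
        return ContentType.LOG
    if CODE_RE.search(text):
        return ContentType.SOURCE_CODE
    return ContentType.PLAIN_TEXT
```

Running the fallback only on the plain-text verdict keeps the common case fast while still catching formats the first pass does not recognise.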
Strategy Mapping from ContentType to CompressionStrategy
Once a content type is identified, _strategy_from_detection at lines 27-36 translates it into a CompressionStrategy. This mapping is static but can be overridden by router configuration, such as the prefer_code_aware_for_code flag that prioritizes the code-aware compressor over the ML-based Kompress. The mapping table lives in headroom/transforms/content_router.py and directly links each ContentType to its optimal handler.
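A static mapping with a configuration override might look like the sketch below. The enum members, the RouterConfig class, and the default value of prefer_code_aware_for_code are assumptions for illustration; only the flag name and the general idea of a ContentType-to-CompressionStrategy table come from the description above.

```python
from dataclasses import dataclass
from enum import Enum, auto


class ContentType(Enum):  # hypothetical subset
    PLAIN_TEXT = auto()
    SOURCE_CODE = auto()
    JSON = auto()
    LOG = auto()


class Strategy(Enum):  # hypothetical subset
    KOMPRESS = auto()
    CODE_AWARE = auto()
    SMART_CRUSHER = auto()
    LOG = auto()


@dataclass
class RouterConfig:
    prefer_code_aware_for_code: bool = True  # default value is an assumption


STATIC_MAP = {
    ContentType.PLAIN_TEXT: Strategy.KOMPRESS,
    ContentType.SOURCE_CODE: Strategy.CODE_AWARE,
    ContentType.JSON: Strategy.SMART_CRUSHER,
    ContentType.LOG: Strategy.LOG,
}


def strategy_for(content_type: ContentType, config: RouterConfig) -> Strategy:
    # The config override is checked before the static table.
    if content_type is ContentType.SOURCE_CODE and not config.prefer_code_aware_for_code:
        return Strategy.KOMPRESS
    # Unknown types fall back to the general-purpose text compressor.
    return STATIC_MAP.get(content_type, Strategy.KOMPRESS)
```

Keeping the table as plain data and the overrides as a thin function on top makes it easy to see which decisions are fixed and which depend on configuration.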
Routing and Compression Execution
With a strategy selected, the router delegates to one of two execution paths based on whether the payload is pure or mixed.
Pure Content Path
When the strategy is anything other than MIXED, _compress_pure at lines 124-162 takes over. It delegates to _apply_strategy_to_content at lines 173-245, which lazily loads the appropriate compressor: code-aware, SmartCrusher, Search, Log, Diff, HTML extractor, or Kompress. If a compressor is disabled or unavailable, the router falls back gracefully, usually to Kompress, and records the full fallback chain in the strategy_chain field of the result.
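Lazy loading with a recorded fallback chain can be sketched as follows. LazyRegistry, apply_with_fallback, and the "passthrough" marker are hypothetical names invented for this example; they illustrate the pattern rather than reproduce _apply_strategy_to_content.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

Compressor = Callable[[str], str]


class LazyRegistry:
    """Builds each compressor on first use and caches the instance."""

    def __init__(self, factories: Dict[str, Callable[[], Compressor]], enabled: Set[str]):
        self._factories = factories
        self._enabled = enabled
        self._cache: Dict[str, Compressor] = {}

    def get(self, name: str) -> Optional[Compressor]:
        if name not in self._enabled or name not in self._factories:
            return None  # disabled or unknown
        if name not in self._cache:
            try:
                self._cache[name] = self._factories[name]()
            except ImportError:
                return None  # optional dependency missing
        return self._cache[name]


def apply_with_fallback(
    text: str, preferred: str, fallback: str, registry: LazyRegistry
) -> Tuple[str, List[str]]:
    chain: List[str] = []
    # dict.fromkeys removes a duplicate when preferred == fallback.
    for name in dict.fromkeys((preferred, fallback)):
        chain.append(name)
        compressor = registry.get(name)
        if compressor is not None:
            return compressor(text), chain
    return text, chain + ["passthrough"]


def _missing_dependency() -> Compressor:
    raise ImportError("optional compressor not installed")


registry = LazyRegistry(
    factories={"smart_crusher": _missing_dependency, "kompress": lambda: str.strip},
    enabled={"smart_crusher", "kompress"},
)
out, chain = apply_with_fallback("  hi  ", "smart_crusher", "kompress", registry)
print(out, chain)
```

Returning the chain alongside the output means a caller can always tell which compressor actually ran and which ones were skipped.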
Mixed Content Path
If is_mixed_content returns True, _compress_mixed at lines 48-115 splits the payload into typed sections using split_into_sections at lines 44-112. Each section is re-routed independently with _strategy_from_detection_type, compressed, and finally reassembled. The routing decisions for every subsection are stored in a RoutingDecision list that becomes part of the RouterCompressionResult.
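The split, route, and reassemble flow can be illustrated with a small self-contained sketch. The Section class, split_sections, and compress_mixed below are hypothetical simplifications: they recognise only fenced code, JSON, and text, and they do not reflect how split_into_sections is actually implemented.

```python
import json
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

FENCE = "`" * 3  # built at runtime to keep literal fences out of this example
FENCE_RE = re.compile(re.escape(FENCE) + r".*?" + re.escape(FENCE), re.DOTALL)


@dataclass
class Section:
    kind: str  # "code", "json", or "text"
    text: str


def _classify_block(block: str) -> str:
    if block[:1] in ("{", "["):
        try:
            json.loads(block)
            return "json"
        except ValueError:
            pass
    return "text"


def _split_plain(chunk: str) -> List[Section]:
    # Blank lines separate blocks outside of code fences.
    blocks = [b.strip() for b in re.split(r"\n\s*\n", chunk) if b.strip()]
    return [Section(_classify_block(b), b) for b in blocks]


def split_sections(payload: str) -> List[Section]:
    sections: List[Section] = []
    pos = 0
    for match in FENCE_RE.finditer(payload):
        sections.extend(_split_plain(payload[pos:match.start()]))
        sections.append(Section("code", match.group(0)))
        pos = match.end()
    sections.extend(_split_plain(payload[pos:]))
    return sections


def compress_mixed(
    payload: str, compressors: Dict[str, Callable[[str], str]]
) -> Tuple[str, List[str]]:
    decisions, parts = [], []
    for section in split_sections(payload):
        decisions.append(section.kind)  # one routing decision per section
        parts.append(compressors[section.kind](section.text))
    return "\n\n".join(parts), decisions


payload = (
    f'{FENCE}python\ndef hello():\n    print("Hi")\n{FENCE}\n\n'
    "Here is some introductory text.\n\n"
    '{\n  "name": "alice",\n  "age": 30\n}\n'
)
toy_compressors = {
    "code": lambda t: t,
    "text": lambda t: " ".join(t.split()),
    "json": lambda t: json.dumps(json.loads(t), separators=(",", ":")),
}
result, decisions = compress_mixed(payload, toy_compressors)
print(decisions)
```

The toy compressors are placeholders; the point is that each section carries its own routing decision and the outputs are joined in the original order.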
Configuration and Observability
The ContentRouterConfig dataclass at lines 20-84 controls which compressors are enabled, whether to prefer code-aware logic, the mixed-content threshold, and the fallback strategy. The router respects these flags when instantiating compressors.
Every decision is wrapped in a RoutingDecision record at lines 13-22 and emitted to an optional CompressionObserver via _observe. The router also records statistics in a two-tier CompressionCache to skip already-compressed or non-compressible payloads.
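The observer and cache behaviour can be sketched as below. This is an illustration under stated assumptions: Decision, TwoTierCache, and compress_observed are hypothetical names, and reading "two-tier" as one tier for compressed results plus one tier for payloads not worth compressing is a guess, not a description of CompressionCache internals.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Set


@dataclass(frozen=True)
class Decision:
    """Hypothetical stand-in for a routing record."""
    strategy: str
    original_len: int
    compressed_len: int


class TwoTierCache:
    """Tier 1 keeps compressed results; tier 2 remembers payloads to skip."""

    def __init__(self) -> None:
        self.results: Dict[str, str] = {}
        self.skip: Set[str] = set()

    @staticmethod
    def key(text: str) -> str:
        return hashlib.sha256(text.encode("utf-8")).hexdigest()


def compress_observed(
    text: str,
    strategy: str,
    compress: Callable[[str], str],
    cache: TwoTierCache,
    observer: Optional[Callable[[Decision], None]] = None,
) -> str:
    key = cache.key(text)
    if key in cache.skip:
        return text  # known not to compress
    if key in cache.results:
        return cache.results[key]  # already compressed once
    out = compress(text)
    if len(out) < len(text):
        cache.results[key] = out
    else:
        cache.skip.add(key)
        out = text
    if observer is not None:
        observer(Decision(strategy, len(text), len(out)))
    return out


calls, seen = [], []


def squeeze(text: str) -> str:
    calls.append(text)
    return " ".join(text.split())


cache = TwoTierCache()
first = compress_observed("a   b   c", "kompress", squeeze, cache, seen.append)
second = compress_observed("a   b   c", "kompress", squeeze, cache, seen.append)
third = compress_observed("abc", "kompress", squeeze, cache, seen.append)
fourth = compress_observed("abc", "kompress", squeeze, cache, seen.append)
```

In this sketch the observer fires only when real compression work happens, so cache hits stay silent; whether the actual router reports cache hits to its observer is not stated in the source.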
Practical Code Examples
The following examples demonstrate how different inputs trigger different compressors.
from headroom.transforms import ContentRouter, CompressionStrategy
# 1️⃣ Simple plain-text – routed to Kompress (the default text compressor)
plain = "Lorem ipsum dolor sit amet, consectetur adipiscing elit."
router = ContentRouter()
result = router.compress(plain)
print(result.strategy_used) # → CompressionStrategy.KOMPRESS
print(result.compressed) # compressed text
print(result.savings_percentage) # token-saving stats
# 2️⃣ JSON array – automatically uses SmartCrusher
json_array = "[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]"
result = router.compress(json_array)
assert result.strategy_used == CompressionStrategy.SMART_CRUSHER
# 3️⃣ Mixed payload (code fence + prose + JSON)
mixed = (
    '```python\n'
    'def hello():\n'
    ' print("Hi")\n'
    '```\n'
    '\n'
    'Here is some introductory text.\n'
    '\n'
    '{\n'
    ' "name": "alice",\n'
    ' "age": 30\n'
    '}\n'
)
result = router.compress(mixed)
print(result.strategy_used) # → CompressionStrategy.MIXED
for decision in result.routing_log:
    print(decision.strategy, decision.content_type)
# → CODE_AWARE for the fence, KOMPRESS for the prose, SMART_CRUSHER for JSON
from headroom.config import ContentRouterConfig
# Force the router to treat everything as plain text (use Kompress)
config = ContentRouterConfig(
    enable_code_aware=False,
    enable_smart_crusher=False,
    fallback_strategy=CompressionStrategy.KOMPRESS,
)
router = ContentRouter(config=config)
result = router.compress(mixed) # Still mixed, but each section falls back to Kompress
Summary
- Content-type detection is the first gate: the router checks for mixed content via is_mixed_content, then runs _detect_content with a Rust-based detector and regex fallback.
- Strategy mapping converts the detected ContentType into a CompressionStrategy through _strategy_from_detection, respecting configuration overrides.
- Pure payloads are compressed by _compress_pure and _apply_strategy_to_content, with lazy loading and graceful fallback to Kompress when a specific compressor is unavailable.
- Mixed payloads are split by split_into_sections, independently compressed by _compress_mixed, and reassembled with a full RoutingDecision log.
- Observability and caching are built in via RoutingDecision, CompressionObserver, and CompressionCache to record decisions and avoid redundant work.
Frequently Asked Questions
How does ContentRouter detect mixed content?
The router calls is_mixed_content around line 524 in headroom/transforms/content_router.py before running any format-specific detection. If the input contains multiple distinct content types, this guard returns True and the payload is sent to _compress_mixed for section-based processing rather than single-strategy compression.
What compressors are available in Headroom's ContentRouter?
According to the source in headroom/transforms/content_router.py, _apply_strategy_to_content can lazily load several purpose-built compressors: the code-aware compressor, SmartCrusher for structured data, Search and Log compressors, a Diff compressor, an HTML extractor, and the ML-based Kompress engine that serves as the default fallback.
What happens when a compressor is disabled or unavailable?
If the mapped compressor is turned off or fails to load, _apply_strategy_to_content falls back gracefully, typically to Kompress, and records the entire fallback sequence in the strategy_chain field of the result. This ensures every payload is handled even when preferred compressors are missing.
How can I configure ContentRouter to prefer specific compressors?
You can pass a ContentRouterConfig instance to the ContentRouter constructor. The dataclass at lines 20-84 of headroom/transforms/content_router.py exposes boolean flags such as enable_code_aware and enable_smart_crusher, the prefer_code_aware_for_code toggle, and a fallback_strategy enum value to customize the routing logic.