# ContentRouter Compression Strategies in Headroom: When to Use Each Compressor

> Explore ten ContentRouter compression strategies in Headroom, from CODE_AWARE to PASSTHROUGH. Learn when to use each compressor for optimal payload routing and content optimization.

- Repository: [Tejas Chopra/headroom](https://github.com/chopratejas/headroom)
- Tags: tutorial
- Published: 2026-06-07

---

**TLDR:** The `ContentRouter` in chopratejas/headroom supports ten `CompressionStrategy` values—`CODE_AWARE`, `SMART_CRUSHER`, `SEARCH`, `LOG`, `DIFF`, `HTML`, `KOMPRESS`, `TEXT`, `MIXED`, and `PASSTHROUGH`—that route payloads to specialized compressors based on mixed-content heuristics, content-type detection, and `ContentRouterConfig` toggles.

The `ContentRouter` is the central dispatch component in the chopratejas/headroom project that decides which compression strategy to apply to a given payload. It inspects incoming text, classifies its structure, and delegates to the optimal compressor while respecting user-defined configuration flags. Understanding these ContentRouter compression strategies lets you predict which transformer will run and when to override the defaults.

## How ContentRouter Chooses a Compression Strategy

The decision logic in [`headroom/transforms/content_router.py`](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/content_router.py) rests on three pillars: mixed-content detection, content-type classification, and configuration overrides.

### Mixed-Content Detection

When a payload contains multiple distinct sections—such as code fences, JSON blocks, search results, and prose—the router selects the `MIXED` strategy. The helper `is_mixed_content()` (lines 124–142) counts content indicators; if it reports two or more, `_determine_strategy()` (lines 1010–1014) routes the input through `split_into_sections()` (lines 124–164), compresses each part with its best-fit strategy, and re-assembles the result.
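The indicator-counting idea can be sketched in a few lines. This is an illustration under stated assumptions, not the project's implementation: the regex patterns and the names `INDICATORS`, `count_indicators`, and `looks_mixed` are hypothetical, and the real `is_mixed_content()` heuristics may differ.

```python
import re

# Hypothetical indicator patterns (assumptions for illustration only).
INDICATORS = {
    "code_fence": re.compile(r"^```", re.MULTILINE),
    "json_block": re.compile(r"^[ \t]*[\[{]", re.MULTILINE),
    "search_result": re.compile(r"^\S+:\d+:", re.MULTILINE),
}


def count_indicators(text: str) -> int:
    """Count how many distinct kinds of content appear at least once."""
    return sum(1 for pattern in INDICATORS.values() if pattern.search(text))


def looks_mixed(text: str, threshold: int = 2) -> bool:
    """Treat the payload as mixed once `threshold` kinds are present."""
    return count_indicators(text) >= threshold
```

Counting distinct kinds rather than total matches means a long file with many code fences but nothing else would still be treated as uniform in this sketch.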

### Content-Type Detection

For uniform payloads, `_detect_content()` (lines 110–123) uses a Rust-backed detector to classify the whole buffer into a `ContentType` enum. The router then maps that type to a compressor inside `_strategy_from_detection()` (lines 1034–1036). Supported classifications include source code, JSON arrays, search results, build output, git diffs, HTML, and plain text.

### Configuration Flags

Individual strategies can be toggled through `ContentRouterConfig` (lines 380–410). For example, the `self.config.enable_code_aware` check at line 1004 inside `_apply_strategy_to_content()` determines whether source code receives `CODE_AWARE` processing or falls back to `KOMPRESS`. Similarly, `enable_smart_crusher`, `enable_search_compressor`, `enable_log_compressor`, and `enable_html_extractor` gate their respective compressors. A separate flag, `prefer_code_aware_for_code`, acts earlier: when it is `False`, `_strategy_from_detection()` maps source code to `KOMPRESS` instead of `CODE_AWARE`.
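The gating behavior can be modeled with a small stand-in. This is a sketch only: `RouterFlags`, `FLAG_FOR_STRATEGY`, and `effective_strategy` are hypothetical names, the `True` defaults are an assumption, and only the flag names come from the text above.

```python
from dataclasses import dataclass


@dataclass
class RouterFlags:
    """Hypothetical stand-in for ContentRouterConfig; defaults are assumed."""
    enable_code_aware: bool = True
    enable_smart_crusher: bool = True
    enable_search_compressor: bool = True
    enable_log_compressor: bool = True
    enable_html_extractor: bool = True


# Which flag gates which strategy, following the flag names in the text.
FLAG_FOR_STRATEGY = {
    "CODE_AWARE": "enable_code_aware",
    "SMART_CRUSHER": "enable_smart_crusher",
    "SEARCH": "enable_search_compressor",
    "LOG": "enable_log_compressor",
    "HTML": "enable_html_extractor",
}


def effective_strategy(strategy: str, flags: RouterFlags) -> str:
    """Return the strategy itself, or KOMPRESS when its flag is off."""
    flag = FLAG_FOR_STRATEGY.get(strategy)
    if flag is not None and not getattr(flags, flag):
        return "KOMPRESS"
    return strategy
```

Strategies without a gating flag, such as `DIFF`, pass through this check unchanged.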

## Complete List of ContentRouter Compression Strategies

Each enum value maps to a concrete compressor and specific selection criteria.

- **`CODE_AWARE`** – Routes to `CodeAwareCompressor` for AST-preserving source code compression. The router chooses this when `ContentType.SOURCE_CODE` is detected, `config.prefer_code_aware_for_code` is left on, and `config.enable_code_aware` is `True`. If `prefer_code_aware_for_code` is off, the override clause in `_strategy_from_detection()` selects `KOMPRESS`; if `enable_code_aware` is off, `_apply_strategy_to_content()` falls back to `KOMPRESS`.

- **`SMART_CRUSHER`** – Routes to `SmartCrusher` for high-throughput JSON-array compression. Selected when the detector returns `ContentType.JSON_ARRAY` and `config.enable_smart_crusher` is enabled. If token count does not improve, the router attempts a fallback to `KOMPRESS`.

- **`SEARCH`** – Routes to `SearchCompressor` for grep or ripgrep result compression. Triggered by `ContentType.SEARCH_RESULTS` when `config.enable_search_compressor` is `True`.

- **`LOG`** – Routes to `LogCompressor` for build and test output. Applied to `ContentType.BUILD_OUTPUT` when `config.enable_log_compressor` is active.

- **`DIFF`** – Routes to `DiffCompressor` for git diff compression. Matched on `ContentType.GIT_DIFF` with no explicit flag; this strategy is always available.

- **`HTML`** – Routes to `HtmlExtractor` to extract readable text from HTML. Selected on `ContentType.HTML` when `config.enable_html_extractor` is `True`.

- **`KOMPRESS`** – Routes to `KompressCompressor`, the ML-based token compressor. This is the default for `ContentType.PLAIN_TEXT` and serves as the universal fallback when an explicit strategy is disabled or unavailable.

- **`TEXT`** – An alias that ultimately invokes the same `KompressCompressor` as `KOMPRESS`. Used when the router explicitly labels plain text as `TEXT` rather than the generic `KOMPRESS` fallback.

- **`MIXED`** – Executes internal split-route logic rather than a single compressor. Triggered when `is_mixed_content()` reports at least two content indicators, dispatching any of the above strategies on a per-section basis.

- **`PASSTHROUGH`** – Returns the input unchanged with no compression. Used when the buffer is empty or whitespace-only, or when a strategy is disabled and no fallback is applicable.
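The "fall back when the token count does not improve" behavior described for `SMART_CRUSHER` can be sketched as follows. Everything here is an assumption for illustration: `count_tokens` is a crude whitespace counter rather than a real tokenizer, the compressors are passed in as plain callables, and returning the input unchanged as a last resort is a guess at how the router might behave.

```python
from typing import Callable, Tuple


def count_tokens(text: str) -> int:
    """Crude whitespace token count; a stand-in for a real tokenizer."""
    return len(text.split())


def compress_with_fallback(
    text: str,
    primary: Callable[[str], str],
    fallback: Callable[[str], str],
) -> Tuple[str, str]:
    """Try `primary`, then `fallback`; keep the first result that is smaller."""
    original = count_tokens(text)
    for label, compressor in (("primary", primary), ("fallback", fallback)):
        candidate = compressor(text)
        if count_tokens(candidate) < original:
            return candidate, label
    # Neither compressor helped: return the input unchanged (assumed behavior).
    return text, "passthrough"
```

For example, passing an identity function as `primary` and a truncating function as `fallback` yields the fallback's output, because the primary result is no smaller than the input.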

## Simplified Routing Decision Flow

The core dispatch logic lives in `_determine_strategy()` and `_strategy_from_detection()`. The flow follows this pattern:

```text
if is_mixed_content(content) → MIXED
else
    detection = _detect_content(content)
    # unmatched types fall back to config.fallback_strategy (default KOMPRESS)
    strategy = mapping.get(detection.content_type, config.fallback_strategy)

    if strategy == CODE_AWARE and not config.prefer_code_aware_for_code:
        strategy = KOMPRESS   # override
```

This logic is implemented in [`headroom/transforms/content_router.py`](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/content_router.py) at lines 1010–1014 and 1034–1036. The router also respects `fallback_strategy` (default `KOMPRESS`) when no explicit strategy matches.
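A runnable approximation of this flow is shown below. It is a sketch under stated assumptions: `FlowConfig`, `MAPPING`, and `determine_strategy` are hypothetical stand-ins that use plain strings instead of the project's enums, the `True` default for `prefer_code_aware_for_code` is assumed, and the mixed-content check and detected type are passed in as arguments rather than computed.

```python
from dataclasses import dataclass


@dataclass
class FlowConfig:
    """Hypothetical stand-in for the two config fields the flow reads."""
    prefer_code_aware_for_code: bool = True
    fallback_strategy: str = "KOMPRESS"


# Content type -> strategy, following the strategy list in this article.
MAPPING = {
    "SOURCE_CODE": "CODE_AWARE",
    "JSON_ARRAY": "SMART_CRUSHER",
    "SEARCH_RESULTS": "SEARCH",
    "BUILD_OUTPUT": "LOG",
    "GIT_DIFF": "DIFF",
    "HTML": "HTML",
    "PLAIN_TEXT": "KOMPRESS",
}


def determine_strategy(content_type: str, is_mixed: bool, config: FlowConfig) -> str:
    """Pick a strategy name the way the simplified flow describes."""
    if is_mixed:
        return "MIXED"
    # Unknown content types fall back to the configured default.
    strategy = MAPPING.get(content_type, config.fallback_strategy)
    if strategy == "CODE_AWARE" and not config.prefer_code_aware_for_code:
        strategy = "KOMPRESS"  # override
    return strategy
```

Mixed content is checked first, so a payload flagged as mixed never reaches the type mapping regardless of what the detector would report for the whole buffer.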

## Code Examples for ContentRouter Compression Strategies

All examples below execute through `ContentRouter.compress` in [`headroom/transforms/content_router.py`](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/content_router.py) (lines 808–980).

### Route Source Code to CODE_AWARE

```python
from headroom.transforms import ContentRouter, CompressionStrategy
from headroom.transforms.content_router import ContentRouterConfig

router = ContentRouter()
python_code = "def hello():\n    print('world')\n"
result = router.compress(python_code)
print(result.strategy_used)          # → CompressionStrategy.CODE_AWARE
print(result.compressed)             # AST-preserving compressed code
```

### Route JSON Arrays to SMART_CRUSHER

```python
# Reuses `router` from the first example.
# Build a 1000-element JSON array, one number per line.
json_array = "[\n" + ",\n".join(str(i) for i in range(1000)) + "\n]"
result = router.compress(json_array)
print(result.strategy_used)          # → CompressionStrategy.SMART_CRUSHER
```

### Route Mixed Documents to MIXED

```python
# Reuses `router` from the first example.
# A document with a Python fence, a prose paragraph, and a JSON fence.
mixed_doc = ("# Project README\n\n"
             "```python\ndef foo(): pass\n```\n\n"
             "Here is some description.\n\n"
             "```json\n[1,2,3]\n```")
result = router.compress(mixed_doc)
print(result.strategy_used)          # → CompressionStrategy.MIXED
print(result.routing_log)            # shows CODE_AWARE, TEXT, SMART_CRUSHER per section
```

### Override Strategy via ContentRouterConfig

```python
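# Reuses `python_code` and the imports from the first example.
# Disable the AST-preserving compressor so source code falls back to KOMPRESS.
cfg = ContentRouterConfig(enable_code_aware=False, enable_kompress=True)
router = ContentRouter(config=cfg)
result = router.compress(python_code)
print(result.strategy_used)          # → CompressionStrategy.KOMPRESS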
```

## Key Implementation Files

The ContentRouter compression strategies are defined across the following modules:

- **[`headroom/transforms/content_router.py`](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/content_router.py)** – Core router implementation, `CompressionStrategy` enum, mixed-content detection, and routing logic.
- **[`headroom/transforms/content_detector.py`](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/content_detector.py)** – `ContentType` enum and the Rust-backed `detect_content_type` wrapper consumed by the router.
- **[`headroom/config.py`](https://github.com/chopratejas/headroom/blob/main/headroom/config.py)** – Global configuration and defaults for enabling or disabling each compressor.
- **[`headroom/transforms/base.py`](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/base.py)** – Base `Transform` class that `ContentRouter` inherits from.

Individual compressors are referenced via lazy loaders inside `ContentRouter._apply_strategy_to_content`.

## Summary

- The `ContentRouter` supports ten distinct `CompressionStrategy` values, each mapped to a specialized compressor.
- Routing begins with `is_mixed_content()` in [`headroom/transforms/content_router.py`](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/content_router.py); if multiple indicators exist, the payload is split and processed under the `MIXED` strategy.
- Uniform payloads are classified by `_detect_content()` and mapped through `_strategy_from_detection()`, with `ContentRouterConfig` flags gate-keeping optional compressors.
- `KOMPRESS` serves as the default fallback for plain text and for any disabled strategy, while `PASSTHROUGH` handles empty or whitespace-only inputs and cases where a strategy is disabled and no fallback applies.

## Frequently Asked Questions

### What is the default ContentRouter compression strategy when no content type matches?

When the router cannot match a specific content type, it falls back to `KOMPRESS` via the `fallback_strategy` default. This routes the payload to the `KompressCompressor`, an ML-based token compressor that handles generic plain text.

### How do I disable the AST-preserving code compressor and force ML-based compression?

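Set `enable_code_aware=False` inside `ContentRouterConfig` when instantiating `ContentRouter`. The `self.config.enable_code_aware` check in `_apply_strategy_to_content()` then sends source code to `KOMPRESS` instead of `CODE_AWARE`. Setting `prefer_code_aware_for_code=False` has a similar effect one step earlier, through the override clause in `_strategy_from_detection()` at lines 1034–1036.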

### What happens when a document contains both code fences and JSON blocks?

If `is_mixed_content()` detects two or more distinct content indicators, the router selects the `MIXED` strategy. It calls `split_into_sections()` in [`headroom/transforms/content_router.py`](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/content_router.py) and compresses each section with its own optimal strategy before re-assembling the final output.
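The splitting step can be illustrated with a simple fence-based splitter. This is a sketch, not the project's `split_into_sections()`: the `FENCE` pattern and the `split_sections` helper are hypothetical, and it only recognizes triple-backtick fences, whereas the real function may detect other section types.

```python
import re
from typing import List, Tuple

# Matches a fenced block: opening fence with optional language tag, then body.
FENCE = re.compile(r"```(\w*)\n(.*?)```", re.DOTALL)


def split_sections(text: str) -> List[Tuple[str, str]]:
    """Split text into (kind, body) pairs: fenced blocks and the prose between."""
    sections = []
    pos = 0
    for match in FENCE.finditer(text):
        prose = text[pos:match.start()].strip()
        if prose:
            sections.append(("text", prose))
        # Use the fence's language tag as the kind, or "code" if untagged.
        sections.append((match.group(1) or "code", match.group(2).strip()))
        pos = match.end()
    tail = text[pos:].strip()
    if tail:
        sections.append(("text", tail))
    return sections
```

Each `(kind, body)` pair could then be compressed with the strategy that fits its kind and the results joined back together in order.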

### Is there a way to bypass all compression and return the original text?

Yes. The `PASSTHROUGH` strategy returns the input unchanged. The router automatically selects it for empty or whitespace-only buffers, or when a targeted strategy is disabled and no fallback is applicable.