# TransformPipeline Architecture in Headroom: How to Customize the LLM Compression Pipeline

> Explore the Headroom TransformPipeline architecture for LLM compression. Customize this deterministic sequence of composable transforms to reduce token usage and preserve information effectively.

- Repository: [Tejas Chopra/headroom](https://github.com/chopratejas/headroom)
- Tags: architecture
- Published: 2026-06-10

---

**The Headroom TransformPipeline processes every LLM request through a deterministic sequence of composable transforms that reduce token usage while preserving critical information, with each stage implementing a common `Transform` protocol to produce a `TransformResult`.**

The `chopratejas/headroom` library implements a sophisticated compression system that sits between your application and LLM providers. Understanding the TransformPipeline architecture allows you to balance cost reduction against context preservation by rearranging, configuring, or extending the built-in transform stages.

## Core Pipeline Architecture

The pipeline architecture centers on a simple but powerful abstraction: each transform is a stateless class that implements an `apply(messages)` method returning a `TransformResult`. The orchestration logic in [`headroom/transforms/pipeline.py`](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/pipeline.py) executes these transforms sequentially, passing the output of one stage as input to the next.

### The Transform Protocol

Every transform conforms to a consistent interface defined in the pipeline module. As implemented in [`headroom/transforms/pipeline.py`](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/pipeline.py), the `TransformPipeline` accepts a Python list of transform instances and invokes them in order:

```python
from headroom import TransformPipeline, CacheAligner, SmartCrusher, RollingWindow

pipeline = TransformPipeline([
    CacheAligner(),      # Stabilize cache-friendly prefix

    SmartCrusher(),      # Compress JSON tool outputs

    RollingWindow(),     # Enforce token budget

])
result = pipeline.transform(messages)
print(f"Saved {result.tokens_saved} tokens")

```

Each transform receives the message list, applies its specific compression logic, and returns a `TransformResult` containing the modified messages and metadata about tokens saved.

### Stage Execution Order

The order of transforms matters significantly. The recommended architecture places cache stabilization first, followed by content-specific compression, optional ML-based compression, and finally token limit enforcement. Reordering—for example, placing `CacheAligner` after `SmartCrusher`—would waste opportunities to stabilize the system prompt for provider-side caching.

## Built-in Transform Stages

The Headroom repository provides five distinct transform classes, each targeting a specific source of token waste.

### CacheAligner

Located in [`headroom/transforms/cache_aligner.py`](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/cache_aligner.py), the **CacheAligner** detects and extracts dynamic content (such as dates, UUIDs, or timestamps) from the system prompt. It moves these volatile elements to a *dynamic* suffix, keeping the static prefix stable. This enables provider-side caching (OpenAI, Anthropic, Google) to hit repeatedly, saving up to approximately 90% of request costs for identical prompt prefixes.

### SmartCrusher

The **SmartCrusher** transform in [`headroom/transforms/smart_crusher.py`](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/smart_crusher.py) statistically analyzes JSON-like tool outputs. Rather than truncating arbitrarily, it intelligently preserves the most valuable items: first and last entries, error items, statistical outliers, query-relevant items, and change points. This approach cuts huge tool-result payloads by 70–95% while guaranteeing that spikes, errors, and other critical data survive the compression.

### ContentRouter

Found in [`headroom/transforms/content_router.py`](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/content_router.py), the optional **ContentRouter** acts as a smart dispatcher. It examines the remaining content’s type—whether code, logs, or plain text—and automatically routes it to the best compressor. It can invoke `CodeAwareCompressor`, `LogCompressor`, or the optional **Kompress** ML compressor if installed. Enable specific routers through configuration flags such as `enable_code_aware` and `enable_log_compression`.

### RollingWindow

The **RollingWindow** transform in [`headroom/transforms/rolling_window.py`](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/rolling_window.py) enforces the model’s token limit through deterministic truncation. It drops whole tool-call/result pairs starting from the oldest, while always preserving the system message and the most recent user/assistant turns. This guarantees the final payload fits the context window without breaking tool-call ordering dependencies.

### IntelligentContextManager

An advanced option in [`headroom/transforms/intelligent_context.py`](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/intelligent_context.py), the **IntelligentContextManager** scores each message on recency, semantic similarity, TOIN-learned importance, error detection, forward-references, and token density. Instead of dropping merely the oldest messages, it removes the lowest-scored messages. This retains semantically important content—such as an early error message—even when it appears far from the tail of the conversation.

## How to Customize the TransformPipeline

The pipeline architecture supports three primary customization strategies: reordering stages, tuning configurations, and injecting optional compressors.

### Reordering and Selecting Transforms

Customize the pipeline by modifying the list passed to `TransformPipeline`. You can omit stages that don't apply to your workload or reorder them to change processing priority:

```python

# Minimal pipeline without ContentRouter or IntelligentContextManager

pipeline = TransformPipeline([
    CacheAligner(),
    RollingWindow(),
])

```

### Configuration Tuning

Each transform exposes a configuration dataclass (e.g., `SmartCrusherConfig`, `CacheAlignerConfig`, `RollingWindowConfig`). Pass these when constructing transforms to tune behavior:

```python
from headroom import SmartCrusherConfig, CacheAlignerConfig, RollingWindowConfig

pipeline = TransformPipeline([
    CacheAligner(config=CacheAlignerConfig(
        extract_dates=True, 
        stable_prefix_min_tokens=120
    )),
    SmartCrusher(config=SmartCrusherConfig(
        min_tokens_to_crush=150, 
        keep_first=5
    )),
    RollingWindow(config=RollingWindowConfig(
        max_tokens=100_000, 
        preserve_recent_turns=6
    )),
])

```

### Adding ML-Based Compression

To include the ML-based `KompressCompressor`, install the optional dependency and import from [`headroom/transforms/kompress_compressor.py`](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/kompress_compressor.py):

```bash
pip install "headroom-ai[ml]"

```

```python
from headroom.transforms.kompress_compressor import KompressCompressor

pipeline = TransformPipeline([
    CacheAligner(),
    SmartCrusher(),
    KompressCompressor(),   # ML-based text compression

    RollingWindow(),
])

```

## Summary

- The **TransformPipeline** in [`headroom/transforms/pipeline.py`](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/pipeline.py) orchestrates a sequence of composable transforms that each implement an `apply(messages)` method.
- **CacheAligner** stabilizes the prompt prefix for provider caching, while **SmartCrusher** compresses JSON tool outputs by 70–95%.
- **ContentRouter** automatically selects the best compression strategy based on content type, and **RollingWindow** enforces hard token limits.
- Customize the pipeline by reordering the transform list, passing configuration dataclasses to individual transforms, or injecting the optional **KompressCompressor** from the `[ml]` extra.
- All source files reside under `headroom/transforms/` in the `chopratejas/headroom` repository.

## Frequently Asked Questions

### How does the TransformPipeline maintain safety guarantees when compressing context?

The pipeline maintains safety through deterministic, rules-based transforms rather than black-box compression. Each stage—whether `CacheAligner` or `SmartCrusher`—uses specific heuristics (error detection, change-point analysis, token density scoring) to ensure critical information like errors, outliers, and recent turns survive the compression process.

### Can I use the TransformPipeline without installing the ML dependencies?

Yes. The `KompressCompressor` located in [`headroom/transforms/kompress_compressor.py`](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/kompress_compressor.py) is entirely optional. The core pipeline consisting of `CacheAligner`, `SmartCrusher`, and `RollingWindow` functions without any ML libraries. Only import `KompressCompressor` if you have installed the `[ml]` extra via `pip install "headroom-ai[ml]"`.

### What happens if I place RollingWindow before SmartCrusher in the pipeline?

Placing `RollingWindow` before `SmartCrusher` would cause the system to truncate the message list based on age before analyzing JSON content for compressibility. You would lose the opportunity to score and preserve important items within large tool results, potentially dropping critical error messages that `SmartCrusher` would have identified and retained.

### Where are the configuration classes defined for each transform?

Configuration dataclasses such as `SmartCrusherConfig`, `CacheAlignerConfig`, and `RollingWindowConfig` are defined alongside their respective transform implementations in [`headroom/transforms/smart_crusher.py`](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/smart_crusher.py), [`headroom/transforms/cache_aligner.py`](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/cache_aligner.py), and [`headroom/transforms/rolling_window.py`](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/rolling_window.py). Import them directly from the `headroom` package namespace as shown in the customization examples.