# Headroom Compression Algorithms: SmartCrusher vs Kompress-Base Differences Explained

> Explore Headroom compression algorithms like SmartCrusher and Kompress-Base. Understand their unique Rust-backed and transformer-based implementations for JSON and text reduction.

- Repository: [Tejas Chopra/headroom](https://github.com/chopratejas/headroom)
- Tags: deep-dive
- Published: 2026-06-09

---

**Headroom provides two complementary compression pipelines—SmartCrusher for deterministic JSON array compaction and Kompress-Base for neural token-level text reduction—each with distinct Rust-backed and transformer-based implementations that share a unified CCR retrieval store.**

The `chopratejas/headroom` repository implements specialized compression strategies that shrink LLM prompts without discarding critical information. Understanding how the two **Headroom compression algorithms**, SmartCrusher and Kompress-Base, differ is essential for tuning tool-output processing. SmartCrusher operates on structured arrays via a compiled Rust core, while Kompress-Base uses a ModernBERT-style model to score and drop low-importance tokens in plain text.

## SmartCrusher: Schema-Aware JSON Array Compression

### Implementation and Configuration

In [`headroom/transforms/smart_crusher.py`](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/smart_crusher.py), the `SmartCrusher` class acts as a thin Python shim over a Rust extension compiled with PyO3. It forwards raw JSON array strings to `headroom._core.SmartCrusher` and handles TOIN telemetry logging plus CCR marker insertion. Compression behavior is controlled by `SmartCrusherConfig`, which exposes fine-grained thresholds for variance, uniqueness, similarity, and maximum item counts that mirror the underlying Rust defaults.

### Compression Strategy

SmartCrusher follows a **lossless-first** path that attempts to compact arrays using schema-preserving encodings such as CSV-style compaction. If that approach yields savings below 30%, the compressor falls back to a **lossy row-drop** strategy, emitting a `<<ccr:HASH …>>` marker and storing culled rows for later retrieval. The retained payload includes sentinels like `{"_ccr_dropped": …}` so downstream consumers know data was elided.
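
The lossless-first decision described above can be sketched in pure Python. This is an illustration of the control flow only, not the Rust implementation: the function names, the `max_items` parameter, the keep-the-first-rows policy, and the dict used as a store are all assumptions made for the example. The real marker also carries fields that this article elides, so the sketch emits the hash alone.

```python
import hashlib
import json

SAVINGS_THRESHOLD = 0.30  # from the text: fall back when lossless savings are below 30%


def csv_compact(rows):
    """Encode same-keyed flat dicts as one header line plus one value line per row."""
    header = list(rows[0].keys())
    lines = [",".join(header)]
    for row in rows:
        lines.append(",".join(json.dumps(row[key]) for key in header))
    return "\n".join(lines)


def crush(json_array, max_items=3, store=None):
    """Return (payload, strategy). `store` is a hypothetical dict standing in for CCR."""
    rows = json.loads(json_array)

    # Step 1: try the schema-preserving encoding when every row shares the same keys.
    uniform = bool(rows) and all(
        isinstance(r, dict) and r.keys() == rows[0].keys() for r in rows
    )
    if uniform:
        compact = csv_compact(rows)
        savings = 1 - len(compact) / len(json_array)
        if savings >= SAVINGS_THRESHOLD:
            return compact, "lossless"

    # Nothing to drop: leave small arrays untouched.
    if len(rows) <= max_items:
        return json_array, "passthrough"

    # Step 2: lossy row-drop. Keep the first rows (an arbitrary policy for this
    # sketch), stash the rest, and leave a sentinel so readers know rows were elided.
    digest = hashlib.sha256(json_array.encode("utf-8")).hexdigest()[:12]
    if store is not None:
        store[digest] = rows[max_items:]
    kept = rows[:max_items] + [{"_ccr_dropped": len(rows) - max_items}]
    return json.dumps(kept) + f"\n<<ccr:{digest}>>", "lossy_row_drop"
```

Calling `crush` on a long array of identically shaped rows should take the lossless branch, while an array whose rows have differing keys skips it and lands in the row-drop branch.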

### Telemetry and Fallback Behavior

After a successful non-passthrough crush, the compressor records a compression event to TOIN, learning from array shape and strategy outcome. If the compiled Rust extension is unavailable, the import fails immediately; there is no silent fallback because the compressor cannot operate without the native crate.

## Kompress-Base: Token-Level Neural Text Compression

### Implementation and Configuration

The `KompressCompressor` class in [`headroom/transforms/kompress_compressor.py`](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/kompress_compressor.py) loads a ModernBERT-style transformer model—either an ONNX INT8 runtime or a PyTorch backend—auto-downloaded from HuggingFace under the identifier `chopratejas/kompress-base`. Behavior is governed by `KompressConfig`, which sets the model ID, inference device (`auto`, `cpu`, or `gpu`), chunk size, score threshold, and CCR flag.

### Compression Strategy

Unlike SmartCrusher, Kompress-Base is **score-based**: the model predicts a keep-mask per token, and tokens scoring below the configured threshold are stripped. An optional `target_ratio` can enforce a fixed keep percentage. Whenever the resulting **compression ratio drops below 0.8**, the compressor appends a CCR marker and stores the original text via `_store_in_ccr`, mirroring the retrieval semantics of the JSON path.
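
To make the score-and-threshold idea concrete, here is a minimal sketch that operates on precomputed per-token scores rather than a real model. The function names, the default threshold of 0.5, and the definition of compression ratio as kept tokens divided by original tokens are assumptions for illustration; only the 0.8 CCR cutoff and the `target_ratio` concept come from the description above.

```python
CCR_RATIO_CUTOFF = 0.8  # from the text: store the original when the ratio drops below 0.8


def select_tokens(tokens, scores, threshold=0.5, target_ratio=None):
    """Pick tokens to keep, either by score threshold or by a fixed keep fraction."""
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    if target_ratio is not None:
        keep_n = max(1, round(len(tokens) * target_ratio))
        # Take the highest-scoring indices, then restore original token order.
        ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
        return [tokens[i] for i in sorted(ranked[:keep_n])]
    return [tok for tok, score in zip(tokens, scores) if score >= threshold]


def compress(tokens, scores, threshold=0.5, target_ratio=None):
    """Return (text, ratio, needs_ccr) for a scored token sequence."""
    kept = select_tokens(tokens, scores, threshold, target_ratio)
    # Assumed definition: ratio = kept tokens / original tokens (1.0 means unchanged).
    ratio = len(kept) / len(tokens) if tokens else 1.0
    needs_ccr = ratio < CCR_RATIO_CUTOFF
    return " ".join(kept), ratio, needs_ccr
```

Under these assumptions, mild compression that keeps at least 80% of the tokens leaves no marker, while anything more aggressive flags the original for storage.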

### Telemetry and Fallback Behavior

Each successful compression triggers `_kompress_content_signature`, logging the original text signature to TOIN for adaptive learning. Runtime resilience differs from SmartCrusher: if neither ONNX nor PyTorch is installed, an informative `ImportError` is raised, but if model loading fails after installation the compressor silently falls back to passthrough, returning the original text unchanged. ONNX CPU inference is lightweight yet lacks batch-dimension speed-up, whereas PyTorch on GPU delivers roughly **2× to 2.3× speed-up** for batched inference.
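
The two failure paths described above (a loud `ImportError` when no backend is installed, a quiet passthrough when the model fails to load) can be sketched as follows. The class and function names are hypothetical, and the import names `onnxruntime` and `torch` are assumed to be the packages the backends rely on; this is not the code in `kompress_compressor.py`.

```python
import importlib.util


def pick_backend(candidates=("onnxruntime", "torch")):
    """Return the first installed backend name, or raise if none is available."""
    for name in candidates:
        if importlib.util.find_spec(name) is not None:
            return name
    raise ImportError(
        "No inference backend found; install one of: " + ", ".join(candidates)
    )


class PassthroughOnLoadFailure:
    """Wrap a model loader so that load errors degrade to returning text unchanged."""

    def __init__(self, loader):
        try:
            self._model = loader()  # hypothetical callable that returns a text -> text model
        except Exception:
            self._model = None  # model unavailable: compress() becomes a no-op

    def compress(self, text):
        if self._model is None:
            return text  # passthrough: the caller gets the original text back
        return self._model(text)
```

The trade-off is visible in the sketch: a missing dependency is a setup error worth surfacing immediately, whereas a failed download or corrupt model file at runtime is treated as recoverable, at the cost of prompts silently going uncompressed.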

## Key Differences at a Glance

- **Primary target**: SmartCrusher tackles large **JSON arrays** such as tool-output tables and logs; Kompress-Base targets long **plain-text** narration and assistant messages.
- **Algorithmic approach**: SmartCrusher uses deterministic Rust logic with variance and similarity heuristics; Kompress-Base uses a trained neural model to score individual token importance.
- **CCR integration**: SmartCrusher embeds JSON sentinels and mirrors the Rust store into Python’s `compression_store`; Kompress-Base appends a plain-text marker and stores originals via `_store_in_ccr`.
- **Failure mode**: SmartCrusher fails loudly if the Rust extension is missing; Kompress-Base raises `ImportError` for missing backends but falls back to passthrough on model load failure.

## Practical Code Examples

### Using SmartCrusher on JSON Arrays

```python
from headroom.transforms.smart_crusher import SmartCrusher, SmartCrusherConfig

cfg = SmartCrusherConfig(
    min_items_to_analyze=10,
    max_items_after_crush=12,
    preserve_change_points=False,
)

crusher = SmartCrusher(config=cfg)
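# Tiny sample for illustration. With min_items_to_analyze=10, an array this
# short is likely returned unchanged; use a larger array to see real savings.
json_array = '[{"a": 1, "b": 2}, {"a": 3, "b": 4}, {"a": 5, "b": 6}]'

result = crusher.crush(json_array, query="summarize")
print("Compressed:", result.compressed)
print("Strategy:", result.strategy)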
```

### Using Kompress-Base on Plain Text

```python
from headroom.transforms.kompress_compressor import KompressCompressor

# Default configuration; pass a KompressConfig to change device, threshold, etc.
compressor = KompressCompressor()
long_text = "Lorem ipsum ... (very long tool output)"

result = compressor.compress(long_text)
print("Compressed:", result.compressed)
print("Ratio:", result.compression_ratio)
print("CCR key:", result.cache_key)
```

### Orchestrating Both in a Pipeline

```python
from headroom.transforms.pipeline import Pipeline
from headroom.transforms.smart_crusher import SmartCrusher
from headroom.transforms.kompress_compressor import KompressCompressor

pipeline = Pipeline(
    compressors=[
        SmartCrusher(),
        KompressCompressor(),
    ]
)

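# Sample inputs; in practice these come from real tool calls and model replies.
json_array = '[{"a": 1, "b": 2}, {"a": 3, "b": 4}, {"a": 5, "b": 6}]'
long_text = "Lorem ipsum ... (very long tool output)"

messages = [
    {"role": "tool", "content": json_array},
    {"role": "assistant", "content": long_text},
]

# `my_tokenizer` is a placeholder for the tokenizer object your setup provides.
compressed = pipeline.apply(messages, tokenizer=my_tokenizer)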
```

The `Pipeline` class automatically routes structured tool output to `SmartCrusher` and free-form text to `KompressCompressor` according to the content-type routing rules defined in [`headroom/transforms/pipeline.py`](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/pipeline.py).
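
As a rough mental model of content-type routing, the sketch below sends anything that parses as a JSON array to one handler and everything else to another. The rule itself, the function names, and the handler dictionary are assumptions for illustration; the actual routing rules live in `pipeline.py` and may differ.

```python
import json


def classify(content):
    """Label content as 'json_array' if it parses as a JSON list, else 'text'."""
    if content.lstrip().startswith("["):
        try:
            if isinstance(json.loads(content), list):
                return "json_array"
        except json.JSONDecodeError:
            pass  # looked like an array but is not valid JSON: treat as text
    return "text"


def route(messages, handlers):
    """Apply the handler matching each message's content type; other fields are kept."""
    routed = []
    for message in messages:
        kind = classify(message["content"])
        routed.append({**message, "content": handlers[kind](message["content"])})
    return routed


# Stand-in handlers; in a real setup these would wrap the two compressors.
handlers = {
    "json_array": lambda s: f"[array of {len(json.loads(s))} items]",
    "text": lambda s: s[:20],
}
```

Cheaply checking the first character before parsing keeps the common plain-text case from paying for a failed JSON parse.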

## Summary

- **SmartCrusher** is a Rust-backed, schema-aware compressor designed for JSON arrays, using lossless compaction and configurable row-drops with CCR markers.
- **Kompress-Base** is a ModernBERT-based neural compressor for plain text that drops low-scoring tokens and supports ONNX and PyTorch inference backends.
- Both algorithms store originals in Headroom’s `compression_store` when aggressive compression fires, enabling later retrieval via CCR hashes.
- `SmartCrusherConfig` and `KompressConfig` provide fine-grained thresholds for array heuristics and model inference respectively.
- The unified `Pipeline` in [`headroom/transforms/pipeline.py`](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/pipeline.py) binds both compressors so the correct strategy is applied per message content type.

## Frequently Asked Questions

### What is the main difference between SmartCrusher and Kompress-Base?

SmartCrusher compresses structured JSON arrays by removing redundant rows while preserving schema, whereas Kompress-Base is a neural text compressor that evaluates token importance in unstructured plain text. According to the `chopratejas/headroom` source code, they target entirely different payload shapes and rely on distinct backends—Rust heuristics versus transformer inference.

### How does Headroom decide when to store the original content using CCR?

Both compressors trigger Content-Cache-Retrieval storage when compression yields significant reduction. SmartCrusher emits a CCR marker if lossless savings are under 30% and rows are dropped, while Kompress-Base appends a marker whenever the compression ratio falls below 0.8. In both cases the original payload is hashed and stored in [`headroom/cache/compression_store.py`](https://github.com/chopratejas/headroom/blob/main/headroom/cache/compression_store.py).
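
The hash-and-store round trip can be illustrated with a small in-memory stand-in. This is not the API of `compression_store.py`: the class, its methods, the use of truncated SHA-256, and the assumption that marker hashes are lowercase hex are all hypothetical. The regular expression tolerates extra fields after the hash, since the full marker layout is not shown here.

```python
import hashlib
import re

# Matches "<<ccr:HASH>>" as well as markers with extra fields after the hash.
MARKER = re.compile(r"<<ccr:([0-9a-f]+)[^>]*>>")


class TinyCompressionStore:
    """In-memory stand-in for a content-addressed store of original payloads."""

    def __init__(self):
        self._items = {}

    def put(self, original):
        """Store the original text and return the hash key that identifies it."""
        key = hashlib.sha256(original.encode("utf-8")).hexdigest()[:16]
        self._items[key] = original
        return key

    def get(self, key):
        """Return the stored original, or None if the key is unknown."""
        return self._items.get(key)


def retrieve(text, store):
    """Return the originals referenced by every resolvable CCR marker in `text`."""
    originals = []
    for match in MARKER.finditer(text):
        original = store.get(match.group(1))
        if original is not None:
            originals.append(original)
    return originals
```

Because the key is derived from the content, storing the same payload twice yields the same marker, which keeps repeated tool outputs from multiplying entries.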

### Can Kompress-Base run without a GPU?

Yes. The `KompressCompressor` defaults to ONNX INT8 CPU inference if no GPU is available, as configured via the `device` field in `KompressConfig`. PyTorch GPU acceleration is optional and provides a 2× to 2.3× batched speed-up, but the compressor remains fully functional on CPU-only hosts.

### What happens if the SmartCrusher Rust extension is missing?

The Python import of `headroom._core.SmartCrusher` will fail immediately with an import error. Unlike Kompress-Base, SmartCrusher offers no passthrough fallback because its logic is executed entirely inside the compiled Rust crate; the extension must be present for JSON array compression to proceed.