Headroom Compression Algorithms: SmartCrusher vs Kompress-Base Differences Explained

June 9, 2026 chopratejas/headroom

Headroom provides two complementary compression pipelines: SmartCrusher, which compacts JSON arrays deterministically through a Rust core, and Kompress-Base, which uses a transformer model to reduce plain text at the token level. Both write to a shared CCR retrieval store.

The chopratejas/headroom repository implements specialized compression strategies that shrink LLM prompts without discarding critical information. Knowing how SmartCrusher and Kompress-Base differ helps when tuning how tool output is processed. SmartCrusher operates on structured arrays via a compiled Rust core, while Kompress-Base uses a ModernBERT-style model to score and drop low-importance tokens in plain text.

SmartCrusher: Schema-Aware JSON Array Compression

Implementation and Configuration

In headroom/transforms/smart_crusher.py, the SmartCrusher class acts as a thin Python shim over a Rust extension compiled with PyO3. It forwards raw JSON array strings to headroom._core.SmartCrusher and handles TOIN telemetry logging plus CCR marker insertion. Compression behavior is controlled by SmartCrusherConfig, which exposes fine-grained thresholds for variance, uniqueness, similarity, and maximum item counts that mirror the underlying Rust defaults.
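To make the variance and uniqueness thresholds more concrete, the following sketch computes those two statistics per field for a JSON array. It is an illustration only: the function name field_stats and the exact formulas are assumptions, and the real heuristics live in the Rust crate and may be defined differently.

```python
import json
from statistics import pvariance


def field_stats(json_array):
    """Return, per field, a uniqueness ratio and (for numeric fields) a variance.

    Hypothetical helper for illustration; not part of Headroom's API.
    """
    rows = json.loads(json_array)
    stats = {}
    fields = sorted({key for row in rows for key in row})
    for field in fields:
        values = [row[field] for row in rows if field in row]
        # Uniqueness: distinct values divided by total values for this field.
        distinct = {json.dumps(value, sort_keys=True) for value in values}
        entry = {"uniqueness": len(distinct) / len(values)}
        numeric = [
            value for value in values
            if isinstance(value, (int, float)) and not isinstance(value, bool)
        ]
        # Variance is only meaningful when every value in the field is numeric.
        if len(numeric) == len(values):
            entry["variance"] = pvariance(numeric)
        stats[field] = entry
    return stats
```

A field with low uniqueness, such as a status column that is almost always the same value, is the kind of redundancy a threshold-driven compressor can exploit, while a high-variance numeric field signals rows that differ in ways worth keeping.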

Compression Strategy

SmartCrusher follows a lossless-first path that attempts to compact arrays using schema-preserving encodings such as CSV-style compaction. If that approach yields savings below 30%, the compressor falls back to a lossy row-drop strategy, emitting a <<ccr:HASH …>> marker and storing culled rows for later retrieval. The retained payload includes sentinels like {"_ccr_dropped": …} so downstream consumers know data was elided.
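The lossless-first flow can be sketched in plain Python. Everything here is a simplified stand-in rather than Headroom's implementation: crush_sketch, to_csv_style, and CCR_STORE are hypothetical names, the lossy step simply keeps the first few rows instead of applying real heuristics, the sentinel value is assumed to be a count of dropped rows, and the marker carries only the hash.

```python
import hashlib
import json

# Hypothetical in-memory stand-in for the CCR retrieval store.
CCR_STORE = {}


def to_csv_style(rows):
    """Write the shared keys once, then one line of JSON-encoded values per row."""
    header = list(rows[0])
    lines = [",".join(header)]
    for row in rows:
        lines.append(",".join(json.dumps(row[key]) for key in header))
    return "\n".join(lines)


def crush_sketch(json_array, min_savings=0.30, max_items=3):
    """Try lossless compaction first; drop rows only if savings are too small."""
    rows = json.loads(json_array)
    if not isinstance(rows, list):
        return json_array, "passthrough"

    # CSV-style compaction applies only when every row has the same keys.
    uniform = bool(rows) and all(
        isinstance(row, dict) and list(row) == list(rows[0]) for row in rows
    )
    if uniform:
        compact = to_csv_style(rows)
        savings = 1 - len(compact) / len(json_array)
        if savings >= min_savings:
            return compact, "lossless"

    if len(rows) <= max_items:
        return json_array, "passthrough"

    # Lossy fallback: keep a prefix (placeholder policy) and store the rest.
    kept, dropped = rows[:max_items], rows[max_items:]
    key = hashlib.sha256(json_array.encode("utf-8")).hexdigest()[:12]
    CCR_STORE[key] = dropped
    kept.append({"_ccr_dropped": len(dropped)})  # assumed sentinel payload
    return f"{json.dumps(kept)} <<ccr:{key}>>", "lossy"
```

The ordering is the point of the design: rows are discarded only after the cheaper, fully recoverable encoding has failed to reach the savings threshold.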

Telemetry and Fallback Behavior

After a successful non-passthrough crush, the compressor records a compression event to TOIN so that it can learn from the array shape and the outcome of the chosen strategy. If the compiled Rust extension is unavailable, the import fails immediately; there is no silent fallback because the compressor cannot operate without the native crate.

Kompress-Base: Token-Level Neural Text Compression

Implementation and Configuration

The KompressCompressor class in headroom/transforms/kompress_compressor.py loads a ModernBERT-style transformer model—either an ONNX INT8 runtime or a PyTorch backend—auto-downloaded from HuggingFace under the identifier chopratejas/kompress-base. Behavior is governed by KompressConfig, which sets the model ID, inference device (auto, cpu, or gpu), chunk size, score threshold, and CCR flag.

Compression Strategy

Unlike SmartCrusher, Kompress-Base is score-based: the model predicts a keep-mask per token, and tokens scoring below the configured threshold are stripped. An optional target_ratio can enforce a fixed keep percentage. Whenever the resulting compression ratio drops below 0.8, the compressor appends a CCR marker and stores the original text via _store_in_ccr, mirroring the retrieval semantics of the JSON path.
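The selection step can be illustrated without a model by supplying the per-token scores directly. This is a minimal sketch under stated assumptions: select_tokens and compress_sketch are hypothetical names, the scores would come from the transformer in practice, and the compression ratio is assumed to mean kept tokens divided by original tokens.

```python
def select_tokens(tokens, scores, threshold=0.5, target_ratio=None):
    """Keep tokens by score threshold, or keep a fixed share when target_ratio is set."""
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    if target_ratio is not None:
        # Fixed keep percentage: retain the highest-scoring tokens.
        keep_count = max(1, round(len(tokens) * target_ratio))
        ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
        keep = set(ranked[:keep_count])
    else:
        keep = {i for i, score in enumerate(scores) if score >= threshold}
    # Preserve the original token order in the output.
    return [token for i, token in enumerate(tokens) if i in keep]


def compress_sketch(tokens, scores, threshold=0.5, target_ratio=None, ccr_below=0.8):
    """Return (text, ratio, needs_ccr); ratio is kept tokens / original tokens."""
    if not tokens:
        return "", 1.0, False
    kept = select_tokens(tokens, scores, threshold, target_ratio)
    ratio = len(kept) / len(tokens)
    return " ".join(kept), ratio, ratio < ccr_below


tokens = ["the", "server", "returned", "an", "error", "500"]
scores = [0.1, 0.9, 0.7, 0.2, 0.95, 0.8]
text, ratio, needs_ccr = compress_sketch(tokens, scores)
print(text, round(ratio, 2), needs_ccr)
```

With the default threshold the two low-scoring function words are dropped, and because the ratio falls below 0.8 the sketch flags the result as needing a stored original.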

Telemetry and Fallback Behavior

Each successful compression calls _kompress_content_signature and logs the resulting signature of the original text to TOIN for adaptive learning. Runtime resilience differs from SmartCrusher: if neither ONNX nor PyTorch is installed, an informative ImportError is raised, but if model loading fails after installation the compressor silently falls back to passthrough and returns the original text unchanged. ONNX CPU inference is lightweight but gains no speed-up from batching, whereas PyTorch on GPU delivers roughly 2× to 2.3× speed-up for batched inference.
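The two failure paths described above can be sketched as follows. The function names, the load_model callback, and the preference for ONNX over PyTorch are assumptions made for illustration; only the module names onnxruntime and torch are real.

```python
import importlib.util


def pick_backend(available=None):
    """Return "onnx" or "torch"; raise ImportError when neither is installed."""
    if available is None:
        # Probe for the packages without importing them.
        available = {
            name for name in ("onnxruntime", "torch")
            if importlib.util.find_spec(name) is not None
        }
    if "onnxruntime" in available:
        return "onnx"
    if "torch" in available:
        return "torch"
    raise ImportError("Kompress needs onnxruntime or torch; install one of them.")


def compress_with_fallback(text, load_model, available=None):
    """Fail loudly on a missing backend, pass through on a model load failure."""
    backend = pick_backend(available)  # a missing backend propagates as ImportError
    try:
        model = load_model(backend)  # hypothetical loader returning a callable
    except Exception:
        return text  # load failure: return the original text unchanged
    return model(text)
```

The asymmetry is deliberate in this reading of the behavior: a missing dependency is a setup problem the user can fix, while a load failure at runtime should not break the request that triggered it.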

Key Differences at a Glance

Primary target: SmartCrusher tackles large JSON arrays such as tool-output tables and logs; Kompress-Base targets long plain-text narration and assistant messages.
Algorithmic approach: SmartCrusher uses deterministic Rust logic with variance and similarity heuristics; Kompress-Base uses a trained neural model to score individual token importance.
CCR integration: SmartCrusher embeds JSON sentinels and mirrors the Rust store into Python’s compression_store; Kompress-Base appends a plain-text marker and stores originals via _store_in_ccr.
Failure mode: SmartCrusher fails loudly if the Rust extension is missing; Kompress-Base raises ImportError for missing backends but falls back to passthrough on model load failure.

Practical Code Examples

Using SmartCrusher on JSON Arrays

import json

from headroom.transforms.smart_crusher import SmartCrusher, SmartCrusherConfig

# Thresholds that control when and how hard an array is crushed.
cfg = SmartCrusherConfig(
    min_items_to_analyze=10,
    max_items_after_crush=12,
    preserve_change_points=False,
)

crusher = SmartCrusher(config=cfg)

# Use an array longer than min_items_to_analyze; shorter arrays are
# presumably left untouched.
json_array = json.dumps([{"a": i, "b": i * 2} for i in range(50)])

result = crusher.crush(json_array, query="summarize")
print("Compressed:", result.compressed)
print("Strategy:", result.strategy)

Using Kompress-Base on Plain Text

from headroom.transforms.kompress_compressor import KompressCompressor

# Default configuration; the model is downloaded on first use.
compressor = KompressCompressor()
long_text = "Lorem ipsum ... (very long tool output)"

result = compressor.compress(long_text)
print("Compressed:", result.compressed)
print("Ratio:", result.compression_ratio)
print("CCR key:", result.cache_key)

Orchestrating Both in a Pipeline

from headroom.transforms.pipeline import Pipeline
from headroom.transforms.smart_crusher import SmartCrusher
from headroom.transforms.kompress_compressor import KompressCompressor

pipeline = Pipeline(
    compressors=[
        SmartCrusher(),
        KompressCompressor(),
    ]
)

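# json_array and long_text are the inputs defined in the two examples above.
messages = [
    {"role": "tool", "content": json_array},
    {"role": "assistant", "content": long_text},
]

# my_tokenizer is a placeholder for the tokenizer object your setup provides.
compressed = pipeline.apply(messages, tokenizer=my_tokenizer)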

The Pipeline class automatically routes structured tool output to SmartCrusher and free-form text to KompressCompressor according to the content-type routing rules defined in headroom/transforms/pipeline.py.
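The article does not reproduce the routing rules themselves, so the sketch below shows one plausible way to route by content: parse the message body and treat a non-empty JSON array of objects as structured output. The names route and apply_sketch, the route labels, and the sniffing rule are all assumptions, not the logic in pipeline.py.

```python
import json


def route(content):
    """Return "smart_crusher" for a JSON array of objects, otherwise "kompress"."""
    try:
        parsed = json.loads(content)
    except (TypeError, ValueError):
        return "kompress"  # not JSON at all: treat as free-form text
    if isinstance(parsed, list) and parsed and all(
        isinstance(item, dict) for item in parsed
    ):
        return "smart_crusher"
    return "kompress"


def apply_sketch(messages, compressors):
    """Compress each message with the callable registered for its route."""
    result = []
    for message in messages:
        compress = compressors[route(message["content"])]
        result.append({**message, "content": compress(message["content"])})
    return result


# Stub compressors stand in for the real SmartCrusher and KompressCompressor.
stubs = {
    "smart_crusher": lambda text: f"[crushed {len(json.loads(text))} rows]",
    "kompress": lambda text: " ".join(text.split()[:3]),
}
messages = [
    {"role": "tool", "content": '[{"a": 1}, {"a": 2}]'},
    {"role": "assistant", "content": "The job finished without any errors today"},
]
for message in apply_sketch(messages, stubs):
    print(message["role"], "->", message["content"])
```

Routing on the parsed shape rather than on the message role means a tool that returns prose still reaches the text compressor.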

Summary

SmartCrusher is a Rust-backed, schema-aware compressor designed for JSON arrays, using lossless compaction and configurable row-drops with CCR markers.
Kompress-Base is a ModernBERT-based neural compressor for plain text that drops low-scoring tokens and supports ONNX and PyTorch inference backends.
Both algorithms store originals in Headroom’s compression_store when aggressive compression fires, enabling later retrieval via CCR hashes.
SmartCrusherConfig and KompressConfig provide fine-grained thresholds for array heuristics and model inference respectively.
The unified Pipeline in pipeline.py binds both compressors so the correct strategy is applied per message content type.

Frequently Asked Questions

What is the main difference between SmartCrusher and Kompress-Base?

SmartCrusher compresses structured JSON arrays by removing redundant rows while preserving schema, whereas Kompress-Base is a neural text compressor that evaluates token importance in unstructured plain text. According to the chopratejas/headroom source code, they target entirely different payload shapes and rely on distinct backends—Rust heuristics versus transformer inference.

How does Headroom decide when to store the original content using CCR?

Both compressors write to the CCR store once compression discards enough content that the original may be needed later. SmartCrusher emits a CCR marker when lossless compaction saves less than 30% and it falls back to dropping rows, while Kompress-Base appends a marker whenever the compression ratio falls below 0.8. In both cases the original payload is hashed and stored in headroom/cache/compression_store.py.
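A hash-keyed store and its retrieval path can be sketched in a few lines. This is not the code in compression_store.py: the class name, the method names, the SHA-256 key length, and the hash-only marker format are assumptions chosen to show the round trip.

```python
import hashlib
import re


class CompressionStoreSketch:
    """Hypothetical content-addressed store: the key is a hash of the original."""

    def __init__(self):
        self._items = {}

    def put(self, original):
        key = hashlib.sha256(original.encode("utf-8")).hexdigest()[:16]
        self._items[key] = original
        return key

    def get(self, key):
        return self._items.get(key)


# Matches the hash portion of a marker; any further marker fields are ignored.
MARKER = re.compile(r"<<ccr:([0-9a-f]+)")


def attach_marker(compressed, original, store):
    """Store the original and append a marker that points back to it."""
    return f"{compressed} <<ccr:{store.put(original)}>>"


def retrieve(text, store):
    """Return the stored original referenced by the first marker, if any."""
    match = MARKER.search(text)
    return store.get(match.group(1)) if match else None
```

Because the key is derived from the content, storing the same original twice yields the same key, so repeated compressions of identical payloads do not grow the store.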

Can Kompress-Base run without a GPU?

Yes. The KompressCompressor defaults to ONNX INT8 CPU inference if no GPU is available, as configured via the device field in KompressConfig. PyTorch GPU acceleration is optional and provides a 2× to 2.3× batched speed-up, but the compressor remains fully functional on CPU-only hosts.

What happens if the SmartCrusher Rust extension is missing?

The Python import of headroom._core.SmartCrusher will fail immediately with an import error. Unlike Kompress-Base, SmartCrusher offers no passthrough fallback because its logic is executed entirely inside the compiled Rust crate; the extension must be present for JSON array compression to proceed.
