# SmartCrusher vs Kompress-Base: JSON and Text Compression in Headroom

> Compare SmartCrusher for JSON compression and Kompress-Base for text. Discover their speed, methods, and performance differences for efficient data handling.

- Repository: [Tejas Chopra/headroom](https://github.com/chopratejas/headroom)
- Tags: deep-dive
- Published: 2026-06-07

---

**SmartCrusher uses deterministic statistical reduction written in Rust to compress JSON arrays in roughly a millisecond, while Kompress-Base uses a ModernBERT token-classification model via ONNX to shrink arbitrary plain text, trading inference latency for higher compression on unstructured content.**

The `chopratejas/headroom` repository provides two distinct compression strategies for reducing token volume before LLM inference. Both integrate with Headroom’s CCR (Compress-Cache-Retrieve) and TOIN (Telemetry-Optimized-Information-Network) systems, but they target different data structures and take opposite architectural approaches: one relies on native Rust statistics, the other on learned neural compression.

## Core Architecture and Algorithm

### SmartCrusher for JSON Arrays

In [`headroom/transforms/smart_crusher.py`](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/smart_crusher.py), SmartCrusher wraps a Rust-native implementation (`headroom._core.SmartCrusher`) that performs statistically driven reduction on JSON-style arrays returned by tool calls. The algorithm preserves first and last items, errors, outliers, and relevance-scored entries, then samples the remainder so the result stays representative. This deterministic approach achieves **70%–95% compression** on typical tool outputs such as search results or logs.
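The selection rules described above can be illustrated with a small pure-Python sketch. This is not the Rust implementation: the `error` key, the `score` field, the z-score threshold, and the even-spacing sampler are all assumptions chosen for illustration, and relevance scoring against a query is left out.

```python
import statistics

def crush_sketch(items, max_items=15, score_key="score", z_threshold=3.0):
    """Illustrative only: keep first/last, errors, and outliers, then sample the rest."""
    if len(items) <= max_items:
        return list(items)

    keep = {0, len(items) - 1}  # always preserve the first and last items

    # Preserve entries that look like errors (hypothetical "error" key).
    keep.update(i for i, item in enumerate(items) if item.get("error"))

    # Preserve numeric outliers by z-score on a hypothetical score field.
    values = [it[score_key] for it in items if isinstance(it.get(score_key), (int, float))]
    if len(values) >= 2:
        mean = statistics.fmean(values)
        stdev = statistics.pstdev(values)
        if stdev > 0:
            for i, item in enumerate(items):
                v = item.get(score_key)
                if isinstance(v, (int, float)) and abs(v - mean) / stdev > z_threshold:
                    keep.add(i)

    # Spend whatever budget is left on evenly spaced samples of the remainder.
    budget = max_items - len(keep)
    if budget > 0:
        candidates = [i for i in range(len(items)) if i not in keep]
        step = len(candidates) / budget
        keep.update(candidates[int(k * step)] for k in range(budget))

    return [items[i] for i in sorted(keep)]
```

Because every step is a fixed rule over the input, the same array always yields the same reduced array, which is the property the article refers to as deterministic.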

### Kompress-Base for Plain Text

Located in [`headroom/transforms/kompress_compressor.py`](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/kompress_compressor.py), Kompress-Base uses a ModernBERT token-classification model trained on token-level compression tasks. It runs inference through **ONNX** (or, optionally, PyTorch or CoreML) to classify and remove redundant tokens from arbitrary strings, paragraphs, or markdown. This ML-based approach delivers **80%–95% reduction** on general text but requires the `[ml]` extra and approximately **50 MB–200 MB** of model weights.
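To make the keep-or-drop idea concrete, here is a toy sketch of compression by token classification. No model is loaded: the per-token probabilities are made-up numbers standing in for classifier output, and the function name and threshold are hypothetical.

```python
def drop_low_value_tokens(tokens, keep_probs, threshold=0.5):
    """Keep tokens whose (hypothetical) keep-probability meets the threshold."""
    if len(tokens) != len(keep_probs):
        raise ValueError("tokens and keep_probs must have the same length")
    return [tok for tok, p in zip(tokens, keep_probs) if p >= threshold]

tokens = "the server returned a 500 error on the login endpoint".split()
# Made-up scores standing in for per-token classifier output.
keep_probs = [0.1, 0.9, 0.6, 0.1, 0.95, 0.9, 0.2, 0.1, 0.8, 0.7]

compressed = " ".join(drop_low_value_tokens(tokens, keep_probs))
print(compressed)  # server returned 500 error login endpoint
```

A real model would produce the probabilities from subword tokens rather than whitespace-split words, so the mapping back to text is more involved than shown here.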

## Performance and Resource Requirements

SmartCrusher executes in approximately **1 ms per array** because processing occurs entirely in compiled Rust with no ML library overhead. Kompress-Base requires **50 ms–200 ms per block** due to neural inference and ONNX session initialization, making it noticeably slower but still suitable for final-stage compression of remaining plain-text payloads.
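Figures like these depend on hardware and payload size, so it is worth measuring them locally. The helper below is a generic timing sketch, not part of Headroom; `median_latency_ms` and its parameters are hypothetical names.

```python
import statistics
import time

def median_latency_ms(fn, payload, runs=20, warmup=1):
    """Median wall-clock latency of fn(payload) in milliseconds."""
    for _ in range(warmup):
        fn(payload)  # warm-up calls absorb one-time costs such as model loading
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(payload)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)
```

Passing a compressor's entry point as `fn` (for example, the `crush` or `compress` methods used in the usage examples in this article) gives a per-call number that can be compared against the ranges quoted above.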

Installation footprints diverge significantly. SmartCrusher adds roughly **5 MB** as a compiled Python extension. Kompress-Base requires `pip install "headroom-ai[ml]"` to pull in `onnxruntime` and download the `chopratejas/kompress-base` model (referenced via `HF_MODEL_ID` at line 39 of the compressor module), resulting in a larger dependency tree.

## Configuration API and Fallback Behavior

SmartCrusher exposes explicit control through the `SmartCrusherConfig` dataclass, allowing fine-grained tuning of `min_items_to_analyze`, `max_items_after_crush`, and `preserve_change_points`. If the Rust extension fails to load, the import fails outright and the transform is unavailable; there is no fallback path.

Kompress-Base relies on environment variables such as `HEADROOM_KOMPRESS_BACKEND` and optional constructor arguments like `batch_size` rather than a dedicated configuration class. It degrades gracefully: if the model cannot be loaded or an error occurs during inference, the compressor falls into passthrough mode and returns the original text unchanged.
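The passthrough behavior can be sketched as a thin wrapper around any compression callable. This is an illustration of the pattern rather than Headroom's code; `PassthroughResult` and `compress_or_passthrough` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class PassthroughResult:
    text: str
    compressed: bool  # False means the input was returned unchanged

def compress_or_passthrough(text, compress_fn):
    """Run compress_fn(text); on any failure, return the input untouched."""
    try:
        return PassthroughResult(text=compress_fn(text), compressed=True)
    except Exception:
        # Any load or inference failure degrades to passthrough instead of raising.
        return PassthroughResult(text=text, compressed=False)
```

Carrying an explicit flag alongside the text makes skipped compression visible to callers, which matters because a silent passthrough otherwise looks like a successful call.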

## Pipeline Integration and Telemetry

Both compressors integrate with Headroom’s caching and telemetry layers but handle them differently. SmartCrusher emits sentinel objects (`{"_ccr_dropped": "..."}`) directly within the Rust core, which are later stripped by `strip_ccr_sentinels`. It also calls `toin.record_compression()` after real compression events (see lines 23–26 of [`smart_crusher.py`](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/smart_crusher.py)).
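As a rough picture of what stripping sentinels involves, the sketch below removes any list element that is an object carrying the `_ccr_dropped` key. It is a hypothetical stand-in, not the body of `strip_ccr_sentinels`, and the sentinel value in the usage example is invented.

```python
SENTINEL_KEY = "_ccr_dropped"

def strip_sentinels_sketch(value):
    """Recursively drop sentinel objects from lists; leave everything else intact."""
    if isinstance(value, list):
        return [
            strip_sentinels_sketch(v)
            for v in value
            if not (isinstance(v, dict) and SENTINEL_KEY in v)
        ]
    if isinstance(value, dict):
        return {k: strip_sentinels_sketch(v) for k, v in value.items()}
    return value

crushed = {"results": [{"id": 1}, {"_ccr_dropped": "placeholder"}, {"id": 1000}]}
print(strip_sentinels_sketch(crushed))  # {'results': [{'id': 1}, {'id': 1000}]}
```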

Kompress-Base generates TOIN signatures via `_kompress_content_signature` (implemented in lines 50–84 of [`kompress_compressor.py`](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/kompress_compressor.py)) and handles CCR markers at the pipeline level rather than injecting sentinels into the compressed text itself.

In the default Headroom pipeline architecture, SmartCrusher operates at **Stage 3** (immediately before the Context Manager) to shrink structured tool outputs, while Kompress-Base runs optionally at **Stage 4** as a final ML-driven layer for unstructured text reduction.
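The division of labor between the two stages amounts to routing by payload shape. The function below is a minimal sketch of that decision under the assumption that payloads arrive as strings; `route_payload` is a hypothetical helper, not a Headroom API.

```python
import json

def route_payload(payload):
    """Return "json" for payloads that parse to a list or object, else "text"."""
    try:
        parsed = json.loads(payload)
    except (json.JSONDecodeError, TypeError):
        return "text"
    # Bare scalars such as "42" are valid JSON but offer no array to reduce.
    return "json" if isinstance(parsed, (list, dict)) else "text"

print(route_payload('[{"title": "Result 0"}]'))  # json
print(route_payload("Deployment finished with 3 warnings."))  # text
```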

## Practical Usage Examples

### Compressing JSON Arrays with SmartCrusher

```python
from headroom import SmartCrusher, SmartCrusherConfig

cfg = SmartCrusherConfig(
    min_items_to_analyze=5,
    max_items_after_crush=15,
    preserve_change_points=True,
)

crusher = SmartCrusher(config=cfg)

# Example: compress an array of 1,000 search results
tool_output = {"results": [{"title": f"Result {i}", "score": i} for i in range(1_000)]}
compressed = crusher.crush(tool_output, query="best restaurants in NYC")

print(compressed.was_modified)  # True
# The returned JSON retains only ~15 representative items
```

### Compressing Plain Text with Kompress-Base

```python
from headroom.transforms.kompress_compressor import KompressCompressor

# Auto-downloads chopratejas/kompress-base on first use
compressor = KompressCompressor()

long_text = "Lorem ipsum..." * 1000  # Large text block or log output
result = compressor.compress(long_text)

print(result.compressed)  # Shortened version with 80-95% fewer tokens
print(result.compressed == long_text)  # False when compression was applied
```

## Summary

- **SmartCrusher** targets JSON arrays using deterministic Rust-based statistical trimming, offering roughly 1 ms latency and minimal dependencies.
- **Kompress-Base** targets arbitrary plain text using ModernBERT token classification via ONNX, providing higher compression ratios at the cost of inference latency and a larger installation footprint.
- SmartCrusher uses `SmartCrusherConfig` for explicit control and fails hard on import errors; Kompress-Base uses environment variables and fails gracefully with passthrough.
- Both support Headroom’s CCR caching and TOIN telemetry, but SmartCrusher injects sentinel objects while Kompress-Base handles CCR at the pipeline level.
- In the default pipeline, SmartCrusher runs at Stage 3 (structured data), while Kompress-Base operates optionally at Stage 4 (final text compression).

## Frequently Asked Questions

### Can I use SmartCrusher on non-JSON text?

No. SmartCrusher is designed specifically for JSON-style arrays and objects returned by tool calls, as implemented in the Rust core. For arbitrary plain text, markdown, or log files, use Kompress-Base or the standard Context Manager transforms.

### Why is Kompress-Base slower than SmartCrusher?

Kompress-Base performs neural inference using a ModernBERT model through ONNX Runtime, which requires loading model weights and running token-level classification for each block. SmartCrusher executes purely in compiled Rust code using statistical heuristics, with no model loading or GPU acceleration, which results in approximately **1 ms** versus **50–200 ms** per operation.

### What happens if the Kompress-Base model fails to download?

The compressor falls back to passthrough mode, returning the original text unchanged without raising an exception. This ensures pipeline stability even when ML dependencies are unavailable, though you should monitor TOIN telemetry signatures to detect when compression is skipped.

### Do I need both compressors in my Headroom pipeline?

No. SmartCrusher is included by default and handles structured JSON reduction at Stage 3. Kompress-Base is optional and only necessary if you require additional compression on unstructured text content after the Context Manager stage. Most pipelines benefit from SmartCrusher alone; add Kompress-Base only when processing long-form text outputs that exceed token limits.