SmartCrusher vs CodeCompressor vs Kompress-Base: Headroom Compression Algorithms Explained
SmartCrusher is the deterministic, Rust-powered code compressor for JSON arrays, while Kompress-base is the ModernBERT-based token compressor for plain text; they differ in input format, algorithm, and performance profile.
In the chopratejas/headroom repository, SmartCrusher functions as the code compressor for structured arrays, whereas Kompress-base is a learned token-selection model for free-form text. Both transforms reduce token count in LLM context windows, but they target fundamentally different data shapes and are implemented with separate technology stacks.
What Is SmartCrusher?
SmartCrusher is a code compressor that targets JSON arrays produced by LLM tool calls. It preserves the outer array schema while applying rule-based heuristics to drop redundant rows and, when necessary, insert retrieval markers.
The heavy lifting lives in the native Rust crate crates/headroom-core, which is exposed to Python via PyO3. The Python shim in headroom/transforms/smart_crusher.py forwards calls to the Rust implementation. Configuration is defined in the SmartCrusherConfig dataclass (lines 95–134), and the wrapper class SmartCrusher appears at lines 139–150. The algorithm analyzes array statistics—item count, token count, variance, uniqueness, and similarity—and uses thresholds such as min_items_to_analyze and variance_threshold to decide which rows to remove. No machine learning model is involved.
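To make the idea of statistics-driven row dropping concrete, here is a minimal Python sketch. It is not the Rust implementation: the functions array_stats and drop_exact_duplicates are hypothetical, and the only rule shown (drop exact duplicate rows once the array is large enough) is an assumption chosen for illustration. Only the min_items_to_analyze name comes from the text above.

```python
import json
from statistics import pvariance


def array_stats(items):
    """Compute simple statistics over a list of JSON objects (illustrative only)."""
    serialized = [json.dumps(item, sort_keys=True) for item in items]
    lengths = [len(s) for s in serialized]
    return {
        "item_count": len(items),
        # Share of rows that are distinct after canonical serialization.
        "unique_ratio": len(set(serialized)) / len(items) if items else 0.0,
        # Variance of serialized row lengths, a crude proxy for row diversity.
        "length_variance": pvariance(lengths) if lengths else 0.0,
    }


def drop_exact_duplicates(items, min_items_to_analyze=5):
    """Keep the first occurrence of each row; leave small arrays untouched."""
    if len(items) < min_items_to_analyze:
        return items
    seen = set()
    kept = []
    for item in items:
        key = json.dumps(item, sort_keys=True)
        if key not in seen:
            seen.add(key)
            kept.append(item)
    return kept
```

Because every step is a pure function of the input, the same array always yields the same output, which is the property the deterministic, model-free design relies on.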
What Is Kompress-Base?
Kompress-base is a token compressor designed for plain text, long assistant messages, and unstructured tool output. Instead of rule-based heuristics, it uses a trained ModernBERT model to predict which words to keep and which to discard.
The implementation is located in headroom/transforms/kompress_compressor.py. Configuration is handled by KompressConfig (lines 529–550), and the main entry point is the KompressCompressor class (lines 575–596). The underlying model can run on ONNX Runtime with int8 quantization or on PyTorch with GPU acceleration, and the compressor includes batching and semaphore limits to avoid oversubscribing the device.
Core Differences Between SmartCrusher and Kompress-Base
Primary Purpose and Input Format
- SmartCrusher expects a JSON array ([ {...}, ... ]) and operates at the array level. It keeps the schema intact but may remove individual rows.
- Kompress-base accepts any raw string. It tokenises the text into words and processes fixed-size chunks controlled by the chunk_words parameter.
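The word-level chunking described for Kompress-base can be sketched as follows. This is a simplified illustration that assumes whitespace splitting; the function name split_into_chunks is hypothetical, and the real tokenisation in kompress_compressor.py may differ.

```python
def split_into_chunks(text, chunk_words=350):
    """Split text on whitespace and group the words into fixed-size chunks.

    The last chunk may be shorter than chunk_words.
    """
    if chunk_words <= 0:
        raise ValueError("chunk_words must be positive")
    words = text.split()
    return [words[i:i + chunk_words] for i in range(0, len(words), chunk_words)]
```

Each chunk can then be scored independently, which keeps the model input bounded regardless of how long the original string is.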
Algorithm and Model Architecture
- SmartCrusher applies deterministic heuristics based on JSON array statistics. It may drop rows and emit a CCR sentinel such as {"_ccr_dropped": "..."}. The core logic delegates through self._rust.crush, and the Rust entry point crush_array_json returns the structured result.
- Kompress-base loads a dual-head ModernBERT defined in _get_model_class() (lines ~28–46). A token_head predicts per-token keep/discard probabilities, while a span_conv CNN predicts span importance. Final keep scores are computed as token_probs * (0.5 + 0.5 * span_scores).
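The score combination for Kompress-base is easy to check by hand. The sketch below applies the formula token_probs * (0.5 + 0.5 * span_scores) to plain Python lists; the helper names keep_scores and select_words are hypothetical, and using >= against score_threshold is an assumption about how the threshold is applied.

```python
def keep_scores(token_probs, span_scores):
    """Combine per-token keep probabilities with span importance scores.

    A span score of 1.0 leaves the token probability unchanged;
    a span score of 0.0 halves it.
    """
    if len(token_probs) != len(span_scores):
        raise ValueError("token_probs and span_scores must have equal length")
    return [p * (0.5 + 0.5 * s) for p, s in zip(token_probs, span_scores)]


def select_words(words, scores, score_threshold=0.5):
    """Keep the words whose combined score reaches the threshold (assumed >=)."""
    return [w for w, score in zip(words, scores) if score >= score_threshold]
```

The span term therefore acts as a multiplier between 0.5 and 1.0: it can demote a token that sits in an unimportant span, but it never raises a score above the token head's own probability.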
Output Format and CCR Integration
- SmartCrusher returns a JSON-encoded array plus a CCR marker when rows are dropped. The Rust side holds a CCR store, and the Python shim mirrors entries via _mirror_ccr_to_python_store (lines ~72–90). The marker can later be resolved through /v1/retrieve.
- Kompress-base returns a shortened string. CCR is optional: when enable_ccr is true and the compression ratio falls below a threshold, _store_in_ccr (lines ~1150–1156) appends a hash marker and persists it in the Python compression_store defined in headroom/cache/compression_store.py.
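The optional marker flow for Kompress-base (store the original, append a hash marker when compression was aggressive enough) can be sketched like this. Everything here is an assumption for illustration: the InMemoryStore class, the [ccr:...] marker format, the 16-character SHA-256 prefix, and the max_ratio default are hypothetical and do not reflect the actual compression_store API.

```python
import hashlib


class InMemoryStore:
    """Hypothetical stand-in for a compression store keyed by content hash."""

    def __init__(self):
        self._entries = {}

    def put(self, original):
        key = hashlib.sha256(original.encode("utf-8")).hexdigest()[:16]
        self._entries[key] = original
        return key

    def get(self, key):
        return self._entries.get(key)


def compress_with_marker(original, compressed, store, max_ratio=0.8):
    """Append a retrieval marker only when the text shrank below max_ratio."""
    ratio = len(compressed) / len(original) if original else 1.0
    if ratio >= max_ratio:
        return compressed  # little was removed, so nothing worth retrieving
    key = store.put(original)
    return f"{compressed} [ccr:{key}]"
```

Gating the marker on the ratio avoids paying the marker's own token cost when the compressed text is nearly identical to the original.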
Performance Profile
- SmartCrusher is pure Rust, offering very low latency with no GPU or CPU oversubscription concerns.
- Kompress-base inference is GPU-accelerated via PyTorch or runs on CPU through ONNX Runtime. The code includes batching and semaphore limits to prevent overwhelming the device.
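The batching and semaphore pattern mentioned for Kompress-base can be sketched with the standard library alone. This is a generic illustration of the technique, not the code in kompress_compressor.py; BoundedRunner, batched, run_all, and all default values are hypothetical names and numbers.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def batched(items, batch_size):
    """Split a list into consecutive batches of at most batch_size items."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


class BoundedRunner:
    """Runs a callable while allowing at most max_concurrent calls at once."""

    def __init__(self, max_concurrent=2):
        self._slots = threading.Semaphore(max_concurrent)
        self._lock = threading.Lock()
        self._active = 0
        self.peak = 0  # highest concurrency observed, for inspection

    def run(self, fn, batch):
        with self._slots:  # blocks while all slots are taken
            with self._lock:
                self._active += 1
                self.peak = max(self.peak, self._active)
            try:
                return fn(batch)
            finally:
                with self._lock:
                    self._active -= 1


def run_all(fn, items, batch_size=4, max_concurrent=2, workers=8):
    """Process every batch on a thread pool, gated by the semaphore."""
    runner = BoundedRunner(max_concurrent)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda b: runner.run(fn, b), batched(items, batch_size)))
    return results, runner.peak
```

Even with eight worker threads, no more than max_concurrent batches reach fn at the same time, which is how a semaphore keeps a shared device from being oversubscribed.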
How to Use SmartCrusher for JSON Arrays
from headroom.transforms.smart_crusher import (
    SmartCrusherConfig,
    smart_crush_tool_output,
)

cfg = SmartCrusherConfig(
    min_items_to_analyze=10,
    max_items_after_crush=12,
    similarity_threshold=0.85,
)

crushed, was_modified, info = smart_crush_tool_output(
    content=tool_output,  # JSON string representing an array
    config=cfg,
)

print("Compressed JSON:", crushed)
print("Strategy used:", info)  # e.g. "smart:row_drop"
The helper smart_crush_tool_output is implemented at lines 996–1009.
How to Use Kompress-Base for Plain-Text Compression
from headroom.transforms.kompress_compressor import (
    KompressCompressor,
    KompressConfig,
)

cfg = KompressConfig(
    model_id="chopratejas/kompress-base",
    chunk_words=350,
    score_threshold=0.5,
    enable_ccr=True,
)

compressor = KompressCompressor(config=cfg)
result = compressor.compress(long_text)

print("Compressed text:", result.compressed)
print(f"Saved {result.tokens_saved} tokens ({result.savings_percentage:.1f}%).")
The compress method entry point is located at lines 602–630.
Configuration Options
Both algorithms expose tunable settings through dedicated dataclasses.
- SmartCrusherConfig (headroom/transforms/smart_crusher.py, lines 95–134): controls min_items_to_analyze, variance_threshold, similarity_threshold, max_items_after_crush, and related rule thresholds.
- KompressConfig (headroom/transforms/kompress_compressor.py, lines 529–550): controls model_id, chunk_words, score_threshold, device selection, and CCR toggling.
Shared infrastructure such as CCRConfig and TransformResult lives in headroom/config.py, while the central compression_store.py handles marker persistence for both transforms when CCR is enabled.
Summary
- SmartCrusher is a fast, rule-based Rust compressor for structured JSON arrays produced by tool calls.
- Kompress-base is a ModernBERT-powered token-selection engine that shortens free-form text while preserving informative words.
- SmartCrusher tightly integrates CCR through its Rust core; Kompress-base offers optional CCR via the Python compression_store.
- Use SmartCrusher when you need schema-preserving compression for tabular or array-shaped JSON data.
- Use Kompress-base when you need to shrink unstructured natural language or long assistant messages.
Frequently Asked Questions
Does SmartCrusher use a machine learning model?
No. SmartCrusher relies entirely on rule-based heuristics that analyze JSON array statistics such as item count, variance, and similarity. The algorithm is implemented in the crates/headroom-core Rust crate and exposed to Python through PyO3, making it deterministic and model-free.
Can Kompress-base run on CPU-only systems?
Yes. While Kompress-base can use PyTorch for GPU acceleration, it also supports ONNX Runtime with int8 quantization for CPU inference. The code uses batching and semaphore limits to prevent oversubscribing the available device.
What does CCR mean in these compressors?
According to the headroom source code, CCR refers to a retrieval marker system used for context caching. SmartCrusher emits CCR sentinels like {"_ccr_dropped": "..."} directly from the Rust core, while Kompress-base optionally appends hash markers that are stored in headroom/cache/compression_store.py for later retrieval.
When should I choose SmartCrusher over Kompress-base?
Choose SmartCrusher when your data is a structured JSON array—such as database query results or tool outputs with rigid schemas—and you need low-latency, schema-preserving compression. Choose Kompress-base when your input is unstructured plain text or lengthy assistant messages where a learned model can selectively drop less important tokens.
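That choice can be expressed as a small routing check. The sketch below is a hypothetical helper, not part of headroom: choose_transform and its return labels are invented for illustration, and the rule (a non-empty JSON array of objects goes to SmartCrusher, everything else to Kompress-base) is an assumption based on the input formats described in this article.

```python
import json


def choose_transform(content):
    """Return "smart_crusher" for a JSON array of objects, else "kompress"."""
    try:
        parsed = json.loads(content)
    except (json.JSONDecodeError, TypeError):
        return "kompress"  # not valid JSON, so treat it as plain text
    if isinstance(parsed, list) and parsed and all(isinstance(x, dict) for x in parsed):
        return "smart_crusher"
    return "kompress"
```

A caller could use the returned label to pick between smart_crush_tool_output and KompressCompressor.compress; whether headroom routes content this way internally is not stated in the source.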