# How to Configure SmartCrusher for JSON Array Compression vs Traditional Compression in Headroom

> Learn how to configure SmartCrusher for JSON array compression in Headroom. Discover its advantages over traditional compression by preserving critical data through change-point detection and key-order guarantees.

- Repository: [Tejas Chopra/headroom](https://github.com/chopratejas/headroom)
- Tags: how-to-guide
- Published: 2026-06-10

---

**SmartCrusher preserves structurally critical items in JSON arrays through configurable change-point detection, field matching, and key-order guarantees, unlike traditional byte-level compression, which exploits statistical redundancy without regard to what the data means.**

SmartCrusher is the default JSON-array compressor in the Headroom framework, designed specifically for LLM contexts where aggressive token reduction must not sacrifice semantically relevant data. While traditional compression treats all array elements uniformly, SmartCrusher implements semantic-aware preservation rules that protect headers, footers, and domain-specific fields. This guide explains how to configure these preservation mechanisms through the Rust core and available language SDKs.

## Understanding SmartCrusher Preservation Mechanisms

SmartCrusher operates through three complementary preservation strategies defined in [`crates/headroom-core/src/transforms/smart_crusher/config.rs`](https://github.com/chopratejas/headroom/blob/main/crates/headroom-core/src/transforms/smart_crusher/config.rs). These mechanisms function as additive filters: any item satisfying at least one condition is exempt from removal.

### Change-Point Preservation

The `preserve_change_points` option guarantees that the first and last *N* items of an array remain intact, protecting critical header and footer information such as timestamps, request IDs, and error messages. When enabled (default: `true`), the planner in [`planning.rs`](https://github.com/chopratejas/headroom/blob/main/planning.rs) automatically marks these boundary items as "anchors" that the crusher cannot remove regardless of compression aggressiveness.
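
As a rough illustration of the boundary rule described above, the sketch below computes which indices fall within the first and last *N* positions of an array. The function name `boundary_anchor_indices` and the parameter `n` are hypothetical and chosen for this example; this is not the code in `planning.rs`, only one way the described behavior could be expressed.

```python
def boundary_anchor_indices(length: int, n: int) -> set[int]:
    """Return the indices of the first and last n items of an array.

    Hypothetical helper: mirrors the described behavior, not Headroom's code.
    """
    if n <= 0 or length <= 0:
        return set()
    head = range(0, min(n, length))           # first n indices
    tail = range(max(length - n, 0), length)  # last n indices
    return set(head) | set(tail)              # overlap collapses for short arrays


items = [{"seq": i} for i in range(10)]
anchors = boundary_anchor_indices(len(items), n=2)
print(sorted(anchors))  # [0, 1, 8, 9]
```

For arrays shorter than `2 * n`, the head and tail ranges overlap and every item ends up anchored, so nothing is removed.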

### Field-Based Preservation

The `preserve_fields` configuration accepts a list of field names (or SHA-256-derived hashes of those names, truncated to 8 bytes). An array item is retained when one of these fields holds a value that matches a query token. In [`planning.rs`](https://github.com/chopratejas/headroom/blob/main/planning.rs), the function `item_has_preserve_field_match` checks each array element against the hashed field names and marks matches as preservation anchors. This ensures that objects containing specific user IDs, request IDs, or other domain keys survive the compression process.
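
The matching step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a port of `item_has_preserve_field_match`: it assumes the hash is taken over the UTF-8 field name, that truncation keeps the first 8 bytes of the digest, and that a "match" means the stringified value equals a query token. The helper names `field_hash` and `has_preserve_field_match_sketch` are hypothetical.

```python
import hashlib


def field_hash(name: str) -> bytes:
    """SHA-256 of the field name, truncated to 8 bytes (assumed: first 8)."""
    return hashlib.sha256(name.encode("utf-8")).digest()[:8]


def has_preserve_field_match_sketch(
    item: dict, preserve_hashes: set[bytes], query_tokens: set[str]
) -> bool:
    """True if a preserved field in `item` holds a value equal to a query token."""
    for key, value in item.items():
        # Assumption: exact string equality stands in for the real matching rule.
        if field_hash(key) in preserve_hashes and str(value) in query_tokens:
            return True
    return False


preserve_hashes = {field_hash("user_id"), field_hash("request_id")}
query_tokens = {"u-42"}

kept = has_preserve_field_match_sketch({"user_id": "u-42"}, preserve_hashes, query_tokens)
dropped = has_preserve_field_match_sketch({"user_id": "u-7"}, preserve_hashes, query_tokens)
```

Here `kept` is `True` because the preserved field `user_id` carries a value that appears among the query tokens, while `dropped` is `False` because the value does not match.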

### Key-Order Preservation

When processing dictionary objects, the `preserve_keys` option maintains the insertion order of specified keys and prevents the removal of entire key-value pairs. The underlying implementation relies on the `preserve_order` feature of `serde_json`, as used in [`crusher.rs`](https://github.com/chopratejas/headroom/blob/main/crusher.rs), ensuring deterministic round-trips for downstream tooling that expects stable key ordering.
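
To show what a stable key-order round-trip looks like, here is a small Python analogy. It uses only the standard `json` module, which keeps insertion order by default; it does not call Headroom, and the sample keys are made up for illustration.

```python
import json

raw = '{"timestamp": "2026-06-10T00:00:00Z", "session_id": "s-1", "level": "info"}'

parsed = json.loads(raw)            # dicts keep insertion order
round_tripped = json.dumps(parsed)  # keys are emitted in the same order

# A serializer that sorts keys would break tooling that expects the original order.
sorted_output = json.dumps(parsed, sort_keys=True)

print(list(json.loads(round_tripped)))  # ['timestamp', 'session_id', 'level']
print(list(json.loads(sorted_output)))  # ['level', 'session_id', 'timestamp']
```

The second output illustrates the kind of reordering that an order-preserving map is meant to rule out.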

## Configuration Architecture and Source Files

The configuration structure is defined in [`crates/headroom-core/src/transforms/smart_crusher/config.rs`](https://github.com/chopratejas/headroom/blob/main/crates/headroom-core/src/transforms/smart_crusher/config.rs) as `SmartCrusherConfig`, which is injected into the `SmartCrusherPlanner` constructor and subsequently passed to the crusher implementation.

**Key implementation files include:**

- **[`config.rs`](https://github.com/chopratejas/headroom/blob/main/crates/headroom-core/src/transforms/smart_crusher/config.rs)**: Defines `SmartCrusherConfig` with boolean and vector fields for the three preservation mechanisms.
- **[`planning.rs`](https://github.com/chopratejas/headroom/blob/main/planning.rs)**: Implements the anchor detection logic around line 17, determining which items are eligible for removal based on the preservation rules.
- **[`crusher.rs`](https://github.com/chopratejas/headroom/blob/main/crusher.rs)**: Executes the actual removal while respecting preservation flags, with the array order guarantee documented at line 109.
- **[`sdk/typescript/src/types/config.ts`](https://github.com/chopratejas/headroom/blob/main/sdk/typescript/src/types/config.ts)**: Exposes TypeScript type definitions that map directly to the Rust configuration struct.

## Practical Configuration Examples

### Python SDK Configuration

When using the Python SDK, instantiate `HeadroomClient` with preservation parameters that forward to the Rust core:

```python
from headroom import HeadroomClient

client = HeadroomClient(
    preserve_change_points=True,  # Keep the first and last N items as anchors
    preserve_fields=["user_id", "request_id"],  # Domain-specific retention
)

messages = [
    {"role": "system", "content": "Header context"},
    # ... hundreds of tool outputs ...
    {"role": "system", "content": "Footer context"},
]

result = client.compress(messages)
print(f"Reduced tokens: {result.tokens_before} → {result.tokens_after}")
```

### TypeScript SDK Configuration

The TypeScript SDK exposes identical options through the `HeadroomConfig` interface:

```typescript
import { HeadroomClient } from "headroom-ai";

const client = new HeadroomClient({
  preserve_change_points: true,
  preserve_keys: ["session_id"],  // Maintain key order
  preserve_fields: ["order_id", "invoice_id"],
});

const messages = [
  { role: "assistant", content: "..." },
  // ... large array ...
];

const result = await client.compress(messages);
console.log(`Compression ratio: ${result.tokensAfter / result.tokensBefore}`);
```

### Low-Level Direct Configuration

For scenarios requiring direct control without the client abstraction, instantiate `SmartCrusherConfig` explicitly:

```python
from headroom import compress, SmartCrusherConfig

config = SmartCrusherConfig(
    preserve_change_points=True,
    preserve_fields=["trace_id"],
    preserve_keys=["timestamp"],
)

payload = [...]  # Your JSON array

compressed = compress(payload, config=config)
print(f"Achieved ratio: {compressed.compression_ratio}")
```

## SmartCrusher vs Traditional Compression

Traditional approaches such as GZIP or standard JSON minifiers work at the byte or syntax level: they encode repetitive patterns more compactly or strip whitespace, without understanding data semantics. SmartCrusher differs fundamentally by operating at the semantic level:

- **Structural awareness**: Traditional methods cannot distinguish between a critical log header and redundant debug output; SmartCrusher preserves change-points and specified fields regardless of their content frequency.
- **Query-aware retention**: Unlike traditional compression, SmartCrusher uses `preserve_fields` to maintain items relevant to the current conversation context, matching field values against active query tokens.
- **Deterministic ordering**: Serializers in a traditional pipeline may emit object keys in sorted or otherwise altered order; SmartCrusher's `preserve_keys` keeps the specified fields in their original positions, relying on the `preserve_order` feature of `serde_json`.
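
For contrast, the sketch below shows what a byte-level compressor does with a repetitive JSON array, using Python's standard `gzip` module. The payload is made up for illustration. The byte count shrinks, but decompression returns every item unchanged, so the number of array elements a model would eventually read is the same as before.

```python
import gzip
import json

# Hypothetical, highly repetitive tool output.
payload = [{"seq": i, "level": "debug", "msg": "heartbeat"} for i in range(200)]

encoded = json.dumps(payload).encode("utf-8")
compressed = gzip.compress(encoded)
restored = json.loads(gzip.decompress(compressed).decode("utf-8"))

print(len(compressed) < len(encoded))  # True: fewer bytes on the wire
print(len(restored) == len(payload))   # True: same number of items after decompression
```

This is why byte-level compression and item-level crushing address different costs: the former reduces storage and transfer size, the latter reduces what ends up in the prompt.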

## Summary

- **SmartCrusher** is Headroom's default JSON-array compressor that uses semantic preservation rules rather than statistical compression.
- Configure preservation through three mechanisms: `preserve_change_points` (boolean), `preserve_fields` (string list), and `preserve_keys` (string list).
- The configuration struct resides in [`config.rs`](https://github.com/chopratejas/headroom/blob/main/crates/headroom-core/src/transforms/smart_crusher/config.rs) and is processed by [`planning.rs`](https://github.com/chopratejas/headroom/blob/main/planning.rs) (anchor detection) and [`crusher.rs`](https://github.com/chopratejas/headroom/blob/main/crusher.rs) (execution).
- Both Python and TypeScript SDKs expose these options directly, mapping to the underlying Rust implementation.
- Enable `preserve_change_points` to protect array boundaries, `preserve_fields` to retain domain-specific objects, and `preserve_keys` to maintain dictionary key ordering.

## Frequently Asked Questions

### What is the difference between SmartCrusher and traditional JSON compression?

Traditional compression algorithms like GZIP or standard minifiers remove whitespace and reduce redundancy based on statistical patterns, treating all data uniformly. SmartCrusher operates at the semantic level, using `preserve_change_points` to protect structural boundaries and `preserve_fields` to retain items containing specific domain keys, ensuring critical data survives aggressive token reduction.

### How does the preserve_fields option work internally?

The `preserve_fields` option hashes each specified field name using SHA-256 (truncated to 8 bytes) and stores these hashes in `preserve_field_hashes`. During the planning stage in [`planning.rs`](https://github.com/chopratejas/headroom/blob/main/planning.rs), the function `item_has_preserve_field_match` checks whether an array element has a field whose name hash is in that set and whose value matches a query token. Matching elements are marked as preservation anchors that [`crusher.rs`](https://github.com/chopratejas/headroom/blob/main/crusher.rs) will not remove.

### Can I use SmartCrusher without the Headroom client?

Yes, you can instantiate `SmartCrusherConfig` directly and pass it to the `compress` function. This low-level approach bypasses the `HeadroomClient` abstraction and is useful when integrating the compression logic into existing data pipelines or when processing JSON payloads that are not part of a standard message flow.

### What happens if multiple preservation rules conflict?

The preservation mechanisms are additive rather than exclusive. If an array item satisfies any single preservation condition (change-point detection, field matching, or key-order requirements), it is automatically exempt from removal. The crusher in [`crusher.rs`](https://github.com/chopratejas/headroom/blob/main/crusher.rs) processes the union of all anchor sets generated by the planner.
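
The union semantics can be sketched in a few lines of Python. This is a simplified illustration with a hypothetical function name, `keep_anchored`; it assumes each rule yields a set of item indices and, for brevity, drops every non-anchored item, whereas a real crusher may retain additional items.

```python
def keep_anchored(items: list, anchor_sets: list[set[int]]) -> list:
    """Keep items whose index appears in at least one anchor set, in original order."""
    keep = set().union(*anchor_sets)  # additive: any single rule is enough
    return [item for index, item in enumerate(items) if index in keep]


items = ["header", "noise-1", "match", "noise-2", "noise-3", "footer"]
boundary_anchors = {0, 5}  # e.g. from change-point preservation
field_anchors = {2}        # e.g. from field matching
no_anchors = set()         # a rule that matched nothing

print(keep_anchored(items, [boundary_anchors, field_anchors, no_anchors]))
# ['header', 'match', 'footer']
```

Because the sets are merged rather than intersected, adding a rule can only increase what is kept, so rules never conflict with one another.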