How to Configure SmartCrusher to Preserve Relevant Items During JSON Array Compression

Configure SmartCrusher by setting preserve_change_points, preserve_fields, and preserve_keys in the SmartCrusherConfig struct to protect header/footer items, specific field values, and key order during JSON array compression.

SmartCrusher is the default JSON array compressor in the Headroom repository, designed to shrink large tool-output payloads while retaining data critical for LLM processing. When you configure SmartCrusher to preserve relevant items, you ensure that timestamps, user IDs, and structural boundaries survive aggressive token reduction.

Understanding SmartCrusher Preservation Mechanisms

SmartCrusher implements three complementary preservation strategies defined in crates/headroom-core/src/transforms/smart_crusher/config.rs. These mechanisms are additive: any array item that satisfies at least one preservation condition is marked as an anchor and is exempt from removal.
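
To make the additive rule concrete, here is a minimal Python sketch, not Headroom's code. The names find_anchor_indices, is_boundary, and has_error are hypothetical, and the two conditions are placeholders for whatever the real planner checks. The point is only that an item becomes an anchor as soon as any one condition holds.

```python
from typing import Any, Callable, Sequence

# Hypothetical predicate shape: (index, item, total_length) -> bool
Predicate = Callable[[int, Any, int], bool]


def find_anchor_indices(items: Sequence[Any], predicates: Sequence[Predicate]) -> set[int]:
    """Return the indices of items that satisfy at least one predicate."""
    total = len(items)
    return {
        i
        for i, item in enumerate(items)
        if any(pred(i, item, total) for pred in predicates)
    }


# Two illustrative conditions (assumptions, not Headroom's actual rules).
def is_boundary(i: int, _item: Any, total: int) -> bool:
    return i == 0 or i == total - 1


def has_error(_i: int, item: Any, _total: int) -> bool:
    return isinstance(item, dict) and item.get("level") == "error"


rows = [
    {"level": "info"},
    {"level": "error"},
    {"level": "info"},
    {"level": "info"},
]
anchors = find_anchor_indices(rows, [is_boundary, has_error])
print(sorted(anchors))  # [0, 1, 3]
```

Index 2 is the only item that matches neither condition, so it is the only candidate for removal.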

Change-Point Preservation with preserve_change_points

The preserve_change_points option guards the first and last N items of an array (default 10), protecting structural boundaries like timestamps, IDs, and error messages. When enabled (default true), the planner in planning.rs automatically marks these boundary items as "must-keep" regardless of compression scoring. This ensures that log stream headers, sentinel entries, and footer summaries survive compression.
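
The boundary rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the planner in planning.rs: boundary_indices and compress_keeping_boundaries are hypothetical names, and the toy compressor drops every interior item, whereas the real one scores them.

```python
def boundary_indices(length: int, n: int = 10) -> set[int]:
    """Indices of the first and last n items of an array of the given length."""
    if length <= 0 or n <= 0:
        return set()
    head = range(0, min(n, length))
    tail = range(max(length - n, 0), length)
    return set(head) | set(tail)


def compress_keeping_boundaries(items: list, n: int = 10) -> list:
    """Toy compressor: keep only boundary items, drop everything else."""
    keep = boundary_indices(len(items), n)
    return [item for i, item in enumerate(items) if i in keep]


logs = [f"line {i}" for i in range(100)]
kept = compress_keeping_boundaries(logs, n=3)
print(kept)
# ['line 0', 'line 1', 'line 2', 'line 97', 'line 98', 'line 99']
```

When the array is shorter than 2 × n, the head and tail ranges overlap and every item is kept, so short arrays pass through untouched in this sketch.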

Field-Based Preservation with preserve_fields

The preserve_fields option accepts a list of field names (or SHA-256-derived hashes) that trigger retention when any query token matches the field value. For example, configuring preserve_fields = ["user_id", "request_id"] protects objects whose values for those identifiers match. The matching logic resides in the item_has_preserve_field_match function within planning.rs, which checks whether element values contain substrings matching the query tokens associated with the preserved fields.
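
The following Python sketch shows one plausible reading of this matching, assuming field names are hashed with SHA-256 and truncated to 8 bytes, and that an item matches when the value of a preserved field contains a query token as a substring. The helpers field_hash, build_preserve_hashes, and item_matches are hypothetical; the actual Rust function may differ in what it hashes and how it compares.

```python
import hashlib
from typing import Any, Iterable


def field_hash(name: str) -> bytes:
    """SHA-256 of the field name, truncated to the first 8 bytes."""
    return hashlib.sha256(name.encode("utf-8")).digest()[:8]


def build_preserve_hashes(fields: Iterable[str]) -> set[bytes]:
    """Precompute the truncated hashes for the configured field names."""
    return {field_hash(f) for f in fields}


def item_matches(item: Any, preserve_hashes: set[bytes], query_tokens: Iterable[str]) -> bool:
    """True if a preserved field's value contains any query token as a substring."""
    if not isinstance(item, dict):
        return False
    tokens = [t.lower() for t in query_tokens if t]
    for key, value in item.items():
        if field_hash(key) not in preserve_hashes:
            continue  # only preserved fields are inspected
        text = str(value).lower()
        if any(tok in text for tok in tokens):
            return True
    return False


hashes = build_preserve_hashes(["user_id", "request_id"])
rows = [
    {"user_id": "u-1842", "msg": "ok"},
    {"user_id": "u-0007", "msg": "ok"},
    {"msg": "u-1842 mentioned"},
]
matches = [item_matches(r, hashes, ["u-1842"]) for r in rows]
print(matches)  # [True, False, False]
```

The third row mentions the token only in msg, which is not a preserved field, so it is not marked as an anchor in this sketch.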

Key-Order Preservation with preserve_keys

When processing dictionary objects, preserve_keys maintains the insertion order of specified keys and prevents their removal. The underlying Rust implementation relies on the preserve_order feature of serde_json to guarantee deterministic key ordering, which is critical for downstream tooling that expects stable JSON structures. This option is exposed in sdk/typescript/src/types/config.ts for JavaScript/TypeScript users.
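
As a rough illustration of the behavior described here, the Python sketch below trims an object while protecting listed keys and leaving the surviving keys in their original order. trim_object and its drop_keys parameter are hypothetical and exist only for this example. Python dicts keep insertion order, which plays the role that the preserve_order feature plays for serde_json maps in Rust.

```python
import json


def trim_object(obj: dict, preserve_keys: list[str], drop_keys: set[str]) -> dict:
    """Drop keys listed in drop_keys unless they are preserved.

    Surviving keys keep their original insertion order.
    """
    protected = set(preserve_keys)
    return {k: v for k, v in obj.items() if k in protected or k not in drop_keys}


record = {"session_id": "s-1", "debug": "verbose text", "status": "ok", "trace": "t-9"}
trimmed = trim_object(
    record,
    preserve_keys=["session_id"],
    drop_keys={"debug", "trace", "session_id"},  # session_id is listed but protected
)
print(json.dumps(trimmed))  # {"session_id": "s-1", "status": "ok"}
```

Even though session_id appears in drop_keys, the preserve list wins, and it stays in first position because nothing reorders the map.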

Where the Preservation Logic Lives

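The preservation pipeline spans three core files in the Headroom codebase:

  • config.rs: defines SmartCrusherConfig and the preservation options.
  • planning.rs: detects anchors, including boundary items and preserve-field matches.
  • crusher.rs: executes the compression.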

Configuration Examples by Language

Python SDK Configuration

When using the headroom-ai Python package, pass preservation options directly to HeadroomClient or the compress function:

# compress is assumed to be exported from the same package as the client
from headroom import HeadroomClient, SmartCrusherConfig, compress

# Configure via client constructor
client = HeadroomClient(
    preserve_change_points=True,  # Keep first/last 10 items (default)
    preserve_fields=["user_id", "request_id"],  # Retain items with these fields
)

# Or use the low-level compress function
# (payload is the JSON array you want to compress, defined elsewhere)
config = SmartCrusherConfig(
    preserve_change_points=True,
    preserve_fields=["trace_id"],
    preserve_keys=["session_id"],  # Maintain key order
)
result = compress(payload, config=config)
print(f"Compression ratio: {result.compression_ratio}")

TypeScript/JavaScript SDK Configuration

The TypeScript SDK exposes the same options through the type definitions in sdk/typescript/src/types/config.ts:

import { HeadroomClient } from "headroom-ai";

const client = new HeadroomClient({
  preserve_change_points: true,
  preserve_fields: ["order_id", "invoice_id"],
  preserve_keys: ["session_id"],  // Preserve key order in objects
});

const result = await client.compress(messages);
console.log(`Reduced from ${result.tokensBefore} to ${result.tokensAfter} tokens`);

Summary

  • SmartCrusher provides three additive preservation mechanisms: change-point guarding, field-based matching, and key-order stabilization.
  • Configure preservation options in SmartCrusherConfig (Rust), HeadroomClient (Python/TypeScript), or the low-level compress function.
  • Change-point preservation (preserve_change_points) protects the first and last 10 array items by default.
  • Field-based preservation (preserve_fields) uses SHA-256 hashing to retain items containing specific field values.
  • The preservation logic is implemented in planning.rs (anchor detection) and crusher.rs (execution), with configuration defined in config.rs.

Frequently Asked Questions

What is the default behavior for change-point preservation?

By default, preserve_change_points is set to true in config.rs, which automatically protects the first and last 10 items of any JSON array from removal. This ensures that header metadata, timestamps, and footer summaries survive compression even under aggressive token reduction.

How does field-based preservation matching work?

When you specify preserve_fields, SmartCrusher hashes each field name using SHA-256 (truncated to 8 bytes) and stores these in preserve_field_hashes. During processing in planning.rs, the item_has_preserve_field_match function checks if any array element's values contain substrings matching query tokens associated with those fields, marking matching items as anchors.

Can I combine multiple preservation strategies?

Yes. The three preservation mechanisms are additive: an item is exempt from removal if it satisfies any single condition. You can enable preserve_change_points and also specify preserve_fields and preserve_keys to create multiple safety nets for critical data.

Does key-order preservation affect compression ratio?

No. The preserve_keys option only affects serialization, through the preserve_order feature of serde_json, and does not change which array items are removed. It keeps the specified keys in their original position in the output, which prevents structural reordering that might confuse downstream parsers.
