how-to-guide

How to Configure the compress() Function with CompressConfig for Different Compression Strategies

June 5, 2026 chopratejas/headroom ↗

The compress() function in Headroom uses a CompressConfig dataclass to control which messages get compressed, how aggressively the compression is applied, and which ML model handles the text reduction, enabling precise control over the pipeline's behavior.

The compress() function serves as the single-entry API for Headroom’s compression pipeline in the chopratejas/headroom repository. Whether you are optimizing long conversational contexts for LLM APIs or reducing token counts for specialized domains like code or logs, you configure the behavior through the CompressConfig dataclass. This configuration object determines eligibility rules, protection windows, and compression aggressiveness that cascade through the internal transform pipeline.

Understanding CompressConfig and the Compression Pipeline

The CompressConfig dataclass, defined in headroom/compress.py (lines 76–135), encapsulates all user-facing tuning knobs. It declares boolean flags such as compress_user_messages and compress_system_messages to specify which roles are eligible for modification, alongside protect_recent to preserve the last N messages in their original form. For the ML-based text compression stage (Kompress), it exposes target_ratio to set the desired retention percentage, min_tokens_to_compress to filter short messages, and kompress_model to select specific HuggingFace models or disable the step entirely.

When you invoke compress() (lines 158–168), the function merges any provided keyword arguments into a CompressConfig instance and passes it to the TransformPipeline. This pipeline, constructed in lines 38–46, orchestrates the ContentRouter and specific compressors. The router—implemented in headroom/transforms/content_router.py—reads the boolean eligibility flags and protection window to determine which messages reach the compression engines, while headroom/transforms/kompress_compressor.py handles the ML-based reduction using the target_ratio and kompress_model settings.

Four Compression Strategies with Code Examples

Strategy 1: Protect User Input and Recent Context (Coding Assistant)

For interactive coding assistants, preserving the user’s exact queries and recent turns maintains conversation coherence. This strategy leaves user messages untouched while compressing older system and assistant content.

from headroom import compress, CompressConfig

# Default config: compress_user_messages=False, protect_recent=4

config = CompressConfig()
result = compress(messages, model="gpt-4o", config=config)

print(f"Compression ratio: {result.compression_ratio}")

Key configuration fields:

compress_user_messages=False – Excludes user-role messages from compression.
protect_recent=4 – Leaves the last four messages uncompressed regardless of role.

Strategy 2: High-Fidelity Document Retention (50% Target)

When processing large financial reports or research documents where context retention outweighs token savings, increase the target_ratio to preserve roughly half of the content.

config = CompressConfig(
    compress_user_messages=True,  # Compress large user uploads

    target_ratio=0.5,             # Retain ~50% of tokens

    protect_recent=0,             # No protection window

)
result = compress(messages, model="claude-opus-4-20250514", config=config)

Strategy 3: Aggressive Log and Search Result Compression

For high-volume log streams or search results where token minimization is critical, apply aggressive ratios and optionally specify a domain-specific model.

config = CompressConfig(
    compress_user_messages=True,
    target_ratio=0.2,                      # Retain only 20%

    protect_recent=0,
    kompress_model="my-org/kompress-logs"  # Custom HuggingFace model

)
result = compress(messages, model="gpt-4o", config=config)

Setting kompress_model="disabled" skips the ML-based Kompress stage entirely, forcing the pipeline to rely on rule-based compressors defined in headroom/transforms/code_compressor.py and headroom/transforms/log_compressor.py.

Strategy 4: Inline Keyword Arguments for Quick Tuning

For ad-hoc adjustments without instantiating a config object, pass parameters directly to compress(). The function merges these kwargs into a default CompressConfig internally (lines 202–207).

result = compress(
    messages,
    model="gpt-4o",
    compress_user_messages=True,
    target_ratio=0.3,
    protect_recent=2,
)

Summary

The CompressConfig dataclass in headroom/compress.py centralizes all compression parameters, from eligibility flags (compress_user_messages) to ML aggressiveness (target_ratio).
The compress() function forwards this configuration to a TransformPipeline that routes messages through specialized compressors based on content type.
Use protect_recent to preserve conversational context, target_ratio to control retention percentage, and kompress_model to select or disable the ML backend.
You can pass a CompressConfig instance or inline keyword arguments depending on your reuse requirements.

Frequently Asked Questions

What parameters does CompressConfig accept?

CompressConfig accepts boolean toggles for compress_user_messages and compress_system_messages, integers for protect_recent and min_tokens_to_compress, floats for target_ratio, and strings for kompress_model. It also includes protect_analysis_context to guard messages containing analytical reasoning. See the dataclass definition in headroom/compress.py lines 76–135 for the complete schema.

How does target_ratio affect compression?

The target_ratio parameter directly controls the Kompress ML compressor in headroom/transforms/kompress_compressor.py. A value of 0.2 instructs the model to retain approximately 20% of the original token count, while 0.5 retains roughly half. This ratio applies only to messages routed through the Kompress stage; specialized compressors for code and logs use their own heuristics.

Can I use kwargs instead of creating a CompressConfig object?

Yes. The compress() entry point accepts any CompressConfig field as a keyword argument. Internally, it instantiates a default config and updates it with your kwargs (lines 158–168), allowing one-off adjustments without explicit object construction.

How do I disable the ML-based compression entirely?

Set kompress_model="disabled" in your CompressConfig. This prevents the pipeline from loading the Kompress transformer, forcing the router to rely solely on rule-based compressors like CodeCompressor and log compressors for token reduction.

Have a question about this repo?

These articles cover the highlights, but your codebase questions are specific. Give your agent direct access to the source. Share this with your agent to get started:

Share the following with your agent to get started:

curl -s "https://instagit.com/install.md"

Add to your MCP client configuration:

{
  "mcpServers": {
    "instagit": {
      "command": "npx",
      "args": ["-y", "instagit@latest"]
    }
  }
}

Ask your agent:

"Use Instagit MCP to understand how chopratejas/headroom works."

Works with

Claude Codex Cursor VS Code OpenClaw Any MCP Client

Maintain an open-source project? Get it listed too →