# How to Configure Headroom for Specific Compression Ratios Using target_ratio

> Configure Headroom for specific compression ratios using target_ratio. Learn how to control token retention and achieve precise compression levels for your LLM context.

- Repository: [Tejas Chopra/headroom](https://github.com/chopratejas/headroom)
- Tags: how-to-guide
- Published: 2026-06-05

---

**Pass a `target_ratio` value between 0 and 1 to Headroom's `compress()` function to retain a specific fraction of tokens, where 0.3 keeps 30% of the original tokens and achieves approximately 70% compression.**

The open-source Headroom library (chopratejas/headroom) provides deterministic token reduction for LLM contexts through its configurable compression pipeline. By specifying a `target_ratio` parameter, you can control exactly what proportion of your original text survives the compression process, making it ideal for managing context window limits in conversational AI applications.

## Understanding the target_ratio Parameter

The `target_ratio` parameter accepts a float between 0 and 1 that represents the **fraction of tokens to preserve**. For example, setting `target_ratio=0.25` instructs the compressor to retain the highest-scoring 25% of tokens and remove the remaining 75%. When you omit this parameter, the underlying model falls back to its internal `score_threshold` and decides dynamically which tokens to keep.
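To make the arithmetic concrete, here is a minimal sketch of how a ratio translates into token counts. The helper `tokens_kept` is hypothetical and not part of Headroom; it assumes the truncating `int(num_tokens × target_ratio)` formula that this guide describes for `KompressCompressor`.

```python
def tokens_kept(num_tokens: int, target_ratio: float) -> int:
    """Hypothetical helper: tokens retained for a ratio, truncating like int()."""
    return int(num_tokens * target_ratio)


for ratio in (0.25, 0.5):
    kept = tokens_kept(1000, ratio)
    print(f"target_ratio={ratio}: keep {kept} tokens, drop {1000 - kept}")

# Truncation matters for short inputs: 10 tokens at 0.25 keep 2, not 2.5 or 3
print(tokens_kept(10, 0.25))
```

Because the count is truncated, very short texts can end up slightly below the requested fraction.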

## How target_ratio Propagates Through the Compression Pipeline

Headroom passes `target_ratio` through three stages, from the public API surface down to the token selection algorithm.

### Entry Point in headroom/compress.py

The public `compress()` function in [`headroom/compress.py`](https://github.com/chopratejas/headroom/blob/main/headroom/compress.py) accepts `target_ratio` as a direct argument and forwards it into the transform pipeline. This serves as the primary user-facing interface for ratio-based compression.

### Routing via ContentRouter

The `ContentRouter` transform in [`headroom/transforms/content_router.py`](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/content_router.py) extracts the runtime `target_ratio` kwarg from the request context on **line 1475** and injects it into each downstream compressor. This routing layer ensures the ratio propagates correctly regardless of which specific compression backend processes the text.

### Token Selection in KompressCompressor

The actual compression logic resides in [`headroom/transforms/kompress_compressor.py`](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/kompress_compressor.py), where the `KompressCompressor` class implements deterministic top-k selection. On **line 695**, the code calculates `num_keep = int(num_tokens × target_ratio)` and retains exactly that many highest-scoring tokens. When `target_ratio` is provided, this overrides any model-specific thresholding behavior.
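The selection step can be illustrated with a short, self-contained sketch. This is not the code from `kompress_compressor.py`: the function `select_top_k`, the example scores, and the choice to return survivors in their original order are all assumptions made for illustration.

```python
def select_top_k(tokens: list[str], scores: list[float], target_ratio: float) -> list[str]:
    """Illustrative top-k selection: keep the highest-scoring fraction of tokens."""
    num_keep = int(len(tokens) * target_ratio)
    # Rank token positions by score, highest first (ties keep the earlier token)
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    # Assumption: surviving tokens are emitted in their original order
    keep = sorted(ranked[:num_keep])
    return [tokens[i] for i in keep]


tokens = ["the", "quick", "brown", "fox"]
scores = [0.1, 0.9, 0.4, 0.8]  # made-up importance scores
print(select_top_k(tokens, scores, target_ratio=0.5))
```

With four tokens and a ratio of 0.5, `num_keep` is 2, so only the two highest-scoring tokens survive, whatever threshold a model might otherwise apply.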

## Code Examples for Configuring Compression Ratios

### Basic Usage with the compress() Helper

```python
from headroom.compress import compress

messages = [
    {"role": "assistant", "content": "A very long answer …"},
    {"role": "tool", "content": "Results from a heavy computation …"},
]

# Keep only 25% of the original tokens
compressed = compress(messages, model="gpt-4o", target_ratio=0.25)

print(compressed)   # → list of messages with shortened content
```

### Manual Pipeline Configuration

```python
from headroom.transforms.content_router import ContentRouter
from headroom.transforms.kompress_compressor import KompressCompressor

# Sample input in the same shape as the basic example above
messages = [
    {"role": "tool", "content": "Results from a heavy computation …"},
]

# Build a pipeline that includes the ContentRouter and Kompress
router = ContentRouter()
kompress = KompressCompressor()

# Supply the ratio through the router's kwargs
router_kwargs = {"target_ratio": 0.4}   # keep 40% of tokens

# The router forwards the kwarg to KompressCompressor internally
result = router.apply(messages, transformer=kompress, **router_kwargs)
```

### Batch Processing with Variable Ratios

```python
from headroom.transforms.kompress_compressor import KompressCompressor

compressor = KompressCompressor()
texts = [
    "First long paragraph …",
    "Second long paragraph …",
    "Third short one.",
]

# One ratio per text, in the same order as `texts`
ratios = [0.3, 0.5, None]   # None: the third text uses the model's default decision

results = compressor.compress_batch(texts, target_ratio=ratios)

for r in results:
    print(r.compression_ratio, r.compressed)
```

## Summary

- **Valid Range**: `target_ratio` accepts float values from 0 to 1, interpreted as the fraction of tokens to retain.
- **Pipeline Flow**: The parameter travels from [`headroom/compress.py`](https://github.com/chopratejas/headroom/blob/main/headroom/compress.py) → `ContentRouter` (line 1475) → `KompressCompressor` (line 695).
- **Deterministic Output**: When specified, `target_ratio` triggers a top-k selection that keeps exactly `int(num_tokens × target_ratio)` tokens.
- **Optional Override**: Omitting `target_ratio` allows the model to use its internal `score_threshold` instead.
- **Batch Support**: Pass a list of ratios to `compress_batch()` for per-text granularity in batch operations.

## Frequently Asked Questions

### What happens if I set target_ratio to 0 or 1?

Setting `target_ratio=0` removes all tokens, resulting in empty content, while `target_ratio=1` preserves the entire text with no compression. Values outside the 0–1 range may raise validation errors, depending on the Headroom version you have installed.
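Since behavior outside the valid range can vary, one option is to validate the ratio in your own code before calling Headroom. The sketch below is a hypothetical guard, not something the library provides; the function name `checked_ratio` and its error messages are illustrative.

```python
import math


def checked_ratio(value: float) -> float:
    """Hypothetical guard: return value as a float if it is a usable target_ratio."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError(f"target_ratio must be a number, got {type(value).__name__}")
    if math.isnan(value) or not 0.0 <= value <= 1.0:
        raise ValueError(f"target_ratio must be between 0 and 1, got {value}")
    return float(value)


print(checked_ratio(0.3))
try:
    checked_ratio(1.5)
except ValueError as exc:
    print(f"rejected: {exc}")
```

Checking up front gives the same error regardless of which Headroom version is installed.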

### Does target_ratio work with all Headroom transformers?

While `target_ratio` is primarily implemented in `KompressCompressor`, the `ContentRouter` in [`headroom/transforms/content_router.py`](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/content_router.py) routes this parameter to any downstream transform that respects the convention. Other compressors like `SmartCrusher` in [`headroom/transforms/smart_crusher.py`](https://github.com/chopratejas/headroom/blob/main/headroom/transforms/smart_crusher.py) also honor the `target_ratio` kwarg when provided.

### Can I use different compression ratios for different message types?

Yes. Since `target_ratio` is passed as a runtime kwarg through the ContentRouter, you can invoke the `compress()` function multiple times with different ratios for assistant messages versus tool outputs, or use the batch API with a list of ratios to specify per-text retention rates.
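A per-role approach might look like the following sketch. The helper `compress_by_role` is hypothetical, and `fake_compress` is a stand-in that only truncates words so the example runs without Headroom installed. It assumes the real compression callable accepts a list of messages plus a `target_ratio` keyword and returns a list of messages.

```python
from typing import Callable, Optional

Message = dict[str, str]
CompressFn = Callable[..., list[Message]]


def compress_by_role(
    messages: list[Message],
    ratios: dict[str, Optional[float]],
    compress_fn: CompressFn,
) -> list[Message]:
    """Compress each message with the ratio configured for its role."""
    result: list[Message] = []
    for message in messages:
        ratio = ratios.get(message["role"])
        if ratio is None:
            # No ratio configured for this role: pass the message through untouched
            result.append(message)
        else:
            result.extend(compress_fn([message], target_ratio=ratio))
    return result


def fake_compress(messages: list[Message], target_ratio: float) -> list[Message]:
    """Hypothetical stand-in: keeps the leading fraction of whitespace-separated words."""
    shortened = []
    for message in messages:
        words = message["content"].split()
        kept = words[: int(len(words) * target_ratio)]
        shortened.append({**message, "content": " ".join(kept)})
    return shortened


messages = [
    {"role": "assistant", "content": "one two three four"},
    {"role": "tool", "content": "alpha beta gamma delta"},
    {"role": "user", "content": "left as is"},
]
compressed = compress_by_role(messages, {"assistant": 0.5, "tool": 0.25}, fake_compress)
for message in compressed:
    print(message["role"], "->", message["content"])
```

To use this with Headroom, you would swap `fake_compress` for a small wrapper around `compress()` that also supplies the `model` argument.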

### How does target_ratio interact with the model's internal scoring?

When you provide `target_ratio`, it completely overrides the model's internal `score_threshold` logic. The `KompressCompressor` calculates the exact number of tokens to keep based on your ratio and performs a deterministic top-k selection by token scores, ignoring any default thresholding behavior.