How to Configure CCR Retrieval for Your Specific Use Case

Configure CCR retrieval by setting the ccr_enabled, ccr_inject_tool, ccr_handle_responses, and ccr_context_tracking flags in CCRConfig and ProxyConfig to control whether the headroom_retrieve tool is injected and how the proxy resolves retrieval calls.

The headroom library implements a CCR (Compress-Cache-Retrieve) layer that enables lossless compression by storing original data in a local cache and exposing a headroom_retrieve tool. When you configure CCR retrieval for your specific use case, you determine exactly how the proxy handles dropped content, whether the LLM can request originals automatically, and how retrieval calls are resolved. These configurations are defined in headroom/config.py and applied through the ProxyConfig and CCRConfig classes.
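The compress-cache-retrieve idea can be illustrated without the library itself. The sketch below is a toy, in-memory stand-in: ToyCCRStore, its methods, and the truncated SHA-256 key are assumptions made for illustration, not headroom's implementation. It shows only the round trip: drop content, keep a hash, and recover the original later.

```python
import hashlib
import json


class ToyCCRStore:
    """Hypothetical in-memory stand-in for a compress-cache-retrieve store."""

    def __init__(self):
        self._originals = {}

    def compress(self, items, keep=2):
        """Keep the first `keep` items and cache the full list under a hash."""
        original = json.dumps(items, sort_keys=True)
        ccr_hash = hashlib.sha256(original.encode("utf-8")).hexdigest()[:16]
        self._originals[ccr_hash] = original
        return {
            "items": items[:keep],
            "dropped": max(len(items) - keep, 0),
            "ccr_hash": ccr_hash,
        }

    def retrieve(self, ccr_hash):
        """Return the cached original payload as a JSON string."""
        return self._originals[ccr_hash]


store = ToyCCRStore()
rows = [{"id": i, "status": "ok"} for i in range(5)]
compressed = store.compress(rows)

# The compressed view is lossy, but the hash leads back to the full payload.
restored = json.loads(store.retrieve(compressed["ccr_hash"]))
```

Because the original is cached before anything is dropped, the compression is lossless from the caller's point of view as long as the hash is kept.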

Core CCR Configuration Options

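The CCR system exposes four primary configuration flags that control its behavior. In the code examples below, the two CCRConfig flags are passed as the keyword arguments enabled and context_tracking: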

  • ccr_enabled — Defined in the CCRConfig class in headroom/config.py (line 388), this boolean activates the CCR store. When enabled, the compressor writes a ccr_hash that can be used later to retrieve the original payload.

  • ccr_inject_tool — Set in ProxyConfig, this boolean controls whether the proxy automatically adds the headroom_retrieve tool to the LLM's tool list. Keep this enabled if you want the LLM to decide when to fetch originals; disable it for manual retrieval workflows.

  • ccr_handle_responses — Also in ProxyConfig, this boolean determines whether the proxy automatically resolves headroom_retrieve calls. Disable this only when you need to intercept calls yourself for logging, transformation, or custom UI handling.

  • ccr_context_tracking — An optional boolean in CCRConfig that enables multi-turn awareness, allowing retrieved content to inform future compression decisions. This is useful for long-running agents that need to remember which dropped items were later expanded.
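One way to reason about how the flags combine is to spell out who ends up responsible for retrieval. The sketch below uses a hypothetical ToyFlags dataclass that only mirrors the flag names; it is not the real CCRConfig or ProxyConfig, and the defaults and the precedence between flags are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyFlags:
    """Hypothetical mirror of the four flags; not the real config classes."""
    ccr_enabled: bool = True
    ccr_inject_tool: bool = True
    ccr_handle_responses: bool = True
    ccr_context_tracking: bool = False


def describe(flags):
    """Summarise who is responsible for retrieval under a flag combination."""
    if not flags.ccr_enabled:
        return "no originals stored; nothing to retrieve"
    if not flags.ccr_inject_tool:
        return "originals stored; application retrieves by hash"
    if not flags.ccr_handle_responses:
        return "LLM can call headroom_retrieve; application resolves the calls"
    return "LLM can call headroom_retrieve; proxy resolves the calls"


default_mode = describe(ToyFlags())
manual_mode = describe(ToyFlags(ccr_inject_tool=False))
intercept_mode = describe(ToyFlags(ccr_handle_responses=False))
```

The three non-disabled outcomes correspond to the standard, custom-interface, and intercepted patterns described in the next section.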

Configuration Patterns for Common Use Cases

Standard Reversible Compression (Default)

For straightforward reversible compression where the LLM automatically retrieves content when needed, enable CCR with default behavior. This configuration injects the retrieval tool and handles responses automatically.

from headroom import Headroom, CCRConfig, ProxyConfig

# Enable CCR with default behaviour (tool injection + auto-response handling)
ccr_cfg = CCRConfig(enabled=True)
proxy_cfg = ProxyConfig(ccr_config=ccr_cfg)
headroom = Headroom(proxy_config=proxy_cfg)

# The LLM may call headroom_retrieve when it needs full content
compressed = headroom.compress(messages)

Custom Retrieval Interface

When building a custom UI or dashboard where you control retrieval manually, disable automatic tool injection. The compressor still generates the hash, but you retrieve the original using the hash directly.

from headroom import Headroom, CCRConfig, ProxyConfig

# Turn on CCR but inject neither the retrieval tool nor the retrieval marker
ccr_cfg = CCRConfig(enabled=True, inject_retrieval_marker=False)
proxy_cfg = ProxyConfig(ccr_config=ccr_cfg, ccr_inject_tool=False)
headroom = Headroom(proxy_config=proxy_cfg)

result = headroom.compress(messages)

# Use the hash for manual retrieval
original_json = headroom.ccr_get(result.ccr_hash)

Intercepted Response Handling

For workflows requiring fine-grained control over every retrieval call, enable the tool but disable automatic response handling. This allows you to parse and process headroom_retrieve calls programmatically.

from headroom import Headroom, CCRConfig, ProxyConfig

proxy_cfg = ProxyConfig(
    ccr_config=CCRConfig(enabled=True),
    ccr_inject_tool=True,        # tool is still injected
    ccr_handle_responses=False,  # proxy will not resolve calls automatically
)
headroom = Headroom(proxy_config=proxy_cfg)

# Process tool calls manually.
# run_some_interaction() is a placeholder for however your application
# obtains an LLM response that contains tool calls.
response = headroom.run_some_interaction()
for tool_use in response.tool_calls:
    if tool_use["name"] == "headroom_retrieve":
        payload = headroom.ccr_get(tool_use["arguments"]["hash"])
        # Process payload as needed
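When you resolve calls yourself, it helps to decide up front what happens with unknown hashes and what gets logged. The following is a minimal sketch under assumptions: the tool-call shape (a "name" plus an "arguments" dict with a "hash" key) follows the example above, while resolve_tool_calls, the audit log, and the dict-backed store are hypothetical helpers. In a real setup the lookup argument would be replaced by the ccr_get call shown above.

```python
import json

# Stand-in for the CCR store; a plain dict keyed by hash.
fake_store = {"abc123": json.dumps([{"id": 1}, {"id": 2}])}


def resolve_tool_calls(tool_calls, lookup):
    """Resolve headroom_retrieve calls and skip every other tool."""
    results, audit_log = [], []
    for call in tool_calls:
        if call["name"] != "headroom_retrieve":
            continue
        ccr_hash = call["arguments"]["hash"]
        payload = lookup(ccr_hash)  # returns None when the hash is unknown
        audit_log.append({"hash": ccr_hash, "found": payload is not None})
        results.append({
            "tool": call["name"],
            "content": payload if payload is not None else "error: unknown hash",
        })
    return results, audit_log


calls = [
    {"name": "headroom_retrieve", "arguments": {"hash": "abc123"}},
    {"name": "search", "arguments": {"q": "logs"}},
    {"name": "headroom_retrieve", "arguments": {"hash": "missing"}},
]
results, audit_log = resolve_tool_calls(calls, fake_store.get)
```

Returning an explicit error string for a missing hash, rather than raising, lets the calling code hand the failure back to the LLM as a normal tool result.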

Multi-Turn Context Tracking

For agents running many rounds of compression, enable ccr_context_tracking to maintain awareness of which dropped items were retrieved. This feedback improves future compression decisions.

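from headroom import Headroom, CCRConfig, ProxyConfig

# Enable CCR together with multi-turn context tracking
proxy_cfg = ProxyConfig(
    ccr_config=CCRConfig(enabled=True, context_tracking=True)
)
headroom = Headroom(proxy_config=proxy_cfg)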
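To make the feedback idea concrete, here is a toy sketch of how retrieval history could influence later compression. ToyRetrievalTracker, the per-category counting, and the threshold of two retrievals are all assumptions for illustration; the article does not describe how headroom implements context tracking internally.

```python
from collections import Counter


class ToyRetrievalTracker:
    """Hypothetical tracker of which dropped categories were later expanded."""

    def __init__(self):
        self.retrieved = Counter()

    def record_retrieval(self, category):
        """Note that content of this category was fetched after being dropped."""
        self.retrieved[category] += 1

    def should_keep(self, category, threshold=2):
        """Keep a category uncompressed once it was retrieved `threshold` times."""
        return self.retrieved[category] >= threshold


tracker = ToyRetrievalTracker()
tracker.record_retrieval("error_logs")
tracker.record_retrieval("error_logs")
tracker.record_retrieval("debug_logs")

# On the next turn, categories that keep getting expanded are left intact.
turn = {"error_logs": ["e1", "e2"], "debug_logs": ["d1", "d2"]}
kept = {name: rows for name, rows in turn.items() if tracker.should_keep(name)}
```

The point of the sketch is the loop: content that is repeatedly retrieved is a signal that dropping it was the wrong call.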

Where Configuration Is Applied

The configuration system propagates settings through multiple integration points:

  • Python Library — The Headroom class accepts a proxy_config argument containing CCRConfig. The configuration merges into the runtime environment and propagates to the Rust core via the shim layer in headroom/_shim.py.

  • CLI and Proxy — Pass flags through environment variables or command-line options. For example:

headroom proxy --port 8787 \
    --ccr-enabled true \
    --ccr-inject-tool false \
    --ccr-handle-responses false

The CLI parses these flags in headroom/cli.py and builds a ProxyConfig internally.
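As a rough illustration of turning such flags into a settings object, the sketch below uses the standard-library argparse module. It is not the code in headroom/cli.py: ToyProxySettings, parse_bool, build_settings, and the default values are hypothetical, and only the flag names are taken from the command above.

```python
import argparse
from dataclasses import dataclass


@dataclass
class ToyProxySettings:
    """Hypothetical settings holder; stands in for the real ProxyConfig."""
    port: int
    ccr_enabled: bool
    ccr_inject_tool: bool
    ccr_handle_responses: bool


def parse_bool(text):
    """Accept 'true'/'false' style values as used in the command above."""
    value = text.strip().lower()
    if value in {"true", "1", "yes"}:
        return True
    if value in {"false", "0", "no"}:
        return False
    raise argparse.ArgumentTypeError(f"expected true or false, got {text!r}")


def build_settings(argv):
    """Parse an explicit argument list into a settings object."""
    parser = argparse.ArgumentParser(prog="toy-proxy")
    parser.add_argument("--port", type=int, default=8787)
    parser.add_argument("--ccr-enabled", type=parse_bool, default=True)
    parser.add_argument("--ccr-inject-tool", type=parse_bool, default=True)
    parser.add_argument("--ccr-handle-responses", type=parse_bool, default=True)
    args = parser.parse_args(argv)
    return ToyProxySettings(
        port=args.port,
        ccr_enabled=args.ccr_enabled,
        ccr_inject_tool=args.ccr_inject_tool,
        ccr_handle_responses=args.ccr_handle_responses,
    )


settings = build_settings([
    "--port", "8787",
    "--ccr-enabled", "true",
    "--ccr-inject-tool", "false",
    "--ccr-handle-responses", "false",
])
```

Parsing the booleans explicitly avoids a common argparse pitfall: type=bool would treat the string "false" as truthy.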

Practical Implementation Examples

LangChain Integration with Automatic Retrieval

Integrate headroom with LangChain chains to automatically fetch compressed content when the LLM requests it.

from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from headroom import Headroom, CCRConfig, ProxyConfig

# CCR enabled with tool injection and auto-handling
proxy_cfg = ProxyConfig(ccr_config=CCRConfig(enabled=True))
hr = Headroom(proxy_config=proxy_cfg)

# LLMChain expects a prompt template object, not a bare string
chain = LLMChain(
    llm=hr.wrap_langchain_llm(),
    prompt=PromptTemplate.from_template("Summarize the following logs:\n{logs}"),
)

# Automatically fetches any compressed logs the LLM asks for
summary = chain.run(logs=big_log_text)

Manual Retrieval for Dashboard Applications

Build a web interface that displays compressed data while allowing users to fetch full content on demand.

# After compression (hr is the Headroom instance created earlier)
result = hr.compress(messages)
hash_key = result.ccr_hash

# In your web UI
def fetch_original():
    payload = hr.ccr_get(hash_key)  # Returns raw JSON string
    display(payload)                # display() stands for your UI's rendering function

Disabling CCR for Performance Benchmarking

Run performance tests without the overhead of CCR storage and retrieval.

headroom perf --no-ccr

Summary

  • Use CCRConfig in headroom/config.py to toggle CCR storage (ccr_enabled) and multi-turn awareness (ccr_context_tracking).
  • Set ccr_inject_tool and ccr_handle_responses in ProxyConfig to control tool availability and automatic resolution.
  • Access cached originals via headroom.ccr_get() using the hash returned from compress().
  • Apply configurations through Python objects, CLI arguments in headroom/cli.py, or the MCP server interface.

Frequently Asked Questions

What is the difference between ccr_inject_tool and ccr_handle_responses?

ccr_inject_tool controls whether the headroom_retrieve tool appears in the LLM's tool list, while ccr_handle_responses determines whether the proxy automatically resolves the tool call when invoked. Disable ccr_handle_responses when you need to intercept calls for logging, transformation, or integration with systems that cannot perform extra HTTP round-trips.

How do I retrieve original data manually without LLM tool calls?

Set ccr_enabled=True and ccr_inject_tool=False in your configuration. After calling headroom.compress(messages), use the ccr_hash attribute from the result object to call headroom.ccr_get(hash_key), which performs a direct lookup against the CCR store without requiring LLM interaction.

Where is the CCR configuration defined in the source code?

The CCRConfig class and related flags are defined in headroom/config.py (line 388). The proxy applies these settings to the Rust core through headroom/_shim.py, while CLI parsing occurs in headroom/cli.py.

Can I enable CCR for some requests but not others?

Yes. Since the Headroom instance accepts configuration at initialization, you can create separate instances with different CCRConfig settings for different workflows. Alternatively, modify the ccr_enabled flag dynamically if your application logic supports runtime configuration changes, though this requires reinitializing the compression context.
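The separate-instances approach can be sketched as a small lookup table built once at start-up. Everything in this example is a hypothetical stand-in: ToyCCRSettings takes the place of a configured Headroom instance, and the workflow names are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyCCRSettings:
    """Hypothetical stand-in for a per-workflow CCR configuration."""
    enabled: bool


# One settings object per workflow, chosen once at start-up.
WORKFLOW_SETTINGS = {
    "log_analysis": ToyCCRSettings(enabled=True),
    "short_chat": ToyCCRSettings(enabled=False),
}


def settings_for(workflow):
    """Pick the settings for a workflow, defaulting to CCR disabled."""
    return WORKFLOW_SETTINGS.get(workflow, ToyCCRSettings(enabled=False))


log_settings = settings_for("log_analysis")
chat_settings = settings_for("short_chat")
unknown_settings = settings_for("something_else")
```

Routing by workflow keeps each configuration immutable, which avoids the reinitialization cost of toggling the flag at runtime.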
