What Is the Feedback Loop Between CCR Retrieval and TOIN Learning in Headroom?

In Headroom, every headroom_retrieve restoration event is routed through Telemetry.record_retrieval into TOIN.record_retrieval. That call raises the learned importance scores for the retrieved patterns, so the Intelligent Context compressor preserves similar content in future runs.

The chopratejas/headroom repository implements an adaptive context-compression system in which the Compress‑Cache‑Retrieve (CCR) layer and the TOIN learning subsystem are coupled through a closed feedback cycle. CCR makes Headroom’s aggressive compression reversible, and the feedback loop between CCR retrieval and TOIN learning makes it self-improving over time. Understanding this cycle is essential for tuning the Intelligent Context compressor so that drops stay limited to low-value content.

How the CCR Retrieval and TOIN Learning Feedback Loop Works

Headroom’s CCR layer makes aggressive compression reversible: when a message is dropped, it stores the original payload in a cache and inserts a tiny retrieval marker into the LLM prompt. When the LLM later calls the special headroom_retrieve tool (or the HTTP endpoint used by the CCR proxy), the resulting retrieval event propagates through the telemetry stack and updates TOIN’s learned model.

Step 1: Dropping a Message and Recording the Compression

When the Intelligent Context compressor removes a low-importance message, the original content is saved to the CCR store and a marker (for example, <CCR:…>) is inserted in its place. This drop is recorded in TOIN telemetry through record_compression so the system knows the message was compressed.
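The drop step can be sketched in a few lines. This is an illustration only, not Headroom’s implementation: the ccr_store dictionary, the drop_message helper, the 12-character SHA-256 key, and the exact marker layout are all assumptions made for the example.

```python
import hashlib

# Hypothetical in-memory stand-in for the CCR store (not Headroom's real class).
ccr_store = {}

def drop_message(content: str) -> str:
    """Cache the original content and return a short marker for the prompt."""
    # Content hash used as the cache key; the 12-character length is an assumption.
    key = hashlib.sha256(content.encode("utf-8")).hexdigest()[:12]
    ccr_store[key] = content
    # Marker shape modeled on the article's <CCR:...> example; exact format assumed.
    return f"<CCR:{key}>"

original = "Full tool output that the compressor judged low-importance."
marker = drop_message(original)
print(marker)
```

Because the key is derived from the content, dropping the same payload twice yields the same marker and a single cache entry.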

Step 2: Retrieving the Original via headroom_retrieve

If the LLM decides it needs the full content, it issues a headroom_retrieve tool call containing the marker’s hash. The CCR response handler in headroom/ccr/mcp_server.py looks up the original text in the cache and returns it to the model.
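A rough sketch of the lookup side follows. The handle_retrieve function, the MARKER_RE pattern, and the response shape are hypothetical; only the "hash" argument name comes from the tool-call example shown later in this article.

```python
import re

# Hypothetical cache contents keyed by hash; the real store lives inside Headroom.
ccr_store = {"3fa9c1d2e4b7": "Original message text that was dropped."}

# Assumed marker format, used here only to pull the hash out of a prompt.
MARKER_RE = re.compile(r"<CCR:([0-9a-f]+)>")

def handle_retrieve(arguments: dict) -> dict:
    """Resolve a headroom_retrieve-style call: look up the hash, return the text."""
    original = ccr_store.get(arguments["hash"])
    if original is None:
        # Unknown or expired hash: report a miss instead of raising.
        return {"found": False, "content": None}
    return {"found": True, "content": original}

prompt = "Earlier output: <CCR:3fa9c1d2e4b7>"
match = MARKER_RE.search(prompt)
result = handle_retrieve({"hash": match.group(1)})
print(result["content"])
```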

Step 3: Updating TOIN Metrics on Retrieval

The retrieval handler invokes Telemetry.record_retrieval, which in turn calls TOIN.record_retrieval as implemented in headroom/telemetry/toin.py. According to the source code, this method increments per-pattern counters including total_retrievals, full_retrievals, and field_retrieval_frequency, and it updates the retrieval-rate metric used to compute importance scores. These updated scores are then persisted to the TOIN JSON file located at HEADROOM_TOIN_PATH.
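To make the bookkeeping concrete, here is a minimal sketch of per-pattern counters. The counter names total_retrievals, full_retrievals, and field_retrieval_frequency come from the description above; the PatternStats class, the total_compressions field, the retrieval-rate formula (retrievals divided by compressions), and the JSON layout are assumptions, not the contents of headroom/telemetry/toin.py.

```python
import json
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class PatternStats:
    """Hypothetical per-pattern record; not the real TOIN data structure."""
    total_compressions: int = 0   # assumed denominator for the retrieval rate
    total_retrievals: int = 0
    full_retrievals: int = 0
    field_retrieval_frequency: Counter = field(default_factory=Counter)

    @property
    def retrieval_rate(self) -> float:
        # Assumed definition: retrievals per recorded compression.
        if self.total_compressions == 0:
            return 0.0
        return self.total_retrievals / self.total_compressions

def record_retrieval(stats: PatternStats, fields: Optional[list] = None) -> None:
    """Count one retrieval; with no field list it is treated as a full retrieval."""
    stats.total_retrievals += 1
    if fields:
        stats.field_retrieval_frequency.update(fields)
    else:
        stats.full_retrievals += 1

def to_json(stats: PatternStats) -> str:
    """Serialize the counters by hand into an illustrative JSON layout."""
    return json.dumps({
        "total_compressions": stats.total_compressions,
        "total_retrievals": stats.total_retrievals,
        "full_retrievals": stats.full_retrievals,
        "field_retrieval_frequency": dict(stats.field_retrieval_frequency),
        "retrieval_rate": stats.retrieval_rate,
    }, indent=2)

stats = PatternStats(total_compressions=4)
record_retrieval(stats)                          # full retrieval
record_retrieval(stats, fields=["stack_trace"])  # field-level retrieval
print(to_json(stats))
```

With four recorded compressions and two retrievals, the assumed rate works out to 0.5.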

Step 4: Closing the Loop With Adaptive Compression

Future compression passes consult the TOIN scores—specifically toin_importance and error_indicator. Patterns that caused many retrievals receive a higher importance score, instructing the compressor to preserve them next time. As documented in wiki/transforms.md under the “TOIN + CCR Integration” section and in wiki/ccr.md under “TOIN integration,” this retrieval signal is the only way TOIN learns that the LLM needed the original content, making the compression strategy adaptive.
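The decision side of the loop might look roughly like the sketch below. The Message class, the score table, the 0.3 threshold, and the rule that error messages are always kept are illustrative assumptions; the article only states that toin_importance and error_indicator are consulted.

```python
from dataclasses import dataclass

@dataclass
class Message:
    pattern: str
    text: str
    is_error: bool = False  # stand-in for the article's error_indicator signal

# Hypothetical learned scores in [0, 1], keyed by pattern name.
toin_importance = {"stack_trace": 0.8, "progress_log": 0.05}

def should_preserve(msg: Message, threshold: float = 0.3) -> bool:
    """Keep a message if it signals an error or its learned importance is high."""
    if msg.is_error:
        return True
    # Patterns TOIN has never seen default to 0.0 and are eligible for dropping.
    return toin_importance.get(msg.pattern, 0.0) >= threshold

messages = [
    Message("stack_trace", "Traceback (most recent call last): ..."),
    Message("progress_log", "Downloaded 40% ..."),
    Message("http_dump", "500 Internal Server Error", is_error=True),
]
kept = [m.pattern for m in messages if should_preserve(m)]
print(kept)
```

In this toy setup, a pattern whose score rises after repeated retrievals crosses the threshold and stops being dropped, which is the adaptive behavior the step describes.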

Code Example: Triggering the CCR-TOIN Feedback Loop

Below is a minimal, runnable example that demonstrates the full loop in practice.

from headroom import Headroom, Config

# 1️⃣ Create a Headroom client with CCR and TOIN enabled
cfg = Config(
    ccr_enabled=True,       # store dropped messages
    toin_integration=True,  # record drops & retrievals
)
hr = Headroom(cfg)

# 2️⃣ Run a prompt that will cause a drop (short context budget)
response = hr.run(
    "Explain the detailed workflow of CCR retrieval and TOIN learning."
)

# The LLM may decide it needs the dropped explanation, so it emits:
#   {"tool": "headroom_retrieve", "arguments": {"hash": "<some-hash>"}}
# The client automatically calls the CCR server, gets the original text,
# and records the retrieval in TOIN.

# 3️⃣ Inspect TOIN state after the run
print("TOIN retrieval rate:", hr.toin.retrieval_rate())
print("Top learned patterns:", hr.toin.most_important_patterns(limit=3))

If the LLM issued a headroom_retrieve call during the run, hr.toin.retrieval_rate() will be greater than zero, and the most_important_patterns list will show the retrieved pattern with a higher importance score, confirming that the feedback loop altered TOIN’s internal model. If the model never requested the dropped content, no retrieval is recorded and the scores stay unchanged.

Key Files That Implement the Feedback Loop

Several source files work together to close the CCR↔TOIN feedback loop:

  • headroom/ccr/mcp_server.py – The record_retrieval method handles the HTTP-based CCR retrieval call and forwards the event to telemetry.
  • headroom/telemetry/toin.py – The record_retrieval method updates TOIN metrics and per-pattern counters.
  • headroom/cache/compression_store.py – Invokes Telemetry.record_retrieval when a CCR marker is resolved.
  • wiki/transforms.md (TOIN + CCR Integration section) – Documents that drops are recorded to TOIN and retrievals feed back to TOIN so the system learns which compressions need expansion.
  • wiki/ccr.md (TOIN integration section) – Details the coupling of CCR retrieval with TOIN learning.
  • wiki/ARCHITECTURE.md – Provides the architecture diagram showing the CCR and TOIN feedback cycle.

Summary

  • CCR retrieval is the teaching signal: When the LLM calls headroom_retrieve, the event signals that a compressed message was actually needed.
  • TOIN records and rates patterns: TOIN.record_retrieval in headroom/telemetry/toin.py updates counters and the retrieval-rate metric, then stores the scores to disk at HEADROOM_TOIN_PATH.
  • Future compression adapts: The Intelligent Context compressor consults toin_importance and related scores, preserving patterns that historically trigger retrievals.
  • The loop is fully automatic: No manual labeling is required; the feedback loop between CCR retrieval and TOIN learning is closed entirely within Headroom’s runtime.

Frequently Asked Questions

How does TOIN know a retrieval happened?

When the CCR server resolves a marker, headroom/ccr/mcp_server.py invokes Telemetry.record_retrieval, which delegates to TOIN.record_retrieval in headroom/telemetry/toin.py. This method increments per-pattern counters and updates the retrieval-rate metric used for importance scoring.

What specific TOIN metrics does a retrieval update?

According to the source in headroom/telemetry/toin.py, a retrieval increments counters such as total_retrievals, full_retrievals, and field_retrieval_frequency. It also updates the overall retrieval-rate metric that feeds into future toin_importance calculations.

Can I observe the feedback loop without modifying the source code?

Yes. Instantiate Headroom with ccr_enabled=True and toin_integration=True, run a prompt that forces a context drop, and then inspect hr.toin.retrieval_rate() and hr.toin.most_important_patterns() to see the updated scores reflect the retrieval event.
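Another option is to look at the persisted state directly. The sketch below assumes that HEADROOM_TOIN_PATH is an environment variable holding the path to the TOIN JSON file; the load_toin_state helper is hypothetical, and because the file’s schema is not described here, the example only lists top-level keys.

```python
import json
import os
from pathlib import Path

def load_toin_state(env_var: str = "HEADROOM_TOIN_PATH"):
    """Return the parsed TOIN JSON, or None if the variable or file is missing."""
    path = os.environ.get(env_var)
    if not path or not Path(path).is_file():
        return None
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)

state = load_toin_state()
if state is None:
    print("No TOIN file found; set HEADROOM_TOIN_PATH to its location.")
else:
    # No schema is assumed: just show what the file contains at the top level.
    keys = list(state) if isinstance(state, dict) else []
    print("Top-level keys:", keys)
```

Comparing the file before and after a run that triggered a retrieval should show which entries changed.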

Where is the feedback loop documented in the repository?

The loop is described in the wiki/ARCHITECTURE.md diagram and in two wiki sections: wiki/transforms.md under “TOIN + CCR Integration” and wiki/ccr.md under “TOIN integration.” These sections explain how drops are recorded and how retrievals feed back into TOIN learning.
