# What Is the Feedback Loop Between CCR Retrieval and TOIN Learning in Headroom?

> Understand the feedback loop between CCR retrieval and TOIN learning in Headroom. Discover how retrieval events optimize future content compression for better performance.

- Repository: [Tejas Chopra/headroom](https://github.com/chopratejas/headroom)
- Tags: internals
- Published: 2026-06-08

---

**The feedback loop between CCR retrieval and TOIN learning in Headroom works by routing every `headroom_retrieve` restoration event through `Telemetry.record_retrieval` into `TOIN.record_retrieval`, which raises the learned importance scores for the retrieved patterns so that the Intelligent Context compressor preserves similar content in future runs.**

The `chopratejas/headroom` repository implements an adaptive context-compression system in which the *Compress‑Cache‑Retrieve* (CCR) layer and the **TOIN** learning subsystem are coupled through a closed feedback cycle. CCR makes Headroom’s aggressive compression reversible, and the feedback into TOIN makes it self-improving over time. Understanding this cycle is essential for tuning the `Intelligent Context` compressor so that it learns to drop only low-value content.

## How the CCR Retrieval and TOIN Learning Feedback Loop Works

Headroom’s CCR layer makes aggressive compression reversible by storing original payloads in a cache and inserting a tiny retrieval marker into the LLM prompt when a message is dropped. When the LLM later calls the special `headroom_retrieve` tool (or the HTTP endpoint used by the CCR proxy), the resulting retrieval event propagates through the telemetry stack and updates TOIN’s learned model.

### Step 1: Dropping a Message and Recording the Compression

When the **Intelligent Context** compressor removes a low-importance message, the original content is saved to the CCR store and a marker (for example, `<CCR:…>`) is inserted in its place. This drop is recorded in TOIN telemetry through `record_compression` so the system knows the message was compressed.
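The drop step can be sketched in a few lines. This is a toy model, not Headroom's code: `CompressionStore`, `drop_message`, and the `compression_log` list are hypothetical names, and the `<CCR:hash>` marker shape is only a placeholder for whatever format the real marker uses.

```python
import hashlib


class CompressionStore:
    """Hypothetical in-memory stand-in for the CCR cache."""

    def __init__(self):
        self._originals = {}

    def put(self, content: str) -> str:
        # Key the original payload by a short content hash.
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()[:12]
        self._originals[digest] = content
        return digest

    def get(self, digest: str):
        return self._originals.get(digest)


def drop_message(store, compression_log, content, pattern):
    """Save the original, note the drop, and return a placeholder marker."""
    digest = store.put(content)
    # Stand-in for the record_compression telemetry call.
    compression_log.append({"pattern": pattern, "hash": digest})
    return f"<CCR:{digest}>"  # assumed marker shape, for illustration only


store = CompressionStore()
compression_log = []
original = "full tool output that the compressor judged low-importance"
marker = drop_message(store, compression_log, original, pattern="tool_output")
print(marker)
```

The point of the design is that the prompt only carries the short marker, while the full payload stays recoverable by hash.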

### Step 2: Retrieving the Original via headroom_retrieve

If the LLM decides it needs the full content, it issues a **`headroom_retrieve`** tool call containing the marker’s hash. The CCR response handler in [`headroom/ccr/mcp_server.py`](https://github.com/chopratejas/headroom/blob/main/headroom/ccr/mcp_server.py) looks up the original text in the cache and returns it to the model.
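As a rough illustration of that lookup, the sketch below resolves a tool call against a plain dictionary. The function name `handle_retrieve`, the dictionary cache, and the shape of the returned result are assumptions made for this example; the actual handler in `mcp_server.py` may be structured differently.

```python
import json


def handle_retrieve(tool_call_json: str, cache: dict) -> dict:
    """Resolve a headroom_retrieve tool call against a hash-keyed cache."""
    call = json.loads(tool_call_json)
    if call.get("tool") != "headroom_retrieve":
        raise ValueError("not a headroom_retrieve call")
    digest = call["arguments"]["hash"]
    original = cache.get(digest)
    # Report a miss explicitly instead of raising, so the model gets an answer.
    return {"found": original is not None, "hash": digest, "content": original}


cache = {"abc123": "original message text"}
hit = handle_retrieve(
    '{"tool": "headroom_retrieve", "arguments": {"hash": "abc123"}}', cache
)
miss = handle_retrieve(
    '{"tool": "headroom_retrieve", "arguments": {"hash": "zzz999"}}', cache
)
print(hit["content"], miss["found"])
```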

### Step 3: Updating TOIN Metrics on Retrieval

The retrieval handler invokes **`Telemetry.record_retrieval`**, which in turn calls **`TOIN.record_retrieval`** as implemented in [`headroom/telemetry/toin.py`](https://github.com/chopratejas/headroom/blob/main/headroom/telemetry/toin.py). According to the source code, this method increments per-pattern counters including `total_retrievals`, `full_retrievals`, and `field_retrieval_frequency`, and it updates the **retrieval-rate** metric used to compute importance scores. These updated scores are then persisted to the TOIN JSON file located at `HEADROOM_TOIN_PATH`.
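To make the bookkeeping concrete, here is a minimal sketch of per-pattern counters with JSON persistence. The counter names come from the description above; everything else is assumed for illustration, including the `ToyTOIN` class, the `fields` argument, and the definition of retrieval rate as retrievals divided by compressions. The sketch writes to a temporary file rather than to `HEADROOM_TOIN_PATH`.

```python
import json
import os
import tempfile


class ToyTOIN:
    """Hypothetical per-pattern counter store; not Headroom's implementation."""

    def __init__(self, path):
        self.path = path
        self.patterns = {}

    def _stats(self, pattern):
        return self.patterns.setdefault(pattern, {
            "total_compressions": 0,
            "total_retrievals": 0,
            "full_retrievals": 0,
            "field_retrieval_frequency": {},
            "retrieval_rate": 0.0,
        })

    @staticmethod
    def _refresh_rate(stats):
        # Assumed definition: fraction of drops that were later retrieved.
        drops = stats["total_compressions"]
        stats["retrieval_rate"] = stats["total_retrievals"] / drops if drops else 0.0

    def record_compression(self, pattern):
        stats = self._stats(pattern)
        stats["total_compressions"] += 1
        self._refresh_rate(stats)

    def record_retrieval(self, pattern, fields=None):
        stats = self._stats(pattern)
        stats["total_retrievals"] += 1
        if fields:
            # Partial retrieval: count which fields were asked for.
            freq = stats["field_retrieval_frequency"]
            for name in fields:
                freq[name] = freq.get(name, 0) + 1
        else:
            stats["full_retrievals"] += 1
        self._refresh_rate(stats)
        self.save()

    def save(self):
        with open(self.path, "w", encoding="utf-8") as fh:
            json.dump(self.patterns, fh, indent=2)


toin_path = os.path.join(tempfile.mkdtemp(), "toin.json")
toin = ToyTOIN(toin_path)
for _ in range(4):
    toin.record_compression("tool_output")
toin.record_retrieval("tool_output")
toin.record_retrieval("tool_output", fields=["stack_trace"])

with open(toin_path, encoding="utf-8") as fh:
    saved = json.load(fh)
print(saved["tool_output"])
```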

### Step 4: Closing the Loop With Adaptive Compression

Future compression passes consult the TOIN scores, specifically `toin_importance` and `error_indicator`. Patterns that caused many retrievals receive a higher importance score, instructing the compressor to **preserve** them next time. As documented in [`wiki/transforms.md`](https://github.com/chopratejas/headroom/blob/main/wiki/transforms.md) under the “TOIN + CCR Integration” section and in [`wiki/ccr.md`](https://github.com/chopratejas/headroom/blob/main/wiki/ccr.md) under “TOIN integration,” this retrieval signal is the only way TOIN learns that the LLM needed the original content, and it is what makes the compression strategy adaptive.
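One way to picture how retrieval history could steer the next pass is a smoothed retrieval rate compared against a threshold. This is a sketch under stated assumptions, not Headroom's scoring formula: the `importance` function, the prior values, the `preserve_threshold`, and the message shape are all hypothetical, and `error_indicator` is not modeled.

```python
def importance(drops, retrievals, prior_rate=0.1, prior_weight=5):
    """Smoothed retrieval rate: patterns with little history stay near the prior."""
    return (retrievals + prior_rate * prior_weight) / (drops + prior_weight)


def plan_compression(messages, stats, preserve_threshold=0.3):
    """Split message ids into those to preserve and those safe to drop."""
    keep, drop = [], []
    for msg in messages:
        drops, retrievals = stats.get(msg["pattern"], (0, 0))
        score = importance(drops, retrievals)
        (keep if score >= preserve_threshold else drop).append(msg["id"])
    return keep, drop


# (drops, retrievals) observed per pattern in earlier runs
stats = {"stack_trace": (10, 6), "boilerplate": (20, 0)}
messages = [
    {"id": "m1", "pattern": "stack_trace"},
    {"id": "m2", "pattern": "boilerplate"},
    {"id": "m3", "pattern": "never_seen"},
]
keep, drop = plan_compression(messages, stats)
print("preserve:", keep, "drop:", drop)
```

The smoothing term is a design choice for the sketch: it keeps a single early retrieval from permanently pinning a pattern, while repeated retrievals still push the score above the threshold.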

## Code Example: Triggering the CCR-TOIN Feedback Loop

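Below is a minimal example that illustrates the full loop in practice. Treat the `Config` flags and the `hr.toin` accessors as the names this example assumes, and check them against the installed version of the package before relying on them.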

```python
from headroom import Headroom, Config

# 1️⃣ Create a Headroom client with CCR and TOIN enabled
cfg = Config(
    ccr_enabled=True,       # store dropped messages
    toin_integration=True,  # record drops & retrievals
)
hr = Headroom(cfg)

# 2️⃣ Run a prompt that will cause a drop (short context budget)
response = hr.run(
    "Explain the detailed workflow of CCR retrieval and TOIN learning."
)

# The LLM may decide it needs the dropped explanation, so it emits:
#   {"tool": "headroom_retrieve", "arguments": {"hash": "<some-hash>"}}
# The client automatically calls the CCR server, gets the original text,
# and records the retrieval in TOIN.

# 3️⃣ Inspect TOIN state after the run
print("TOIN retrieval rate:", hr.toin.retrieval_rate())
print("Top learned patterns:", hr.toin.most_important_patterns(limit=3))
```

If the model issued a `headroom_retrieve` call during the run, `hr.toin.retrieval_rate()` should be greater than zero afterwards. The `most_important_patterns` list should then show the retrieved pattern with a higher **importance** score, confirming that the feedback loop altered TOIN’s internal model. If no retrieval occurred, the rate stays unchanged.

## Key Files That Implement the Feedback Loop

Several source files work together to close the CCR↔TOIN feedback loop:

- **[`headroom/ccr/mcp_server.py`](https://github.com/chopratejas/headroom/blob/main/headroom/ccr/mcp_server.py)** – The `record_retrieval` method handles the HTTP-based CCR retrieval call and forwards the event to telemetry.
- **[`headroom/telemetry/toin.py`](https://github.com/chopratejas/headroom/blob/main/headroom/telemetry/toin.py)** – The `record_retrieval` method updates TOIN metrics and per-pattern counters.
- **[`headroom/cache/compression_store.py`](https://github.com/chopratejas/headroom/blob/main/headroom/cache/compression_store.py)** – Invokes `Telemetry.record_retrieval` when a CCR marker is resolved.
- **[`wiki/transforms.md`](https://github.com/chopratejas/headroom/blob/main/wiki/transforms.md)** (TOIN + CCR Integration section) – Documents that drops are recorded to TOIN and retrievals feed back to TOIN so the system learns which compressions need expansion.
- **[`wiki/ccr.md`](https://github.com/chopratejas/headroom/blob/main/wiki/ccr.md)** (TOIN integration section) – Details the coupling of CCR retrieval with TOIN learning.
- **[`wiki/ARCHITECTURE.md`](https://github.com/chopratejas/headroom/blob/main/wiki/ARCHITECTURE.md)** – Provides the architecture diagram showing the CCR and TOIN feedback cycle.

## Summary

- **CCR retrieval is the teaching signal:** When the LLM calls `headroom_retrieve`, the event signals that a compressed message was actually needed.
- **TOIN records and rates patterns:** `TOIN.record_retrieval` in [`headroom/telemetry/toin.py`](https://github.com/chopratejas/headroom/blob/main/headroom/telemetry/toin.py) updates counters and the retrieval-rate metric, then stores the scores to disk at `HEADROOM_TOIN_PATH`.
- **Future compression adapts:** The Intelligent Context compressor consults `toin_importance` and related scores, preserving patterns that historically trigger retrievals.
- **The loop is fully automatic:** No manual labeling is required; the feedback loop between CCR retrieval and TOIN learning is closed entirely within Headroom’s runtime.

## Frequently Asked Questions

### How does TOIN know a retrieval happened?

When the CCR server resolves a marker, [`headroom/ccr/mcp_server.py`](https://github.com/chopratejas/headroom/blob/main/headroom/ccr/mcp_server.py) invokes `Telemetry.record_retrieval`, which delegates to `TOIN.record_retrieval` in [`headroom/telemetry/toin.py`](https://github.com/chopratejas/headroom/blob/main/headroom/telemetry/toin.py). This method increments per-pattern counters and updates the retrieval-rate metric used for importance scoring.

### What specific TOIN metrics does a retrieval update?

According to the source in [`headroom/telemetry/toin.py`](https://github.com/chopratejas/headroom/blob/main/headroom/telemetry/toin.py), a retrieval increments counters such as `total_retrievals`, `full_retrievals`, and `field_retrieval_frequency`. It also updates the overall retrieval-rate metric that feeds into future `toin_importance` calculations.

### Can I observe the feedback loop without modifying the source code?

Yes. Instantiate `Headroom` with `ccr_enabled=True` and `toin_integration=True`, run a prompt that forces a context drop, and then inspect `hr.toin.retrieval_rate()` and `hr.toin.most_important_patterns()` to confirm that the updated scores reflect the retrieval event.

### Where is the feedback loop documented in the repository?

The loop is described in the [`wiki/ARCHITECTURE.md`](https://github.com/chopratejas/headroom/blob/main/wiki/ARCHITECTURE.md) diagram and in two wiki sections: [`wiki/transforms.md`](https://github.com/chopratejas/headroom/blob/main/wiki/transforms.md) under “TOIN + CCR Integration” and [`wiki/ccr.md`](https://github.com/chopratejas/headroom/blob/main/wiki/ccr.md) under “TOIN integration.” These sections explain how drops are recorded and how retrievals feed back into TOIN learning.