How Cross-Agent Memory Works with SharedContext in Headroom
SharedContext enables multiple agents in the Headroom framework to share large data payloads through a thread-safe, compressed in-memory store that reduces token usage by approximately 80% while preserving original content for on-demand retrieval.
The chopratejas/headroom repository implements cross-agent memory through a centralized SharedContext class that eliminates the need to repeatedly transmit full data payloads between agents. By leveraging Headroom’s existing compression pipeline, this mechanism allows agents operating within the same process to exchange efficiently compressed representations of research results, tool outputs, and conversation history while maintaining access to the original uncompressed data.
Compressed Storage Architecture
When an agent generates large outputs—such as research results, tool dumps, or multi-turn conversations—it stores them using SharedContext.put(key, content). This method forwards the raw text to Headroom’s single-function compression API (headroom.compress.compress).
The compression pipeline executes the same transforms used by the proxy for request compression: CacheAligner → ContentRouter → SmartCrusher / CodeCompressor / Kompress. As implemented in headroom/shared_context.py, the process generates a ContextEntry dataclass containing the original text, its compressed form, token counts before and after compression, and the list of applied transforms.
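To make the shape of such a record concrete, here is a minimal sketch of a dataclass holding the fields listed above. The class name ContextEntrySketch and every field name are assumptions made for illustration; only the kinds of data (original text, compressed form, token counts, applied transforms) and the savings_percent attribute used later in this article come from the source.

```python
from dataclasses import dataclass, field
from typing import List, Optional
import time


@dataclass
class ContextEntrySketch:
    """Illustrative stand-in for the entry record; field names are hypothetical."""

    key: str
    original: str            # uncompressed text, kept for full retrieval
    compressed: str          # output of the compression pipeline
    original_tokens: int     # token count before compression
    compressed_tokens: int   # token count after compression
    transforms: List[str] = field(default_factory=list)  # names of applied transforms
    agent: Optional[str] = None                           # which agent stored the entry
    timestamp: float = field(default_factory=time.time)   # used for TTL checks

    @property
    def savings_percent(self) -> float:
        """Share of tokens removed by compression, as a percentage."""
        if self.original_tokens == 0:
            return 0.0
        saved = self.original_tokens - self.compressed_tokens
        return 100.0 * saved / self.original_tokens


entry = ContextEntrySketch(
    key="research",
    original="a very long document ...",
    compressed="a short summary",
    original_tokens=1000,
    compressed_tokens=200,
    transforms=["example_transform"],  # placeholder name, not a real transform
)
print(entry.savings_percent)  # 80.0
```

Keeping both the original and the compressed text in one record is what lets a reader choose between the two at retrieval time without recomputing anything.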
Thread-Safe, Process-Shared Storage
All agents importing the same SharedContext instance access a unified in-memory store protected by threading.Lock. Because every read and write of the internal _entries dictionary goes through that lock, context can be shared across threads and across agent hops within the same process without race conditions.
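The locking pattern can be illustrated with a small self-contained toy. LockedStore below is a hypothetical class written for this article, not Headroom's code; it only shows how guarding a dictionary with threading.Lock keeps concurrent writers consistent.

```python
import threading


class LockedStore:
    """Toy dictionary store guarded by a single lock (hypothetical, for illustration)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = {}
        self.writes = 0  # updated under the lock, so it never loses increments

    def put(self, key, value):
        with self._lock:
            self._entries[key] = value
            self.writes += 1

    def get(self, key):
        with self._lock:
            return self._entries.get(key)

    def __len__(self):
        with self._lock:
            return len(self._entries)


store = LockedStore()


def agent(agent_id, count=200):
    # Each simulated agent writes its own distinct keys into the shared store.
    for i in range(count):
        store.put(f"agent{agent_id}:{i}", f"payload {i}")


threads = [threading.Thread(target=agent, args=(n,)) for n in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(store), store.writes)  # 1600 1600
```

Eight threads each write 200 distinct keys, so both the entry count and the write counter end at 1600 regardless of how the threads interleave.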
The implementation enforces TTL (time-to-live) and entry limits to prevent unbounded growth. Each entry receives a timestamp, and the get and keys methods filter out entries exceeding the configured ttl (defaulting to 1 hour). When the maximum entry count (max_entries) is reached, the oldest entry is evicted automatically, as defined in headroom/shared_context.py.
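The expiry and eviction rules described above can be sketched as follows. TinySharedStore, its constructor arguments, and the injectable clock are assumptions chosen so the example runs without waiting an hour; the actual logic lives in headroom/shared_context.py and may differ in detail.

```python
import threading
import time


class TinySharedStore:
    """Hypothetical store with a TTL filter and oldest-first eviction."""

    def __init__(self, ttl=3600.0, max_entries=100, clock=time.monotonic):
        self._ttl = ttl
        self._max_entries = max_entries
        self._clock = clock            # injectable so the demo can fake time
        self._lock = threading.Lock()
        self._entries = {}             # key -> (timestamp, value), insertion ordered

    def put(self, key, value):
        with self._lock:
            # Drop any previous value so the key moves to the newest position.
            self._entries.pop(key, None)
            if len(self._entries) >= self._max_entries:
                oldest_key = next(iter(self._entries))  # first inserted = oldest
                del self._entries[oldest_key]
            self._entries[key] = (self._clock(), value)

    def get(self, key):
        with self._lock:
            item = self._entries.get(key)
            if item is None:
                return None
            stamp, value = item
            if self._clock() - stamp > self._ttl:
                del self._entries[key]  # expired entries are dropped on access
                return None
            return value

    def keys(self):
        with self._lock:
            now = self._clock()
            return [k for k, (stamp, _) in self._entries.items()
                    if now - stamp <= self._ttl]


fake_now = [0.0]
store = TinySharedStore(ttl=10.0, max_entries=2, clock=lambda: fake_now[0])
store.put("a", "alpha")
store.put("b", "beta")
store.put("c", "gamma")   # store is full, so the oldest key "a" is evicted
print(store.keys())       # ['b', 'c']

fake_now[0] = 11.0        # move past the 10-second TTL
print(store.keys())       # []
```

Because a plain dict preserves insertion order, the first key is always the oldest one, which keeps eviction simple in this sketch.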
Retrieving Context Across Agents
Accessing Compressed vs. Full Content
The SharedContext.get(key, full=False) method returns the compressed version by default, which downstream agents typically consume. This compressed payload is approximately 80% smaller than the original, making it ideal for efficient transmission between agents.
When an agent requires detailed inspection of specific data points, setting full=True returns the original uncompressed text. This "zoom in" capability allows agents to balance efficiency with precision when working with shared memory.
Monitoring Compression Efficiency
The SharedContext.stats() method aggregates token savings across all active entries, exposing:
- Number of active entries
- Total original and compressed token counts
- Total tokens saved
- Overall savings percentage
This introspection capability, implemented in headroom/shared_context.py, makes it easy to monitor the memory-compression efficiency of cross-agent workflows in real time.
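The aggregation behind these figures is simple arithmetic, sketched below under stated assumptions. The function aggregate_stats, its input format, and the dictionary keys are hypothetical; the return type of the real stats() method is not specified here.

```python
def aggregate_stats(token_pairs):
    """Summarize (original_tokens, compressed_tokens) pairs for active entries."""
    pairs = list(token_pairs)
    original = sum(o for o, _ in pairs)
    compressed = sum(c for _, c in pairs)
    saved = original - compressed
    # Guard against division by zero when the store is empty.
    percent = round(100.0 * saved / original, 1) if original else 0.0
    return {
        "entries": len(pairs),
        "original_tokens": original,
        "compressed_tokens": compressed,
        "tokens_saved": saved,
        "savings_percent": percent,
    }


summary = aggregate_stats([(1000, 200), (500, 100)])
print(summary["tokens_saved"], summary["savings_percent"])  # 1200 80.0
```

Note that the overall percentage is computed from the token totals rather than by averaging per-entry percentages, so large entries weigh more than small ones.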
Practical Implementation Examples
Basic Storage and Retrieval
from headroom import SharedContext
# Create a shared context (usually a singleton per process)
ctx = SharedContext()
# Agent A stores a large research result
large_output = """... a very long markdown or JSON ..."""
entry = ctx.put("research", large_output, agent="agent_A")
print(f"Saved {entry.savings_percent}% tokens")
# Agent B later obtains the compressed version
compressed = ctx.get("research")
print("Compressed payload:", compressed[:200], "...")
# Agent B wants the full text for a deep dive
full_text = ctx.get("research", full=True)
print("Full text length:", len(full_text))
# Inspect statistics
stats = ctx.stats()
print(f"Overall savings: {stats.savings_percent}% across {stats.entries} entries")
Integration with Proxy Handlers
from headroom.shared_context import SharedContext
from headroom.proxy.handlers.openai import OpenAIHandler

shared_ctx = SharedContext()

class MyOpenAIHandler(OpenAIHandler):
    async def handle(self, request):
        # Before sending request, store the tool output
        tool_output = request["messages"][-1]["content"]
        shared_ctx.put("latest_tool_output", tool_output, agent="openai_handler")
        # Retrieve compressed context for the next turn
        compressed = shared_ctx.get("latest_tool_output")
        request["messages"].append({"role": "assistant", "content": compressed})
        return await super().handle(request)
Summary
- SharedContext provides thread-safe, compressed storage for cross-agent memory in Headroom, located in headroom/shared_context.py
- The put() method automatically compresses content using the pipeline defined in headroom/compress.py, reducing token counts by approximately 80%
- Entries default to a 1-hour TTL with automatic eviction when max_entries is exceeded, preventing memory exhaustion
- Agents retrieve compressed payloads by default via get(), with full=True providing access to the original uncompressed text
- The stats() method tracks aggregate token savings across all shared entries for monitoring compression efficiency
Frequently Asked Questions
What is SharedContext in Headroom?
SharedContext is a centralized memory store that enables cross-agent communication within the Headroom framework. It allows multiple agents running in the same process to share large data payloads through compressed representations, reducing token transmission costs while maintaining thread safety via threading.Lock.
How does the compression pipeline in SharedContext work?
When storing data via put(), SharedContext invokes headroom.compress.compress, which executes a multi-stage pipeline: CacheAligner → ContentRouter → SmartCrusher / CodeCompressor / Kompress. This pipeline removes redundant tokens and optimizes the payload structure, typically achieving approximately 80% size reduction while preserving semantic content.
How long does data persist in SharedContext?
By default, entries expire after 1 hour (configurable via the ttl parameter). Additionally, when the store reaches max_entries, the oldest entry is automatically evicted. The get and keys methods automatically filter out expired entries, ensuring agents only access valid context.
Is SharedContext safe for concurrent multi-agent access?
Yes. According to the source code in headroom/shared_context.py, the internal _entries dictionary is protected by threading.Lock, making all read and write operations thread-safe. This allows multiple agents to simultaneously access and update shared context without race conditions or data corruption.