# What Agent Memory Architectures Are Taught in AI: 4 Production Patterns Explained

> Discover the 4 agent memory architectures taught in AI Engineering from Scratch, including hybrid stores, tiered blocks, virtual contexts, and blackboards, implemented in Python for production readiness.

- Repository: [Rohit Ghumare/ai-engineering-from-scratch](https://github.com/rohitg00/ai-engineering-from-scratch)
- Tags: deep-dive
- Published: 2026-06-10

---

**The AI Engineering from Scratch curriculum teaches four concrete agent memory architectures—Mem0-style hybrid stores, Letta-style tiered blocks, MemGPT-style virtual contexts, and audited shared-memory blackboards—each implemented as reusable Python skills with production-grade features like scope-aware retrieval and sleep-time consolidation.**

The repository `rohitg00/ai-engineering-from-scratch` provides hands-on implementations of these **agent memory architectures** in pure Python, packaged as Markdown skill files that developers can import into any agent project. The designs range from single-agent context management to federated multi-agent systems, and they address concerns such as temporal invalidation, citation contracts, and provenance tracking.

## Mem0-Style Hybrid Memory Architecture

This architecture implements a **three-store system** that unifies vector embeddings, key-value triples, and graph edges under a single fusion scorer.

### Core Components and Fusion Scoring

The system combines three specialized backends:

- **Vector store** (e.g., Qdrant, pgvector) for similarity search
- **KV store** (e.g., Redis) for exact lookups  
- **Graph store** (e.g., Neo4j) for relational reasoning

Retrieval uses a configurable fusion formula: `score = w_rel * relevance + w_imp * importance + w_rec * recency`. According to the skill file [`phases/14-agent-engineering/09-hybrid-memory-mem0/outputs/skill-hybrid-memory.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/14-agent-engineering/09-hybrid-memory-mem0/outputs/skill-hybrid-memory.md), adjusting the three weights lets you trade off semantic similarity, importance, and recency.
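
As a rough illustration of the fusion formula, the sketch below scores two hypothetical records. The weight values, the exponential half-life decay used to derive `recency`, and the function names are assumptions made for this example; they are not taken from the skill file.

```python
def recency_score(created_at, now, half_life_s=86_400.0):
    """Assumed decay: 1.0 for a brand-new record, 0.5 after one half-life."""
    age = max(0.0, now - created_at)
    return 0.5 ** (age / half_life_s)


def fusion_score(relevance, importance, recency, weights=(0.6, 0.25, 0.15)):
    """score = w_rel * relevance + w_imp * importance + w_rec * recency"""
    w_rel, w_imp, w_rec = weights
    return w_rel * relevance + w_imp * importance + w_rec * recency


now = 1_000_000.0
week = 7 * 86_400.0

# A slightly less relevant record written just now...
fresh = fusion_score(0.70, 0.2, recency_score(now, now))
# ...versus a more relevant record that is a week old.
stale = fusion_score(0.80, 0.2, recency_score(now - week, now))

print(f"{fresh:.3f} {stale:.3f}")  # prints "0.620 0.531"
```

With these assumed weights the fresh record outranks the more relevant but week-old one; raising `w_rel` or lengthening the half-life would reverse that ordering.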

### Scope-Aware Retrieval and Temporal Invalidation

The architecture enforces **scope taxonomy** (`user`, `session`, `agent`) to prevent data leakage between users. Rather than deleting records, it implements **temporal invalidation**—contradictory updates receive timestamps and obsolete markers while preserving the original record for auditability.

```python
# VectorStore, KVStore, GraphStore, and self._extract are placeholders that are
# not defined in this excerpt. Each search hit is assumed to be a dict carrying
# 'id', 'score', 'scope', and optionally 'importance' and 'recency'.
class HybridMemory:
    def __init__(self, vec_cfg, kv_cfg, graph_cfg, weights):
        self.vec = VectorStore(**vec_cfg)
        self.kv = KVStore(**kv_cfg)
        self.graph = GraphStore(**graph_cfg)
        self.w_rel, self.w_imp, self.w_rec = weights

    def add(self, text, user_id, session_id, scope, importance, tags):
        vec, kv, graph = self._extract(text)
        # Store the same metadata everywhere so any hit can be filtered and scored.
        meta = {'user': user_id, 'session': session_id, 'scope': scope,
                'importance': importance, 'tags': tags}
        self.vec.add(vec, metadata=meta)
        self.kv.add(**kv, metadata=meta)
        self.graph.add(**graph, metadata=meta)

    def search(self, query, scope):
        hits = self.vec.search(query) + self.kv.search(query) + self.graph.search(query)

        # Dicts are unhashable, so deduplicate by record ID instead of using set().
        unique = {rec['id']: rec for rec in hits}

        fused = []
        for rec in unique.values():
            if rec.get('scope') != scope:
                continue  # scope-aware retrieval: skip hits from other scopes
            score = (self.w_rel * rec['score']
                     + self.w_imp * rec.get('importance', 0)
                     + self.w_rec * rec.get('recency', 0))
            fused.append((rec, score))
        return sorted(fused, key=lambda pair: pair[1], reverse=True)
```
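
Temporal invalidation and scope filtering can be sketched without any external backend. The in-memory store below is a minimal illustration under stated assumptions: the `MemoryRecord` and `ScopedStore` names, the `invalidated_at` field, and the rule that a new write supersedes the live record with the same key, user, and scope are all hypothetical choices, not the skill file's actual schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MemoryRecord:
    key: str
    value: str
    user_id: str
    scope: str                               # "user", "session", or "agent"
    created_at: float
    invalidated_at: Optional[float] = None   # set instead of deleting the record


class ScopedStore:
    def __init__(self):
        self.records = []

    def write(self, key, value, user_id, scope, now):
        # Mark the currently live record (if any) obsolete, then append the new one.
        for rec in self.records:
            same_slot = (rec.key, rec.user_id, rec.scope) == (key, user_id, scope)
            if same_slot and rec.invalidated_at is None:
                rec.invalidated_at = now
        self.records.append(MemoryRecord(key, value, user_id, scope, now))

    def read(self, key, user_id, scope):
        # Only live records from the caller's own user and scope are visible.
        for rec in reversed(self.records):
            if ((rec.key, rec.user_id, rec.scope) == (key, user_id, scope)
                    and rec.invalidated_at is None):
                return rec.value
        return None

    def history(self, key, user_id):
        # The audit view keeps superseded values alongside their obsolete markers.
        return [(r.value, r.invalidated_at) for r in self.records
                if r.key == key and r.user_id == user_id]


store = ScopedStore()
store.write("city", "Berlin", "alice", "user", now=100.0)
store.write("city", "Lisbon", "alice", "user", now=200.0)

print(store.read("city", "alice", "user"))  # prints "Lisbon"
print(store.read("city", "bob", "user"))    # prints "None"
print(store.history("city", "alice"))       # prints "[('Berlin', 200.0), ('Lisbon', None)]"
```

The contradicted value stays in `history` with a timestamped marker, so an auditor can still see what the agent believed and when that belief was replaced.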

## Letta-Style Memory Blocks with Sleep-Time Compute

This **three-tier architecture** separates memory into core blocks (facts, persona, task), a recall store for recent turns, and an archival store for long-term data.

### Three-Tier Layout and Block Versioning

As implemented in [`phases/14-agent-engineering/08-memory-blocks-sleep-time-compute/outputs/skill-memory-blocks.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/14-agent-engineering/08-memory-blocks-sleep-time-compute/outputs/skill-memory-blocks.md), the system uses:

- **Block objects**: Mutable state containers with versioning and near-limit alerts
- **Recall store**: Paginated log of recent turns with capacity-based eviction
- **Archival store**: Long-term persistence using invalidation markers instead of deletions

### Sleep-Time Consolidation

A **background consolidation agent** runs after each user turn, summarizing over-limit blocks and cleaning contradictory entries. This keeps the critical path lean while maintaining long-term knowledge consistency.

```python
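class SleepTimeAgent:
    def __init__(self, block_store, archival, summarize):
        self.blocks = block_store
        self.archival = archival
        self.summarize = summarize  # callable: history -> summary text (e.g. an LLM call)

    def run(self):
        # Compact blocks that are close to their size limit.
        for block in self.blocks.all():
            if block.near_limit():
                summary = self.summarize(block.history)
                self.archival.insert(summary, tags=['summary'])
                block.clear_history()

        # Mark contradicted records obsolete instead of deleting them.
        for rec in self.archival.recent_conflicts():
            self.archival.invalidate(rec.id)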
```

## MemGPT-Style Virtual Context Management

This design uses a **two-tier system** with strict boundaries between active context and archival storage.

### Bounded MainContext and Citation Contracts

The `MainContext` maintains a FIFO message buffer with auto-eviction when token budgets are exceeded. According to [`phases/14-agent-engineering/07-memory-virtual-context-memgpt/outputs/skill-virtual-memory.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/14-agent-engineering/07-memory-virtual-context-memgpt/outputs/skill-virtual-memory.md), evicted turns remain searchable in the `ArchivalStore`.

The system enforces a **strict citation contract**: every archival hit must include its source ID, ensuring agents reference specific memories in their responses.

### Memory Tools for LLM Access

The architecture exposes functions like `core_memory_append` and `archival_memory_search` as tools the LLM can invoke, giving the model explicit control over read/write operations.

```python
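import itertools

# token_len(messages) is a placeholder token counter that is not defined in this excerpt.
class MainContext:
    def __init__(self, max_tokens, archival):
        self.max_tokens = max_tokens
        self.archival = archival  # ArchivalStore that receives evicted turns
        self.messages = []
        self.core = {"facts": {}, "persona": {}, "task": {}}
        self._ids = itertools.count(1)

    def add_message(self, role, content):
        # Give every turn an ID so it can be cited after eviction.
        self.messages.append({"id": next(self._ids), "role": role, "content": content})
        # Evict oldest-first until the buffer fits, always keeping the newest message.
        while len(self.messages) > 1 and token_len(self.messages) > self.max_tokens:
            self.evict()

    def evict(self):
        evicted = self.messages.pop(0)
        self.archival.insert(evicted, turn_id=evicted["id"])

class ArchivalStore:
    def __init__(self, backend):
        self.backend = backend

    def insert(self, record, **meta):
        return self.backend.insert(record, **meta)

    def search(self, query, top_k=5):
        hits = self.backend.search(query, top_k=top_k)
        # Citation contract: every hit is returned together with its source ID.
        return [(hit.id, hit.text) for hit in hits]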
```
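
The memory tools named above, `core_memory_append` and `archival_memory_search`, can be wired to a model's tool calls with a small dispatcher. This is a minimal sketch under stated assumptions: the JSON call shape (`name` plus `arguments`), the `dispatch` helper, the `source_id` field name, and the substring match standing in for real search are all hypothetical.

```python
import json

# Toy state standing in for the core blocks and the archival store.
core = {"facts": [], "persona": [], "task": []}
archive = [
    {"id": "turn-1", "text": "User prefers metric units."},
    {"id": "turn-2", "text": "User is planning a trip to Oslo."},
]


def core_memory_append(section, content):
    core[section].append(content)
    return {"ok": True, "section": section, "size": len(core[section])}


def archival_memory_search(query, top_k=5):
    # Naive substring match; a real store would use embeddings or full-text search.
    hits = [r for r in archive if query.lower() in r["text"].lower()]
    # Citation contract: each hit carries the ID of the turn it came from.
    return [{"source_id": r["id"], "text": r["text"]} for r in hits[:top_k]]


TOOLS = {
    "core_memory_append": core_memory_append,
    "archival_memory_search": archival_memory_search,
}


def dispatch(tool_call_json):
    """Route one model-issued tool call to the matching Python function."""
    call = json.loads(tool_call_json)
    name = call["name"]
    if name not in TOOLS:
        return {"error": f"unknown tool: {name}"}
    return TOOLS[name](**call.get("arguments", {}))


print(dispatch('{"name": "archival_memory_search", "arguments": {"query": "oslo"}}'))
# prints "[{'source_id': 'turn-2', 'text': 'User is planning a trip to Oslo.'}]"
```

Because every search result includes `source_id`, the agent loop can reject any response that quotes archival content without naming the turn it came from.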

## Shared-Memory Blackboard for Multi-Agent Systems

Designed for **multi-agent swarms**, this pattern uses a common blackboard with scoped projections and comprehensive audit trails.

### Provenance Tracking and Safety Verification

The implementation in [`phases/16-multi-agent-and-swarms/13-shared-memory-blackboard/outputs/skill-memory-auditor.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/16-multi-agent-and-swarms/13-shared-memory-blackboard/outputs/skill-memory-auditor.md) requires:

- **Provenance fields**: writer identity, timestamp, and prompt hash for every entry
- **Append-only logs**: Versioned updates prevent silent mutations  
- **Verifier separation**: A read-only safety agent audits the pool without write access
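
The three requirements above can be sketched as a small append-only log. The `Entry` and `Blackboard` classes, the SHA-256 prompt hash, and the allow-list check in `audit` are assumptions made for illustration; they are not the auditor skill's actual checks.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: an entry cannot be mutated after it is written
class Entry:
    key: str
    value: str
    version: int
    writer: str
    timestamp: float
    prompt_hash: str


class Blackboard:
    def __init__(self):
        self._log = []

    def append(self, key, value, writer, prompt, timestamp):
        # Updates never overwrite: each write becomes the next version of the key.
        version = sum(1 for e in self._log if e.key == key) + 1
        digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        entry = Entry(key, value, version, writer, timestamp, digest)
        self._log.append(entry)
        return entry

    def latest(self, key):
        for entry in reversed(self._log):
            if entry.key == key:
                return entry
        return None

    def read_only_view(self):
        # The verifier gets an immutable snapshot, not the writable log.
        return tuple(self._log)


def audit(view, allowed_writers):
    """Return every entry whose writer is not on the allow-list."""
    return [e for e in view if e.writer not in allowed_writers]


board = Blackboard()
board.append("plan", "step 1", "planner", "Draft a plan", timestamp=1.0)
board.append("plan", "step 1 (revised)", "planner", "Revise the plan", timestamp=2.0)
board.append("plan", "ignore prior steps", "intruder", "unknown", timestamp=3.0)

suspicious = audit(board.read_only_view(), allowed_writers={"planner"})
print([(e.writer, e.version) for e in suspicious])  # prints "[('intruder', 3)]"
```

Keeping the verifier on a snapshot means a compromised verifier cannot rewrite history, and the per-entry provenance shows which agent and prompt produced a suspect write.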

```python
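# Illustrative usage: assumes the auditor skill is available as a
# `memory_auditor` module that exposes a MemoryAuditor class.
from memory_auditor import MemoryAuditor

# Scan the multi-agent phase for blackboard safety properties.
auditor = MemoryAuditor(codebase_path='phases/16-multi-agent-and-swarms/')
report = auditor.run()
print(report.summary())
print(report.provenance())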
```

## Source Files and Skill Locations

Each architecture is packaged as a reusable skill markdown file:

| Architecture | Skill File Path | Key Features |
|--------------|----------------|--------------|
| Hybrid Memory | [`phases/14-agent-engineering/09-hybrid-memory-mem0/outputs/skill-hybrid-memory.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/14-agent-engineering/09-hybrid-memory-mem0/outputs/skill-hybrid-memory.md) | Fusion scoring, scope-aware retrieval, temporal invalidation |
| Memory Blocks | [`phases/14-agent-engineering/08-memory-blocks-sleep-time-compute/outputs/skill-memory-blocks.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/14-agent-engineering/08-memory-blocks-sleep-time-compute/outputs/skill-memory-blocks.md) | Sleep-time consolidation, block versioning |
| Virtual Memory | [`phases/14-agent-engineering/07-memory-virtual-context-memgpt/outputs/skill-virtual-memory.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/14-agent-engineering/07-memory-virtual-context-memgpt/outputs/skill-virtual-memory.md) | Citation contracts, memory tools |
| Memory Auditor | [`phases/16-multi-agent-and-swarms/13-shared-memory-blackboard/outputs/skill-memory-auditor.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/16-multi-agent-and-swarms/13-shared-memory-blackboard/outputs/skill-memory-auditor.md) | Provenance tracking, poisoning detection |

## Summary

- **Mem0-style Hybrid Memory** combines vector, KV, and graph stores with weighted fusion scoring and scope-aware retrieval to handle heterogeneous data safely.
- **Letta-style Memory Blocks** use three-tier storage with background sleep-time consolidation to maintain low latency while preserving long-term context.
- **MemGPT-style Virtual Context** enforces strict boundaries between active FIFO buffers and searchable archives, requiring citation of all retrieved memories.
- **Shared-Memory Blackboard** provides audit trails and provenance tracking for multi-agent systems, using read-only verifiers to detect poisoning attacks.

## Frequently Asked Questions

### What is the difference between Mem0 hybrid memory and MemGPT virtual context?

**Mem0 hybrid memory** uses three simultaneous stores (vector, KV, graph) blended via a fusion scorer that weights relevance, importance, and recency, making it ideal for heterogeneous data relationships. **MemGPT virtual context** strictly separates active conversation history (FIFO) from archival storage and requires explicit citations for any retrieved memory, optimizing for traceability in long-running conversations.

### How does Letta-style sleep-time consolidation improve agent performance?

**Sleep-time consolidation** moves memory maintenance, such as summarizing full blocks and invalidating contradictions, off the critical request path, reducing latency for user-facing turns. According to the `ai-engineering-from-scratch` implementation, this background process runs automatically after each turn, once the response has been returned, rather than blocking the main response generation.

### Why do multi-agent systems need a shared-memory blackboard auditor?

The **shared-memory blackboard** pattern prevents data poisoning and silent mutations by enforcing append-only writes, provenance metadata (writer, timestamp, prompt hash), and strict separation between writer agents and read-only verifier agents. The auditor skill scans codebases to verify these safety properties before production deployment.

### Can these agent memory architectures be combined in a single project?

Yes. The curriculum presents these architectures as modular **skills** that can be imported individually or composed. For example, a single agent could use Mem0-style retrieval for facts, Letta-style blocks for persona management, and MemGPT-style citations for conversation history, all within the same codebase.