# How AgentMemory Enables Knowledge Graph Extraction from Raw Observations

> Discover how AgentMemory extracts knowledge graphs from raw observations: an LLM-powered entity and relation extraction pipeline feeds a hybrid retrieval engine that combines vector similarity with graph traversal.

- Repository: [Rohit Ghumare/agentmemory](https://github.com/rohitg00/agentmemory)
- Tags: how-to-guide
- Published: 2026-05-10

---

**AgentMemory embeds a knowledge graph extraction pipeline that compresses coding observations, extracts entities and relations with an LLM, and stores them in dedicated KV stores for hybrid semantic retrieval.**

The `rohitg00/agentmemory` repository implements a complete knowledge graph extraction system that transforms ephemeral coding session observations into persistent, traversable graph structures. By integrating observation compression, LLM‑based entity extraction, and hybrid search capabilities, AgentMemory allows AI agents to retrieve context through both vector similarity and explicit relationship traversal.

## The Three‑Stage Extraction Pipeline

### Stage 1: Observation Compression and Triggering

The pipeline begins when raw coding observations are compressed to preserve semantic entities while reducing token volume. Immediately after compression, the system triggers the `mem::graph-extract` function via the event hook defined in [`src/triggers/events.ts`](https://github.com/rohitg00/agentmemory/blob/main/src/triggers/events.ts). This trigger passes the compressed observation batch to the extraction engine, ensuring that every meaningful session is automatically processed without manual intervention.
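The hand-off from compression to extraction can be sketched as a fire-and-forget trigger. This is an illustration only: the `TriggerSdk` interface, the `CompressedObservation` shape, the `onObservationsCompressed` helper, and the payload layout are assumptions, not the code in `src/triggers/events.ts`. Only the function name `mem::graph-extract` comes from the repository description.

```typescript
// Hypothetical minimal SDK surface; the real signature may differ.
interface TriggerSdk {
  triggerVoid(functionId: string, payload: unknown): void;
}

// Assumed shape of a compressed observation.
interface CompressedObservation {
  sessionId: string;
  text: string;
}

// Called once a batch has been compressed: hand it to graph extraction
// without waiting for a result, and skip empty batches.
function onObservationsCompressed(
  sdk: TriggerSdk,
  batch: CompressedObservation[],
): boolean {
  if (batch.length === 0) return false;
  sdk.triggerVoid("mem::graph-extract", { observations: batch });
  return true;
}

// Recording stub standing in for the real SDK.
const calls: Array<{ functionId: string; payload: unknown }> = [];
const stubSdk: TriggerSdk = {
  triggerVoid: (functionId, payload) => {
    calls.push({ functionId, payload });
  },
};

onObservationsCompressed(stubSdk, [
  { sessionId: "s1", text: "edited src/index.ts" },
]);
console.log(calls[0]?.functionId);
```

Because the trigger does not return a result, the compression path is not blocked by extraction latency in this sketch.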

### Stage 2: LLM‑Based Entity and Relation Extraction

The core extraction logic resides in [`src/functions/graph.ts`](https://github.com/rohitg00/agentmemory/blob/main/src/functions/graph.ts). This module loads the `GRAPH_EXTRACTION_SYSTEM` prompt from [`src/prompts/graph-extraction.ts`](https://github.com/rohitg00/agentmemory/blob/main/src/prompts/graph-extraction.ts) and sends the compressed observation to the configured LLM. The prompt instructs the model to return a strict JSON schema containing:
- **Entities (nodes)** – semantic objects identified in the text
- **Relationships (edges)** – directional connections between entities

The LLM output is parsed and validated before being persisted. Each extraction event is recorded via `recordAudit`, so changes to graph state remain traceable.
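The parse-and-validate step might look like the sketch below. The real schema is defined by the `GRAPH_EXTRACTION_SYSTEM` prompt and is not reproduced here, so the field names (`entities`, `relationships`, `name`, `type`, `source`, `target`) and the `parseGraphExtraction` function are assumptions chosen for illustration.

```typescript
// Hypothetical shapes; the real schema is defined by GRAPH_EXTRACTION_SYSTEM.
interface GraphEntity {
  name: string;
  type: string;
}

interface GraphRelationship {
  source: string; // name of the entity the edge starts from
  target: string; // name of the entity the edge points to
  type: string;
}

interface GraphExtraction {
  entities: GraphEntity[];
  relationships: GraphRelationship[];
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Parse raw LLM text and keep only well-formed nodes and edges.
// Returns null when the text is not a JSON object at all.
function parseGraphExtraction(raw: string): GraphExtraction | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return null;
  }
  if (!isRecord(parsed)) return null;

  const rawEntities = Array.isArray(parsed.entities) ? parsed.entities : [];
  const rawRelationships = Array.isArray(parsed.relationships)
    ? parsed.relationships
    : [];

  const entities: GraphEntity[] = [];
  for (const e of rawEntities) {
    if (isRecord(e) && typeof e.name === "string" && typeof e.type === "string") {
      entities.push({ name: e.name, type: e.type });
    }
  }

  // Drop edges that reference entities missing from this extraction.
  const known = new Set(entities.map((e) => e.name));
  const relationships: GraphRelationship[] = [];
  for (const r of rawRelationships) {
    if (
      isRecord(r) &&
      typeof r.source === "string" &&
      typeof r.target === "string" &&
      typeof r.type === "string" &&
      known.has(r.source) &&
      known.has(r.target)
    ) {
      relationships.push({ source: r.source, target: r.target, type: r.type });
    }
  }
  return { entities, relationships };
}
```

Filtering out edges whose endpoints are unknown is a design choice of this sketch: it keeps dangling references from reaching storage even when the model returns a partially inconsistent answer.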

### Stage 3: Structured Storage in KV Stores

Extracted graph data is persisted in two dedicated key‑value stores defined in [`src/state/schema.ts`](https://github.com/rohitg00/agentmemory/blob/main/src/state/schema.ts):
- `KV.graphNodes` maps to the `mem:graph:nodes` namespace
- `KV.graphEdges` maps to the `mem:graph:edges` namespace

This storage design separates node metadata from edge relationships while maintaining referential integrity, allowing O(1) lookups for individual entities and efficient traversal operations.
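To make the two-namespace layout concrete, here is a minimal sketch that uses an in-memory map as a stand-in for the KV store. The namespace names match those listed above; the `InMemoryKV` class, the record shapes, the key formats, and the `neighbors` helper are assumptions and do not mirror AgentMemory's actual state API.

```typescript
// In-memory stand-in for a namespaced KV store; not AgentMemory's real state API.
class InMemoryKV {
  private data = new Map<string, Map<string, unknown>>();

  set(namespace: string, key: string, value: unknown): void {
    let ns = this.data.get(namespace);
    if (!ns) {
      ns = new Map<string, unknown>();
      this.data.set(namespace, ns);
    }
    ns.set(key, value);
  }

  get<T>(namespace: string, key: string): T | undefined {
    return this.data.get(namespace)?.get(key) as T | undefined;
  }

  list<T>(namespace: string): T[] {
    const ns = this.data.get(namespace);
    return ns ? (Array.from(ns.values()) as T[]) : [];
  }
}

// Namespace names come from the article; the record shapes are assumptions.
const KV = {
  graphNodes: "mem:graph:nodes",
  graphEdges: "mem:graph:edges",
} as const;

interface StoredNode {
  id: string;
  name: string;
  type: string;
}

interface StoredEdge {
  id: string;
  sourceId: string;
  targetId: string;
  type: string;
}

function saveNode(kv: InMemoryKV, node: StoredNode): void {
  kv.set(KV.graphNodes, node.id, node);
}

function saveEdge(kv: InMemoryKV, edge: StoredEdge): void {
  kv.set(KV.graphEdges, edge.id, edge);
}

// One-hop traversal: scan the edge namespace, then resolve each target by key.
function neighbors(kv: InMemoryKV, nodeId: string): StoredNode[] {
  return kv
    .list<StoredEdge>(KV.graphEdges)
    .filter((e) => e.sourceId === nodeId)
    .map((e) => kv.get<StoredNode>(KV.graphNodes, e.targetId))
    .filter((n): n is StoredNode => n !== undefined);
}
```

In this sketch a node lookup is a single keyed read, while a traversal scans the edge namespace; a production store would likely index edges by source to avoid the scan.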

## Hybrid Retrieval Architecture

### Graph‑Enabled Search Integration

Once persisted, graph entities participate in the hybrid search engine through the `GraphRetrieval` class imported in [`src/state/hybrid-search.ts`](https://github.com/rohitg00/agentmemory/blob/main/src/state/hybrid-search.ts). This class provides two primary retrieval mechanisms:
- **`searchByEntities`** – Retrieves memories that share entities with the current query
- **`expandFromChunks`** – Traverses linked chunks to expand result sets via relationship hops

The system calculates a `graphScore` for each candidate memory and merges it with vector search scores using the configurable weight parameter `AGENTMEMORY_GRAPH_WEIGHT` (default `0.3`), defined in [`src/index.ts`](https://github.com/rohitg00/agentmemory/blob/main/src/index.ts). This merge produces a final ranked list that balances semantic similarity with topological relevance.
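The merge step could look like the following sketch. The article does not state the exact formula, so the linear blend used here is an assumption, as are the `Candidate` shape, the `mergeScores` function, and the premise that both scores are normalized to the range 0 to 1.

```typescript
interface Candidate {
  id: string;
  vectorScore: number; // assumed normalized to [0, 1]
  graphScore: number; // assumed normalized to [0, 1]
}

interface Ranked extends Candidate {
  finalScore: number;
}

// Assumed merge rule: a convex combination weighted by graphWeight.
// The weight is clamped to [0, 1] so neither signal gets a negative share.
function mergeScores(candidates: Candidate[], graphWeight = 0.3): Ranked[] {
  const w = Math.min(1, Math.max(0, graphWeight));
  return candidates
    .map((c) => ({
      ...c,
      finalScore: (1 - w) * c.vectorScore + w * c.graphScore,
    }))
    .sort((a, b) => b.finalScore - a.finalScore);
}

const ranked = mergeScores([
  { id: "mem-a", vectorScore: 0.9, graphScore: 0.0 },
  { id: "mem-b", vectorScore: 0.6, graphScore: 1.0 },
]);
console.log(ranked.map((r) => r.id));
```

With the default weight of 0.3, `mem-b` scores about 0.72 and `mem-a` about 0.63, so a memory with a weaker vector match can still rank first when it is strongly connected in the graph.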

## REST API Endpoints

### Core Graph Operations

AgentMemory exposes graph functionality through REST endpoints registered in [`src/triggers/api.ts`](https://github.com/rohitg00/agentmemory/blob/main/src/triggers/api.ts):
- **`/agentmemory/graph/extract`** – Accepts compressed observations and triggers entity extraction
- **`/agentmemory/graph/query`** – Executes the `mem::graph-query` function for graph‑based retrieval
- **`/agentmemory/graph/stats`** – Returns node and edge count statistics for monitoring

### Practical Usage Examples

Extract a knowledge graph from observations:

```typescript
await fetch("https://your-host/agentmemory/graph/extract", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ observations: compressedObservations }),
});
```

Query the graph for specific entities:

```typescript
const response = await fetch("https://your-host/agentmemory/graph/query", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ query: "class Diagram" })
});
const result = await response.json();
// result.graphNodes and result.graphEdges contain the matching subgraph
```

Inspect graph statistics:

```typescript
// fetch resolves to a Response; call json() on it, not on its body stream
const response = await fetch("https://your-host/agentmemory/graph/stats");
console.log(await response.json());
// → { nodeCounts: {...}, edgeCounts: {...} }
```

## Observability and Configuration

The system tracks extraction activity through a Prometheus counter named `graphExtraction`, implemented in [`src/telemetry/setup.ts`](https://github.com/rohitg00/agentmemory/blob/main/src/telemetry/setup.ts). Watching this counter gives a view of pipeline health and throughput in production deployments.

Configuration is controlled via environment variables, with `AGENTMEMORY_GRAPH_WEIGHT` in [`src/index.ts`](https://github.com/rohitg00/agentmemory/blob/main/src/index.ts) determining the influence of graph scores in hybrid retrieval (default 0.3, valid range 0.0–1.0).
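A defensive way to read this setting is sketched below. The default of 0.3 and the 0.0 to 1.0 range come from the text above; the `parseGraphWeight` function and its fall-back-to-default behavior for invalid input are assumptions, and the real code in `src/index.ts` may clamp or reject such values instead.

```typescript
const DEFAULT_GRAPH_WEIGHT = 0.3;

// Parse a raw environment value, falling back to the default when it is
// missing, blank, non-numeric, or outside the 0.0 to 1.0 range.
function parseGraphWeight(raw: string | undefined): number {
  if (raw === undefined || raw.trim() === "") return DEFAULT_GRAPH_WEIGHT;
  const value = Number(raw);
  if (!Number.isFinite(value) || value < 0 || value > 1) {
    return DEFAULT_GRAPH_WEIGHT;
  }
  return value;
}
```

In a Node.js process this would typically be called as `parseGraphWeight(process.env.AGENTMEMORY_GRAPH_WEIGHT)` at startup.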

## Summary

- **Automated Triggering**: The `mem::graph-extract` function in [`src/functions/graph.ts`](https://github.com/rohitg00/agentmemory/blob/main/src/functions/graph.ts) is invoked automatically after observation compression via [`src/triggers/events.ts`](https://github.com/rohitg00/agentmemory/blob/main/src/triggers/events.ts).
- **LLM‑Driven Extraction**: The system prompt `GRAPH_EXTRACTION_SYSTEM` in [`src/prompts/graph-extraction.ts`](https://github.com/rohitg00/agentmemory/blob/main/src/prompts/graph-extraction.ts) forces structured JSON output of entities and relations.
- **Dedicated Storage**: Nodes and edges are stored separately in `mem:graph:nodes` and `mem:graph:edges` KV namespaces defined in [`src/state/schema.ts`](https://github.com/rohitg00/agentmemory/blob/main/src/state/schema.ts).
- **Hybrid Retrieval**: The `GraphRetrieval` class in [`src/state/hybrid-search.ts`](https://github.com/rohitg00/agentmemory/blob/main/src/state/hybrid-search.ts) merges graph traversal scores with vector search using `AGENTMEMORY_GRAPH_WEIGHT`.
- **REST Interface**: Extraction, query, and statistics operations are available via the `/agentmemory/graph/extract`, `/agentmemory/graph/query`, and `/agentmemory/graph/stats` endpoints in [`src/triggers/api.ts`](https://github.com/rohitg00/agentmemory/blob/main/src/triggers/api.ts).

## Frequently Asked Questions

### What triggers knowledge graph extraction in AgentMemory?

Knowledge graph extraction is triggered automatically after the observation compression subsystem processes raw coding session data. The event hook in [`src/triggers/events.ts`](https://github.com/rohitg00/agentmemory/blob/main/src/triggers/events.ts) calls `sdk.triggerVoid` to invoke the `mem::graph-extract` function, ensuring that every compressed observation batch is processed without requiring manual API calls.

### How does AgentMemory store extracted graph entities?

Extracted graph data is stored in two dedicated KV store namespaces defined in [`src/state/schema.ts`](https://github.com/rohitg00/agentmemory/blob/main/src/state/schema.ts): `mem:graph:nodes` for entity metadata and `mem:graph:edges` for relationship data. This separation allows for efficient traversal queries while maintaining a clear schema contract for downstream consumers.

### How does graph extraction integrate with vector search?

AgentMemory implements hybrid search in [`src/state/hybrid-search.ts`](https://github.com/rohitg00/agentmemory/blob/main/src/state/hybrid-search.ts) where the `GraphRetrieval` class calculates a `graphScore` based on entity overlap and relationship proximity. This score is merged with vector similarity scores using the `AGENTMEMORY_GRAPH_WEIGHT` parameter (default 0.3) configured in [`src/index.ts`](https://github.com/rohitg00/agentmemory/blob/main/src/index.ts), producing results that combine semantic and topological relevance.

### Can I query the knowledge graph independently of memory retrieval?

Yes, the REST endpoint `/agentmemory/graph/query` exposed in [`src/triggers/api.ts`](https://github.com/rohitg00/agentmemory/blob/main/src/triggers/api.ts) allows direct subgraph queries against the stored node and edge collections. You can also inspect graph health and size metrics via the `/agentmemory/graph/stats` endpoint, which returns node and edge counts without requiring a full memory search operation.