# How agentmemory Hybrid Search Combines BM25, Vector Embeddings, and Knowledge Graphs

> Discover how agentmemory's hybrid search unifies BM25, vector embeddings, and knowledge graphs using RRF. Learn how it dynamically adjusts weights for optimal performance, even with missing indices.

- Repository: [Rohit Ghumare/agentmemory](https://github.com/rohitg00/agentmemory)
- Tags: deep-dive
- Published: 2026-05-10

---

**TLDR:** Agentmemory's hybrid search orchestrates BM25 lexical matching, dense vector similarity, and knowledge graph traversal through Reciprocal Rank Fusion (RRF) in [`src/state/hybrid-search.ts`](https://github.com/rohitg00/agentmemory/blob/main/src/state/hybrid-search.ts), dynamically adjusting signal weights for graceful degradation when indices are unavailable.

The agentmemory repository provides a TypeScript-based memory system for AI agents that unifies multiple retrieval paradigms into a single pipeline. By combining traditional lexical scoring with semantic embeddings and relational graph data, the hybrid search implementation delivers comprehensive results that capture both keyword relevance and conceptual relationships.

## The Three Retrieval Signals in agentmemory Hybrid Search

### BM25 Lexical Matching

The **SearchIndex** class in [`src/state/search-index.ts`](https://github.com/rohitg00/agentmemory/blob/main/src/state/search-index.ts) provides fast lexical retrieval using BM25 scoring. When a query arrives, the system executes `this.bm25.search(query, limit * 2)` to retrieve candidate observation IDs ranked by term frequency and inverse document frequency.
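
To make the term-frequency and inverse-document-frequency idea concrete, the following is a minimal sketch of textbook BM25 scoring. It is not the code in `search-index.ts`: the `Doc` type, the `tokenize` helper, the `bm25Search` function, and the `k1`/`b` defaults are all assumptions chosen for illustration.

```typescript
// Textbook BM25 over naively tokenized documents (illustrative sketch only).
type Doc = { id: string; text: string };

// Hypothetical tokenizer: lowercase and split on non-word characters.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/\W+/).filter(Boolean);
}

function bm25Search(
  docs: Doc[],
  query: string,
  limit: number,
  k1 = 1.2, // term-frequency saturation (assumed default)
  b = 0.75, // length normalization strength (assumed default)
): { id: string; score: number }[] {
  const tokenized = docs.map((d) => ({ id: d.id, terms: tokenize(d.text) }));
  const total = tokenized.length;
  const avgLen =
    tokenized.reduce((sum, d) => sum + d.terms.length, 0) / Math.max(total, 1);
  const queryTerms = Array.from(new Set(tokenize(query)));

  const scored = tokenized.map((d) => {
    let score = 0;
    for (const term of queryTerms) {
      const tf = d.terms.filter((t) => t === term).length;
      if (tf === 0) continue;
      // Number of documents containing the term.
      const n = tokenized.filter((x) => x.terms.indexOf(term) !== -1).length;
      const idf = Math.log(1 + (total - n + 0.5) / (n + 0.5));
      const lengthNorm = 1 - b + b * (d.terms.length / avgLen);
      score += (idf * tf * (k1 + 1)) / (tf + k1 * lengthNorm);
    }
    return { id: d.id, score };
  });

  // Keep only matching documents, best first.
  return scored
    .filter((r) => r.score > 0)
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}
```

Documents that repeat the query terms score higher, but with diminishing returns because of the `k1` saturation term.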

### Dense Vector Similarity

Semantic matching is handled by the **VectorIndex** class ([`src/state/vector-index.ts`](https://github.com/rohitg00/agentmemory/blob/main/src/state/vector-index.ts)) paired with an **EmbeddingProvider**. The hybrid search pipeline embeds queries using `embeddingProvider.embed()` and performs cosine-similarity searches via `this.vector.search()` to find conceptually similar content beyond exact keyword matches.
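
As a rough illustration of what a cosine-similarity search does, here is a small brute-force sketch. The `cosineSimilarity` and `nearest` helpers and the entry shape are hypothetical and are not taken from `vector-index.ts`, which may store and search vectors differently.

```typescript
// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  // Treat zero vectors as having no similarity to anything.
  return denom === 0 ? 0 : dot / denom;
}

// Hypothetical brute-force search: score every stored vector, keep the best.
function nearest(
  query: Float32Array,
  entries: { id: string; vector: Float32Array }[],
  limit: number,
): { id: string; score: number }[] {
  return entries
    .map((e) => ({ id: e.id, score: cosineSimilarity(query, e.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}
```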

### Knowledge Graph Retrieval

Entity-centric retrieval runs through the **GraphRetrieval** module in [`src/functions/graph-retrieval.ts`](https://github.com/rohitg00/agentmemory/blob/main/src/functions/graph-retrieval.ts). The system first extracts entities from queries using `extractEntitiesFromQuery()`, then retrieves relevant observations via `GraphRetrieval.searchByEntities()` based on graph connectivity and entity relationships.

## How the Hybrid Search Pipeline Works

The `HybridSearch` class in [`src/state/hybrid-search.ts`](https://github.com/rohitg00/agentmemory/blob/main/src/state/hybrid-search.ts) orchestrates a seven-step retrieval process that fuses these three signals into unified result rankings.

### Parallel Retrieval Across Indices

The pipeline initiates three concurrent retrieval operations:

- **BM25 scoring** retrieves observation IDs with lexical relevance scores
- **Vector scoring** computes embedding similarities when a vector index exists
- **Graph retrieval** fetches entity-linked observations based on extracted query entities
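
The concurrent fan-out described above can be sketched with `Promise.all`. The `Hit` and `Retriever` types and the `retrieveAll` function are hypothetical stand-ins, not the repository's API; the `limit * 2` over-fetch mirrors the BM25 call quoted earlier and is only assumed for the other two signals.

```typescript
type Hit = { obsId: string; score: number };
type Retriever = (query: string, limit: number) => Promise<Hit[]>;

// Run the three retrievers concurrently. Optional signals that are missing
// or that fail contribute an empty list instead of aborting the search.
async function retrieveAll(
  query: string,
  limit: number,
  bm25: Retriever,
  vector: Retriever | null,
  graph: Retriever | null,
): Promise<{ bm25Hits: Hit[]; vectorHits: Hit[]; graphHits: Hit[] }> {
  const none = Promise.resolve<Hit[]>([]);
  const [bm25Hits, vectorHits, graphHits] = await Promise.all([
    bm25(query, limit * 2),
    vector ? vector(query, limit * 2).catch((): Hit[] => []) : none,
    graph ? graph(query, limit * 2).catch((): Hit[] => []) : none,
  ]);
  return { bm25Hits, vectorHits, graphHits };
}
```

Swallowing errors from the optional signals is a design choice made for this sketch; a real pipeline might prefer to log them.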

### Graph Expansion from Vector Neighborhoods

The system enhances graph coverage by expanding from semantic results. The top five vector results feed into `graphRetrieval.expandFromChunks()`, discovering additional graph hits that neighbor the high-similarity vector observations, bridging semantic and relational relevance.
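
One way to picture this expansion step is as a neighbor lookup seeded by the best vector hits. The sketch below assumes a hypothetical adjacency map from observation ID to related observation IDs; the actual `expandFromChunks()` signature and graph representation are not shown in this article and may differ.

```typescript
type VectorHit = { obsId: string; score: number };

// Hypothetical graph: observation ID -> IDs of observations linked through
// shared entities or relationships.
type Neighbors = Map<string, string[]>;

// Take the top-N vector hits as seeds and collect their graph neighbors,
// skipping the seeds themselves and any duplicates.
function expandFromTopHits(
  vectorHits: VectorHit[],
  neighbors: Neighbors,
  topN = 5,
): string[] {
  const seeds = vectorHits
    .slice()
    .sort((a, b) => b.score - a.score)
    .slice(0, topN)
    .map((h) => h.obsId);
  const seedSet = new Set(seeds);
  const seen = new Set<string>();
  const expanded: string[] = [];
  for (const seed of seeds) {
    for (const candidate of neighbors.get(seed) ?? []) {
      if (seedSet.has(candidate) || seen.has(candidate)) continue;
      seen.add(candidate);
      expanded.push(candidate);
    }
  }
  return expanded;
}
```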

### Reciprocal Rank Fusion Scoring

All retrieval signals merge through **Reciprocal Rank Fusion (RRF)**. For each index, a result's rank is converted into a score of `1 / (RRF_K + rank)`. The **combined score** is the weighted sum of these per-index scores, with the signal weights normalized so they sum to 1, which makes the balance between lexical precision, semantic similarity, and relational context configurable.
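
A minimal sketch of weighted RRF follows. The `rrfFuse` function and its input shape are hypothetical; the value `60` for `RRF_K` is a commonly used default and an assumption here, as is the choice of 1-based ranks. The constant and weights used in `hybrid-search.ts` may differ.

```typescript
// Assumed constant; a larger value flattens the gap between adjacent ranks.
const RRF_K = 60;

type RankedList = { ids: string[]; weight: number }; // ids ordered best first

// Fuse several ranked lists into one ranking with weighted RRF.
function rrfFuse(
  lists: RankedList[],
  k: number = RRF_K,
): { id: string; score: number }[] {
  const totalWeight = lists.reduce((sum, l) => sum + l.weight, 0);
  if (totalWeight === 0) return [];

  const scores = new Map<string, number>();
  for (const list of lists) {
    const weight = list.weight / totalWeight; // normalized weights sum to 1
    list.ids.forEach((id, index) => {
      const rank = index + 1; // 1-based rank (assumption)
      scores.set(id, (scores.get(id) ?? 0) + weight / (k + rank));
    });
  }

  return Array.from(scores, ([id, score]) => ({ id, score })).sort(
    (a, b) => b.score - a.score,
  );
}
```

Because only ranks enter the formula, a BM25 score of 12.4 and a cosine similarity of 0.83 never need to be put on a common scale.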

### Diversification and Enrichment

Before final output, results undergo `diversifyBySession()` to prevent single-session dominance, then `enrichResults()` fetches full observation data from the KV store. If `RERANK_ENABLED` is active, the system passes top-N results through an LLM-based `rerank()` function for final relevance refinement.
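
Session diversification can be approached in several ways. The sketch below shows one simple option, a per-session cap with overflow backfill. It is an assumption about the general idea, not the logic of `diversifyBySession()`; the `capPerSession` name, the `Scored` type, and the `maxPerSession` parameter are hypothetical.

```typescript
type Scored = { obsId: string; sessionId: string; combinedScore: number };

// Keep at most maxPerSession results per session in the leading positions,
// then backfill with the overflow so the list can still reach the limit.
function capPerSession(
  results: Scored[],
  maxPerSession: number,
  limit: number,
): Scored[] {
  const sorted = results
    .slice()
    .sort((a, b) => b.combinedScore - a.combinedScore);
  const counts = new Map<string, number>();
  const kept: Scored[] = [];
  const overflow: Scored[] = [];
  for (const r of sorted) {
    const count = counts.get(r.sessionId) ?? 0;
    if (count < maxPerSession) {
      kept.push(r);
      counts.set(r.sessionId, count + 1);
    } else {
      overflow.push(r);
    }
  }
  return kept.concat(overflow).slice(0, limit);
}
```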

## Dynamic Signal Weighting and Graceful Degradation

Agentmemory's hybrid search allocates signal weights dynamically so that the pipeline stays usable across different index states. When the vector index is empty, its weight collapses to 0 and retrieval falls back to pure BM25, a behavior verified in [`test/hybrid-search.test.ts`](https://github.com/rohitg00/agentmemory/blob/main/test/hybrid-search.test.ts). Similarly, when graph retrieval returns no results, the graph signal receives zero weight. The pipeline therefore keeps working with whichever signals are available and draws on all three when every index is populated.
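
The weight collapse can be sketched as zeroing the weight of any signal without results and renormalizing the rest. The `Weights` type, the `effectiveWeights` function, and the base weights in the usage note are hypothetical; the repository's actual defaults are not stated in this article.

```typescript
type Weights = { bm25: number; vector: number; graph: number };

// Zero out signals that produced no results, then rescale so the remaining
// weights sum to 1.
function effectiveWeights(
  base: Weights,
  hasVectorHits: boolean,
  hasGraphHits: boolean,
): Weights {
  const raw: Weights = {
    bm25: base.bm25,
    vector: hasVectorHits ? base.vector : 0,
    graph: hasGraphHits ? base.graph : 0,
  };
  const total = raw.bm25 + raw.vector + raw.graph;
  if (total === 0) return { bm25: 0, vector: 0, graph: 0 };
  return {
    bm25: raw.bm25 / total,
    vector: raw.vector / total,
    graph: raw.graph / total,
  };
}
```

With assumed base weights of `{ bm25: 0.4, vector: 0.4, graph: 0.2 }`, an empty vector index and no graph hits give BM25 the full weight, while losing only the graph signal splits the weight evenly between BM25 and vector.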

## Implementation Example

```typescript
import { SearchIndex } from "./src/state/search-index.js";
import { VectorIndex } from "./src/state/vector-index.js";
import { HybridSearch } from "./src/state/hybrid-search.js";
import type { EmbeddingProvider } from "./src/types.js";

// 1️⃣ Build the BM25 index
const bm25 = new SearchIndex();
// …add observations (obs) via bm25.add(obs)

// 2️⃣ Build the vector index (optional)
const vector = new VectorIndex();
// Example embedding provider that returns random vectors
const mockProvider: EmbeddingProvider = {
  embed: async (text: string) => {
    const dim = 384;
    return Float32Array.from({ length: dim }, () => Math.random());
  },
};
// Add vectors for each observation
// vector.add(obs.id, obs.sessionId, await mockProvider.embed(obs.title));

// 3️⃣ Create the hybrid searcher
// Placeholder: supply your own StateKV implementation here.
declare const kv: ConstructorParameters<typeof HybridSearch>[3];
const hybrid = new HybridSearch(bm25, vector, mockProvider, kv);

// 4️⃣ Perform a search
const results = await hybrid.search("auth middleware");

// 5️⃣ Inspect results
for (const r of results) {
  console.log(
    `Obs ${r.observation.id} – BM25:${r.bm25Score.toFixed(2)} ` +
    `Vector:${r.vectorScore.toFixed(2)} Graph:${r.graphScore.toFixed(2)} ` +
    `Combined:${r.combinedScore.toFixed(2)}`,
  );
}
```

## Key Source Files and Architecture

- **[`src/state/hybrid-search.ts`](https://github.com/rohitg00/agentmemory/blob/main/src/state/hybrid-search.ts)** – Core orchestrator implementing RRF aggregation and pipeline coordination
- **[`src/state/search-index.ts`](https://github.com/rohitg00/agentmemory/blob/main/src/state/search-index.ts)** – BM25 lexical indexing and scoring implementation
- **[`src/state/vector-index.ts`](https://github.com/rohitg00/agentmemory/blob/main/src/state/vector-index.ts)** – Dense embedding storage and cosine-similarity search
- **[`src/functions/graph-retrieval.ts`](https://github.com/rohitg00/agentmemory/blob/main/src/functions/graph-retrieval.ts)** – Entity-based graph queries and neighborhood expansion
- **[`src/functions/query-expansion.ts`](https://github.com/rohitg00/agentmemory/blob/main/src/functions/query-expansion.ts)** – Query entity extraction for graph retrieval
- **[`test/hybrid-search.test.ts`](https://github.com/rohitg00/agentmemory/blob/main/test/hybrid-search.test.ts)** – Unit tests validating signal combination and fallback behavior

## Summary

- Agentmemory hybrid search unifies BM25 lexical matching, vector embeddings, and knowledge graph traversal in [`src/state/hybrid-search.ts`](https://github.com/rohitg00/agentmemory/blob/main/src/state/hybrid-search.ts)
- **Reciprocal Rank Fusion (RRF)** combines retrieval signals using rank-based scoring with configurable weights that normalize to 1
- **Dynamic weighting** enables graceful degradation to BM25-only or dual-mode retrieval when vector or graph indices are unavailable
- **Graph expansion** from top vector results bridges semantic similarity and relational entity data
- The pipeline includes session diversification, KV store enrichment via `enrichResults()`, and optional LLM-based re-ranking

## Frequently Asked Questions

### How does agentmemory handle missing vector or graph indices?

When vector or graph indices are empty or unavailable, the hybrid search automatically assigns zero weight to those signals and renormalizes so that the remaining weights sum to 1. This allows the system to fall back to pure BM25 retrieval without code changes, as demonstrated in the test suite's "returns BM25-only results when no vector index is provided" case.

### What is Reciprocal Rank Fusion (RRF) and why does agentmemory use it?

RRF is a rank aggregation method that converts each result's position in each ranked list (BM25, vector, graph) into a score using the formula `1 / (RRF_K + rank)`, then combines these scores across indices as a weighted sum. Agentmemory uses RRF because it needs no score calibration between heterogeneous indices, tolerates results that are missing from some lists, and supports configurable weighting without normalizing disparate score types against each other.

### How does the graph expansion step improve search results?

After identifying the top five vector results, the system calls `graphRetrieval.expandFromChunks()` to traverse the knowledge graph from those semantically similar entries. This discovers related observations that share entities or relationships with the vector hits but might not match the original query lexically or semantically, capturing contextual relationships that pure embedding similarity or keyword matching would miss.

### Can I disable specific search signals in agentmemory?

The implementation already collapses the weight of an empty index to zero, and you can also disable a signal explicitly by omitting the relevant index when instantiating `HybridSearch`. For example, passing `null` or an empty `VectorIndex` effectively disables vector search, so the RRF calculation relies solely on the BM25 and graph signals, or on BM25 alone if the graph also yields no results.