How agentmemory Hybrid Search Combines BM25, Vector Embeddings, and Knowledge Graphs

TLDR: Agentmemory's hybrid search orchestrates BM25 lexical matching, dense vector similarity, and knowledge graph traversal through Reciprocal Rank Fusion (RRF) in src/state/hybrid-search.ts, dynamically adjusting signal weights for graceful degradation when indices are unavailable.

The agentmemory repository provides a TypeScript-based memory system for AI agents that unifies multiple retrieval paradigms into a single pipeline. By combining traditional lexical scoring with semantic embeddings and relational graph data, the hybrid search implementation returns results that reflect keyword relevance, conceptual similarity, and entity relationships.

BM25 Lexical Matching

The SearchIndex class in src/state/search-index.ts provides fast lexical retrieval using BM25 scoring. When a query arrives, the system executes this.bm25.search(query, limit * 2) to retrieve candidate observation IDs, ranked by term frequency and inverse document frequency.
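To make the scoring concrete, here is a minimal BM25 sketch over a tiny in-memory corpus. It is not the repository's SearchIndex: the tokenizer, the k1 and b defaults, and the names Doc and bm25Search are assumptions chosen for illustration.

```typescript
// Illustrative BM25 scorer; not the repository's SearchIndex.
type Doc = { id: string; text: string };

const tokenize = (s: string): string[] =>
  s.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);

function bm25Search(docs: Doc[], query: string, limit: number, k1 = 1.2, b = 0.75) {
  const tokenized = docs.map((d) => ({ id: d.id, terms: tokenize(d.text) }));
  const avgLen =
    tokenized.reduce((n, d) => n + d.terms.length, 0) / Math.max(tokenized.length, 1);

  // Document frequency: in how many documents does each term appear?
  const df = new Map<string, number>();
  for (const d of tokenized) {
    for (const t of Array.from(new Set(d.terms))) df.set(t, (df.get(t) ?? 0) + 1);
  }

  const queryTerms = Array.from(new Set(tokenize(query)));
  const scored = tokenized.map((d) => {
    let score = 0;
    for (const q of queryTerms) {
      const tf = d.terms.filter((t) => t === q).length;
      if (tf === 0) continue;
      const n = df.get(q) ?? 0;
      // Rare terms get a higher inverse document frequency.
      const idf = Math.log(1 + (tokenized.length - n + 0.5) / (n + 0.5));
      // Term frequency saturates via k1 and is length-normalized via b.
      score += (idf * tf * (k1 + 1)) / (tf + k1 * (1 - b + (b * d.terms.length) / avgLen));
    }
    return { id: d.id, score };
  });

  return scored
    .filter((r) => r.score > 0)
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}
```

Requesting limit * 2 candidates, as the call above does, leaves headroom for the later fusion and diversification steps to drop or reorder results.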

Dense Vector Similarity

Semantic matching is handled by the VectorIndex class (src/state/vector-index.ts) paired with an EmbeddingProvider. The hybrid search pipeline embeds queries using embeddingProvider.embed() and performs cosine-similarity searches via this.vector.search() to find conceptually similar content beyond exact keyword matches.
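The cosine-similarity step can be sketched as a brute-force scan. This is an illustration only, not the repository's VectorIndex; the names cosineSimilarity and vectorSearch and the Map-based storage are assumptions.

```typescript
// Illustrative brute-force cosine search; not the repository's VectorIndex.
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  // Treat a zero vector as having no similarity to anything.
  return denom === 0 ? 0 : dot / denom;
}

function vectorSearch(entries: Map<string, Float32Array>, query: Float32Array, limit: number) {
  return Array.from(entries, ([id, vec]) => ({ id, score: cosineSimilarity(query, vec) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}
```

In a real pipeline the query vector would come from the embedding provider's embed() call rather than being constructed by hand.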

Knowledge Graph Retrieval

Entity-centric retrieval runs through the GraphRetrieval module in src/functions/graph-retrieval.ts. The system first extracts entities from queries using extractEntitiesFromQuery(), then retrieves relevant observations via GraphRetrieval.searchByEntities() based on graph connectivity and entity relationships.
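One way to picture entity-centric retrieval is an inverted index from entity names to observation IDs. The sketch below is hypothetical: how extractEntitiesFromQuery() and searchByEntities() actually work is not shown in this article, so the token-matching extraction, the EntityIndex type, and both function names here are assumptions.

```typescript
// Illustrative entity lookup; not the repository's GraphRetrieval.
type EntityIndex = Map<string, Set<string>>; // entity name -> observation IDs

// Naive extraction (assumption): keep query tokens that are known entity names.
function extractKnownEntities(query: string, index: EntityIndex): string[] {
  const tokens = query.toLowerCase().split(/\s+/).filter(Boolean);
  return Array.from(new Set(tokens.filter((t) => index.has(t))));
}

// Score each observation by how many of the query's entities it is linked to.
function searchByEntitiesSketch(entities: string[], index: EntityIndex) {
  const counts = new Map<string, number>();
  for (const entity of entities) {
    const ids = index.get(entity);
    if (!ids) continue;
    ids.forEach((obsId) => counts.set(obsId, (counts.get(obsId) ?? 0) + 1));
  }
  return Array.from(counts, ([id, score]) => ({ id, score })).sort(
    (x, y) => y.score - x.score,
  );
}
```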

How the Hybrid Search Pipeline Works

The HybridSearch class in src/state/hybrid-search.ts orchestrates a seven-step retrieval process that fuses these three signals into unified result rankings.

Parallel Retrieval Across Indices

The pipeline initiates three concurrent retrieval operations:

  • BM25 scoring retrieves observation IDs with lexical relevance scores
  • Vector scoring computes embedding similarities when a vector index exists
  • Graph retrieval fetches entity-linked observations based on extracted query entities
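The three retrievals above can be sketched with Promise.all. This is a simplified illustration, not the HybridSearch code: the Retriever type, the retrieveAll name, and the limits passed to the vector and graph retrievers are assumptions (only the limit * 2 over-fetch for BM25 comes from the description above).

```typescript
// Illustrative concurrent retrieval; not the repository's HybridSearch.
type Hit = { id: string; score: number };
type Retriever = (query: string, limit: number) => Promise<Hit[]>;

async function retrieveAll(
  query: string,
  limit: number,
  bm25: Retriever,
  vector: Retriever | null, // null models a missing vector index
  graph: Retriever | null, // null models a missing graph
) {
  const empty = Promise.resolve<Hit[]>([]);
  // All three lookups start before any of them is awaited.
  const [bm25Hits, vectorHits, graphHits] = await Promise.all([
    bm25(query, limit * 2),
    vector ? vector(query, limit) : empty, // assumption: vector limit
    graph ? graph(query, limit) : empty, // assumption: graph limit
  ]);
  return { bm25Hits, vectorHits, graphHits };
}
```

A missing retriever simply contributes an empty list, which is what lets the later weighting step drop that signal.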

Graph Expansion from Vector Neighborhoods

The system enhances graph coverage by expanding from semantic results. The top five vector results feed into graphRetrieval.expandFromChunks(), discovering additional graph hits that neighbor the high-similarity vector observations, bridging semantic and relational relevance.
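A rough sketch of this expansion follows. It does not reproduce expandFromChunks(): the adjacency map, the expandFromTopVectorHits name, and the choice to give each neighbor its seed's score are assumptions. Only the top-five cutoff comes from the description above.

```typescript
// Illustrative expansion step; not the repository's expandFromChunks().
type ScoredId = { id: string; score: number };

function expandFromTopVectorHits(
  vectorHits: ScoredId[], // assumed sorted by descending similarity
  neighbors: Map<string, string[]>, // observation ID -> graph-adjacent observation IDs
  graphHits: ScoredId[],
  topN = 5,
): ScoredId[] {
  const seen = new Set(graphHits.map((h) => h.id));
  const expanded = [...graphHits];
  for (const seed of vectorHits.slice(0, topN)) {
    for (const id of neighbors.get(seed.id) ?? []) {
      if (seen.has(id)) continue; // already a graph hit
      seen.add(id);
      // Assumption: a neighbor inherits the seed's similarity as its score.
      expanded.push({ id, score: seed.score });
    }
  }
  return expanded;
}
```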

Reciprocal Rank Fusion Scoring

All retrieval signals merge through Reciprocal Rank Fusion (RRF). For each index, a result's rank is converted to a score of 1 / (RRF_K + rank). The combined score is the weighted sum of these per-index scores, with the signal weights normalized to sum to 1, so the balance between lexical precision, semantic similarity, and relational context is configurable.
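A minimal weighted RRF sketch is shown below. The value RRF_K = 60, the 1-based rank, the example weights, and the rrfFuse name are assumptions; the repository's constants and data shapes may differ. The weights are assumed to already sum to 1.

```typescript
// Illustrative weighted Reciprocal Rank Fusion; not the repository's code.
const RRF_K = 60; // assumption: a commonly used default

function rrfFuse(
  rankings: Record<string, string[]>, // signal name -> IDs ordered best-first
  weights: Record<string, number>, // signal name -> weight (assumed to sum to 1)
) {
  const fused = new Map<string, number>();
  for (const signal of Object.keys(rankings)) {
    const w = weights[signal] ?? 0;
    (rankings[signal] ?? []).forEach((id, index) => {
      const rank = index + 1; // assumption: ranks start at 1
      fused.set(id, (fused.get(id) ?? 0) + w / (RRF_K + rank));
    });
  }
  return Array.from(fused, ([id, score]) => ({ id, score })).sort(
    (x, y) => y.score - x.score,
  );
}

const fusedExample = rrfFuse(
  { bm25: ["a", "b", "c"], vector: ["b", "a"], graph: ["b"] },
  { bm25: 0.4, vector: 0.4, graph: 0.2 },
);
console.log(fusedExample.map((r) => r.id)); // ["b", "a", "c"]
```

Because only ranks enter the formula, BM25 scores and cosine similarities never need to be put on a common scale. An ID missing from one list simply gets no contribution from that signal.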

Diversification and Enrichment

Before final output, results undergo diversifyBySession() to prevent single-session dominance, then enrichResults() fetches full observation data from the KV store. If RERANK_ENABLED is active, the system passes top-N results through an LLM-based rerank() function for final relevance refinement.
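Session diversification can be approximated with a per-session cap. The sketch below is hypothetical: the actual diversifyBySession() logic is not shown in this article, and the maxPerSession parameter, the backfill behavior, and the Ranked type are assumptions.

```typescript
// Illustrative per-session cap; not the repository's diversifyBySession().
type Ranked = { id: string; sessionId: string; score: number };

function diversifyBySessionSketch(
  results: Ranked[], // assumed sorted by descending combined score
  limit: number,
  maxPerSession = 3, // assumption: hypothetical cap
): Ranked[] {
  const perSession = new Map<string, number>();
  const kept: Ranked[] = [];
  const deferred: Ranked[] = [];
  for (const r of results) {
    const count = perSession.get(r.sessionId) ?? 0;
    if (count < maxPerSession) {
      perSession.set(r.sessionId, count + 1);
      kept.push(r);
    } else {
      deferred.push(r); // over the cap: only used as backfill
    }
  }
  // Backfill with capped results if diversification left fewer than `limit`.
  return kept.concat(deferred).slice(0, limit);
}
```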

Dynamic Signal Weighting and Graceful Degradation

Agentmemory's hybrid search adjusts signal weights to match which indices actually return data. When the vector index is empty, its weight collapses to 0 and the pipeline falls back to pure BM25 retrieval, as verified in test/hybrid-search.test.ts. Likewise, when graph retrieval returns no results, the graph signal receives zero weight, so the pipeline keeps working with whichever signals are available and uses all three when every index is populated.
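The weight collapse and renormalization can be sketched as follows. The base weights, the Weights type, and the effectiveWeights name are assumptions for illustration; the repository's defaults are not stated in this article.

```typescript
// Illustrative weight renormalization; not the repository's code.
type Weights = { bm25: number; vector: number; graph: number };

function effectiveWeights(
  base: Weights,
  available: { vector: boolean; graph: boolean },
): Weights {
  // Unavailable signals collapse to zero weight.
  const raw: Weights = {
    bm25: base.bm25,
    vector: available.vector ? base.vector : 0,
    graph: available.graph ? base.graph : 0,
  };
  const total = raw.bm25 + raw.vector + raw.graph;
  // Degenerate case: fall back to BM25 alone.
  if (total === 0) return { bm25: 1, vector: 0, graph: 0 };
  // Renormalize so the remaining weights sum to 1.
  return { bm25: raw.bm25 / total, vector: raw.vector / total, graph: raw.graph / total };
}

const baseWeights: Weights = { bm25: 0.4, vector: 0.4, graph: 0.2 }; // assumed values
console.log(effectiveWeights(baseWeights, { vector: false, graph: false }));
// { bm25: 1, vector: 0, graph: 0 }
```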

Implementation Example

import { SearchIndex } from "./src/state/search-index.js";
import { VectorIndex } from "./src/state/vector-index.js";
import { HybridSearch } from "./src/state/hybrid-search.js";
import type { EmbeddingProvider } from "./src/types.js";

// 1️⃣ Build the BM25 index
const bm25 = new SearchIndex();
// …add observations (obs) via bm25.add(obs)

// 2️⃣ Build the vector index (optional)
const vector = new VectorIndex();
// Example embedding provider that returns random vectors
const mockProvider: EmbeddingProvider = {
  embed: async (text: string) => {
    const dim = 384;
    return Float32Array.from({ length: dim }, () => Math.random());
  },
};
// Add vectors for each observation
// vector.add(obs.id, obs.sessionId, await mockProvider.embed(obs.title));

// 3️⃣ Create the hybrid searcher
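// Supply your own StateKV implementation; `declare` keeps this snippet type-checkable.
declare const kv: ConstructorParameters<typeof HybridSearch>[3];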
const hybrid = new HybridSearch(bm25, vector, mockProvider, kv);

// 4️⃣ Perform a search
const results = await hybrid.search("auth middleware");

// 5️⃣ Inspect results
for (const r of results) {
  console.log(
    `Obs ${r.observation.id} – BM25:${r.bm25Score.toFixed(2)} ` +
    `Vector:${r.vectorScore.toFixed(2)} Graph:${r.graphScore.toFixed(2)} ` +
    `Combined:${r.combinedScore.toFixed(2)}`,
  );
}

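Key Source Files and Architecture

  • src/state/hybrid-search.ts: the HybridSearch class that orchestrates retrieval, RRF scoring, diversification, and enrichment
  • src/state/search-index.ts: the SearchIndex class providing BM25 lexical retrieval
  • src/state/vector-index.ts: the VectorIndex class providing cosine-similarity search over embeddings
  • src/functions/graph-retrieval.ts: the GraphRetrieval module for entity-based retrieval and graph expansion
  • test/hybrid-search.test.ts: tests covering the BM25-only fallback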

Summary

  • Agentmemory hybrid search unifies BM25 lexical matching, vector embeddings, and knowledge graph traversal in src/state/hybrid-search.ts
  • Reciprocal Rank Fusion (RRF) combines retrieval signals using rank-based scoring with configurable weights that normalize to 1
  • Dynamic weighting enables graceful degradation to BM25-only or dual-mode retrieval when vector or graph indices are unavailable
  • Graph expansion from top vector results bridges semantic similarity and relational entity data
  • The pipeline includes session diversification, KV store enrichment via enrichResults(), and optional LLM-based re-ranking

Frequently Asked Questions

How does agentmemory handle missing vector or graph indices?

When vector or graph indices are empty or unavailable, the hybrid search automatically assigns zero weight to those signals and recalculates the normalization so remaining signals sum to 1. This allows the system to fall back to pure BM25 retrieval without code changes, as demonstrated in the test suite's "returns BM25-only results when no vector index is provided" case.

What is Reciprocal Rank Fusion (RRF) and why does agentmemory use it?

RRF is a rank aggregation method that converts each result's position in individual search results (BM25, vector, graph) into a score using the formula 1 / (RRF_K + rank), then sums these scores across indices. Agentmemory uses RRF because it requires no score calibration between heterogeneous indices, handles missing results gracefully, and allows configurable weighting while avoiding complex normalization of disparate score types.

How does the graph expansion step improve search results?

After identifying the top five vector results, the system calls graphRetrieval.expandFromChunks() to traverse the knowledge graph from those semantically similar entries. This discovers related observations that share entities or relationships with the vector hits but might not match the original query lexically or semantically, capturing contextual relationships that pure embedding similarity or keyword matching would miss.

Can I disable specific search signals in agentmemory?

While the implementation dynamically collapses weights for empty indices to zero, explicit signal disabling occurs by omitting the relevant index during HybridSearch instantiation. For example, passing null or an empty VectorIndex effectively disables vector search, forcing the RRF calculation to rely solely on BM25 and graph signals, or BM25 alone if the graph also yields no results.
