memory_recall vs memory_smart_search in AgentMemory: Retrieval Strategy Comparison

memory_recall performs pure BM25 keyword searches for exact matches while memory_smart_search executes hybrid retrieval combining vector embeddings, graph entities, and BM25 scoring for semantic discovery.

AgentMemory exposes two distinct MCP tools for retrieving stored observations and memories. Both memory_recall and memory_smart_search query the underlying memory store, but according to the rohitg00/agentmemory source code they implement fundamentally different retrieval strategies. Understanding these differences helps you select the right tool for precise lookups versus exploratory semantic searches.

What is memory_recall?

memory_recall is the keyword-only retrieval tool that wraps the mem::search function. It relies exclusively on BM25 (Best Match 25) term-frequency scoring to scan observations and memories through a classic inverted index.
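To make the scoring concrete, here is a minimal sketch of the general Okapi BM25 formula in Python. It illustrates the technique only and is not the code in src/functions/search.ts; the whitespace tokenizer, the k1 and b values, and the sample documents are all assumptions.

```python
import math
from collections import Counter

def tokenize(text):
    # Naive whitespace tokenizer (assumption); a real index may normalize differently.
    return text.lower().split()

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document against the query with Okapi BM25."""
    if not docs:
        return []
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: how many documents contain each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue  # documents lacking the keyword gain nothing
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Hypothetical observations, for illustration only.
docs = [
    "modified process_data.py to handle empty rows",
    "fixed login redirect after token refresh",
    "added unit tests for process_data.py parser",
]
scores = bm25_scores("process_data.py modified", docs)
```

The second document shares no term with the query, so it scores zero no matter how relevant it might be conceptually. That is the trade-off of keyword-only retrieval.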

In src/functions/search.ts, the registerSearchFunction implements this as a pure text search with no semantic understanding. The tool accepts a query string and returns results in one of three formats: full (complete observation payload), compact (id plus metadata), or narrative (text-only content). A distinctive feature is the token_budget parameter, which truncates results automatically to fit within a given token limit. This matters for context window management in LLM applications.

What is memory_smart_search?

memory_smart_search exposes the mem::smart-search function defined in src/functions/smart-search.ts, implementing a multi-stream hybrid search pipeline. Rather than relying solely on keywords, this tool orchestrates three concurrent retrieval streams: BM25 keyword scoring, vector-embedding similarity, and graph-entity lookup.

The HybridSearch implementation in src/state/hybrid-search.ts merges these streams using RRF (Reciprocal Rank Fusion), then optionally applies LLM reranking when rerankEnabled is active. Additionally, memory_smart_search supports query expansion through searchWithExpansion, which reformulates queries and adds temporal concretizations before searching.
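The merge step can be sketched with the standard Reciprocal Rank Fusion formula, where each result earns 1 / (k + rank) from every list it appears in. This is a generic illustration rather than the HybridSearch code itself; the constant k = 60, the equal weighting of streams, and the observation IDs are assumptions.

```python
from collections import defaultdict

def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked ID lists; each list is ordered best-first."""
    fused = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the order is deterministic.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical per-stream rankings, for illustration only.
bm25_hits = ["obs_3", "obs_1", "obs_7"]
vector_hits = ["obs_1", "obs_9", "obs_3"]
graph_hits = ["obs_1", "obs_4"]

fused = reciprocal_rank_fusion([bm25_hits, vector_hits, graph_hits])
fused_ids = [doc_id for doc_id, _ in fused]
```

Because RRF works on rank positions instead of raw scores, the streams need no score normalization. An item that appears near the top of several lists, such as obs_1 here, rises above an item that tops only one.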

Key Technical Differences

Search Algorithm

  • memory_recall: Pure BM25 scoring based on term frequency and inverse document frequency. The query is used as-is with no expansion or semantic interpretation.

  • memory_smart_search: Hybrid scoring via RRF, which fuses the ranked lists produced by BM25, vector similarity, and graph-entity retrieval. Supports optional LLM-based reranking for result refinement.

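Data Sources

  • memory_recall: Queries only the BM25 inverted index built over stored observations and memories.

  • memory_smart_search: Queries a BM25 index alongside the vector embedding index and graph entities.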

Result Formats

  • memory_recall: Configurable output via format parameter—choose between full, compact, or narrative representations.

  • memory_smart_search: Always returns compact metadata (obsId, sessionId, title, type, score, timestamp). Full payloads require explicit expandIds parameters to fetch specific observations.

Token Management

  • memory_recall: Supports token_budget parameter for automatic result truncation to fit context windows.

  • memory_smart_search: No token budget trimming—results are strictly limited by the limit parameter (default 10).
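The idea behind token_budget can be sketched as a greedy trim over ranked results. The source does not state how AgentMemory counts tokens or where it cuts, so the four-characters-per-token estimate and the stop-at-first-overflow rule below are assumptions, and both function names are hypothetical.

```python
def estimate_tokens(text):
    # Rough heuristic (assumption): about 4 characters per token.
    return max(1, len(text) // 4)

def trim_to_budget(results, token_budget):
    """Keep results in rank order until the next one would exceed the budget."""
    kept, used = [], 0
    for text in results:
        cost = estimate_tokens(text)
        if used + cost > token_budget:
            break
        kept.append(text)
        used += cost
    return kept, used

# Three hypothetical results costing roughly 10, 20, and 100 tokens.
results = ["a" * 40, "b" * 80, "c" * 400]
kept, used = trim_to_budget(results, token_budget=50)
```

With a budget of 50, the first two results fit and the third is dropped. A limit-only tool would return all three regardless of their size.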

When to Use Each Tool

Use memory_recall when you need:

  • Exact file name lookups ("find process_data.py")
  • Specific metadata retrieval ("show last decision on authentication feature")
  • Fast queries without embedding model overhead
  • Results guaranteed to contain specific keywords

Use memory_smart_search when you need:

  • Conceptual discovery ("how did we solve login issues?")
  • Related content without exact keyword matches
  • Exploratory research across semantically similar topics
  • Multi-faceted results combining code, documentation, and entity relationships

Implementation Details and Source Files

The tool definitions are registered in src/mcp/tools-registry.ts:

  • memory_recall appears in CORE_TOOLS at lines 13-16
  • memory_smart_search appears in V040_TOOLS at lines 114-116

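Core implementations reside in:

  • src/functions/search.ts: mem::search, the BM25 function behind memory_recall
  • src/functions/smart-search.ts: mem::smart-search, the function behind memory_smart_search
  • src/state/hybrid-search.ts: HybridSearch, which merges the retrieval streams with RRF
  • src/state/vector-index.ts: the vector index used for embedding-based retrieval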

Practical Code Examples

Using memory_recall for Exact Matches

curl -X POST http://localhost:3000/agentmemory/search \
  -H "Content-Type: application/json" \
  -d '{
        "query": "process_data.py file modified",
        "limit": 5,
        "format": "compact"
      }'

MCP tool invocation:

{
  "tool": "memory_recall",
  "args": {
    "query": "process_data.py file modified",
    "limit": 5,
    "format": "compact"
  }
}
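
The same search request can be built from Python with only the standard library. This is a sketch that mirrors the curl call above: the host, port, and path are copied from it, build_recall_request is a hypothetical helper, and passing token_budget in the REST body is an assumption (the source describes it as a tool parameter).

```python
import json
import urllib.request

def build_recall_request(query, limit=5, fmt="compact", token_budget=None,
                         base_url="http://localhost:3000"):
    """Build (but do not send) a POST request for the search endpoint."""
    payload = {"query": query, "limit": limit, "format": fmt}
    if token_budget is not None:
        # Assumption: the endpoint accepts the same token_budget field as the tool.
        payload["token_budget"] = token_budget
    return urllib.request.Request(
        f"{base_url}/agentmemory/search",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_recall_request("process_data.py file modified", token_budget=2000)

# To send it against a running server:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```

Building the request separately from sending it makes the payload easy to inspect or log before any network call happens.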

Using memory_smart_search for Semantic Discovery

curl -X POST http://localhost:3000/agentmemory/smart-search \
  -H "Content-Type: application/json" \
  -d '{
        "query": "how did we fix the login bug",
        "limit": 8
      }'

MCP tool invocation:

{
  "tool": "memory_smart_search",
  "args": {
    "query": "how did we fix the login bug",
    "limit": 8
  }
}
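
Since memory_smart_search returns compact metadata first, a common follow-up is to pick the most promising results and request their full payloads through expandIds. The sketch below shows only the selection step. The sample results, the pick_expand_ids helper, the assumption that expandIds takes a list of obsId strings, and the choice to resend the query alongside it are all hypothetical.

```python
def pick_expand_ids(compact_results, max_ids=2):
    """Return the obsIds of the highest-scoring compact results."""
    ranked = sorted(compact_results, key=lambda r: r["score"], reverse=True)
    return [r["obsId"] for r in ranked[:max_ids]]

# Hypothetical compact results using the documented metadata fields.
compact = [
    {"obsId": "obs_40", "title": "Session cookie notes", "score": 0.62},
    {"obsId": "obs_12", "title": "Login redirect fix", "score": 0.91},
    {"obsId": "obs_07", "title": "CSS cleanup", "score": 0.18},
]

# Arguments for a second memory_smart_search call (shape is an assumption).
follow_up_args = {
    "query": "how did we fix the login bug",
    "expandIds": pick_expand_ids(compact),
}
```

Expanding only a few IDs keeps the first response small and defers the cost of full payloads until the relevant observations are known.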

Summary

  • memory_recall provides fast, deterministic BM25 keyword search with optional token budget management and flexible output formats.
  • memory_smart_search delivers semantic discovery through hybrid retrieval (BM25 + vector + graph) with RRF scoring and optional query expansion.
  • memory_recall requires no external services; memory_smart_search needs configured embedding providers for full functionality.
  • memory_recall is optimal for exact matches; memory_smart_search excels at finding conceptually related content.

Frequently Asked Questions

Can I use memory_smart_search without an embedding provider?

Yes, but with degraded functionality. According to src/functions/smart-search.ts, the hybrid pipeline falls back to BM25-only retrieval when vector embeddings are unavailable, though you lose the semantic similarity component and RRF benefits. For full hybrid capabilities, configure an embedding provider in src/state/vector-index.ts.

Why does memory_recall support token_budget but memory_smart_search does not?

The token_budget parameter in src/functions/search.ts provides explicit context window management for keyword results, which tend to be voluminous. The memory_smart_search implementation prioritizes result relevance over volume, limiting output strictly via the limit parameter and assuming downstream context management happens after the initial compact result retrieval.

How do I get full observation payloads from memory_smart_search?

Unlike memory_recall, which offers a format parameter, memory_smart_search always returns compact metadata first. To retrieve full payloads, pass the expandIds parameter with the specific observation IDs you want expanded, or make secondary calls to fetch complete observation data after identifying relevant results in the compact list.

Which tool performs better for large codebases?

memory_recall offers lower latency for large codebases since it avoids embedding computation and graph traversal. However, memory_smart_search provides superior recall for complex queries where keywords might not match exactly but concepts remain similar. For production systems, use memory_recall for targeted lookups and memory_smart_search for investigative debugging sessions.
