How Local Search Works in the GraphRAG Agent: Neo4j Vector Search and LLM Integration

Local search in the GraphRAG Agent retrieves context directly from a Neo4j knowledge graph using vector similarity and Cypher queries, then synthesizes answers through a LangChain LLM pipeline.

The GraphRAG Agent's local search component provides precise, graph-grounded answers by combining Neo4j vector indexes with community-weighted Cypher retrieval. As implemented in the 1517005260/graph-rag-agent repository, this module enables agents to fetch relevant chunks, community summaries, and relationships before generating natural language responses.

Local Search Architecture and Core Components

The local search engine operates as a bridge between the Neo4j knowledge graph and large language models. According to the source code in graphrag_agent/search/local_search.py, the implementation follows a retrieval-augmented generation (RAG) pattern optimized for graph structures.

The component combines three critical capabilities:

  • Vector similarity search via LangChain's Neo4jVector to identify relevant text chunks
  • Community-aware graph traversal using weighted Cypher queries to extract relationships and summaries
  • LLM synthesis through prompt templates that combine retrieved context with user queries

Key configuration values, including top_chunks, top_communities, and the vector index name, are loaded from graphrag_agent/config/settings.py during initialization.
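As an illustration, the settings block might be shaped like the dictionary below. The three key names match those read by the constructor; the values and the load_local_search_settings helper are assumptions made for this sketch, not the repository's defaults or API.

```python
# Hypothetical shape of LOCAL_SEARCH_SETTINGS; the values are placeholders.
LOCAL_SEARCH_SETTINGS = {
    "index_name": "vector",   # assumed name of the Neo4j vector index
    "top_chunks": 3,          # assumed number of chunks to keep per query
    "top_communities": 3,     # assumed number of communities to keep per query
}

REQUIRED_KEYS = ("index_name", "top_chunks", "top_communities")


def load_local_search_settings(settings: dict) -> tuple:
    """Return (index_name, top_chunks, top_communities), failing early on bad input."""
    missing = [key for key in REQUIRED_KEYS if key not in settings]
    if missing:
        raise KeyError(f"missing local search settings: {missing}")
    for key in ("top_chunks", "top_communities"):
        value = settings[key]
        if not isinstance(value, int) or value < 1:
            raise ValueError(f"{key} must be a positive integer, got {value!r}")
    return settings["index_name"], settings["top_chunks"], settings["top_communities"]


index_name, top_chunks, top_communities = load_local_search_settings(LOCAL_SEARCH_SETTINGS)
```

Validating the limits up front is a design choice of the sketch: a zero or negative limit would otherwise only surface as an empty or failing Cypher query later.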

Initialization and Configuration

When instantiating the LocalSearch class, the __init__ method (lines 22‑35) establishes the core dependencies required for graph-based retrieval.

def __init__(self, llm, embeddings):
    self.llm = llm
    self.embeddings = embeddings
    self.driver = GraphDatabase.driver(
        settings.NEO4J_URI,
        auth=(settings.NEO4J_USER, settings.NEO4J_PASSWORD)
    )
    self.index_name = settings.LOCAL_SEARCH_SETTINGS["index_name"]
    self.top_chunks = settings.LOCAL_SEARCH_SETTINGS["top_chunks"]
    self.top_communities = settings.LOCAL_SEARCH_SETTINGS["top_communities"]
    self._init_community_weights()

The constructor stores the LLM and embedding model instances, initializes a Neo4j driver using connection parameters from the settings module, and caches runtime parameters that control retrieval depth. Immediately after setup, it triggers community weight pre-processing to optimize subsequent queries.

Community Weight Pre-Processing

Before handling any queries, the _init_community_weights method (lines 60‑68) executes a Cypher query that calculates the significance of each community node based on chunk references.

def _init_community_weights(self):
    query = """
    MATCH (c:__Chunk__)-[:MENTIONS]->(com:__Community__)
    WITH com, count(c) as weight
    SET com.weight = weight
    """
    with self.driver.session() as session:
        session.run(query)

This preprocessing step writes a weight property to every __Community__ node that at least one chunk mentions, set to the number of mentioning chunks. Communities with no incoming MENTIONS relationship are not matched by the pattern, so the query leaves them without an updated weight. The system uses these weights during retrieval to prioritize communities with richer contextual coverage, so that densely referenced knowledge clusters surface first in the results.
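The aggregation can be mimicked in plain Python to see what the weights look like. This is a minimal sketch that runs without a database; the chunk and community identifiers are invented for the example.

```python
from collections import Counter

# Hypothetical (chunk_id, community_id) pairs standing in for
# (:__Chunk__)-[:MENTIONS]->(:__Community__) relationships.
mentions = [
    ("chunk-1", "community-A"),
    ("chunk-2", "community-A"),
    ("chunk-3", "community-A"),
    ("chunk-2", "community-B"),
]


def community_weights(edges):
    """Count matched rows per community, like count(c) grouped by com."""
    return Counter(community for _chunk, community in edges)


weights = community_weights(mentions)

# Rank communities by weight, heaviest first, as a retrieval step might.
ranked = [community for community, _weight in weights.most_common()]
```

A community that no chunk mentions never appears in weights, just as the MATCH pattern would not bind it.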

Vector Retrieval and Query Construction

The local search implementation leverages LangChain's Neo4jVector to interface with existing vector indexes created during graph ingestion. The vector_store initialization occurs at lines 156‑159:

self.vector_store = Neo4jVector.from_existing_index(
    self.embeddings,
    url=settings.NEO4J_URI,
    username=settings.NEO4J_USER,
    password=settings.NEO4J_PASSWORD,
    index_name=self.index_name
)

This creates a wrapper around the Neo4j vector index that enables fast similarity searches against __Chunk__ node embeddings. The component also defines a Cypher template via the retrieval_query property (lines 86‑94), which contains parameterized placeholders like $topChunks and $topCommunities that get substituted with configuration values at query time.
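One way to picture the placeholder handling is a small helper that collects the $-prefixed names in a template and builds the parameter dictionary to send with the query. The Cypher fragment, the text and summary property names, and the bind_parameters helper are all hypothetical; the repository's actual retrieval_query is not reproduced here.

```python
import re

# Hypothetical fragment for illustration only; property names are assumptions.
RETRIEVAL_QUERY_SKETCH = """
WITH node AS chunk, score
ORDER BY score DESC LIMIT $topChunks
OPTIONAL MATCH (chunk)-[:MENTIONS]->(com:__Community__)
WITH chunk, score, com ORDER BY com.weight DESC
WITH chunk, score, collect(com.summary)[..$topCommunities] AS summaries
RETURN chunk.text AS text, score, {summaries: summaries} AS metadata
"""


def bind_parameters(query: str, values: dict) -> dict:
    """Return a params dict covering every $placeholder found in the query."""
    placeholders = set(re.findall(r"\$(\w+)", query))
    missing = placeholders - set(values)
    if missing:
        raise KeyError(f"no value supplied for: {sorted(missing)}")
    return {name: values[name] for name in placeholders}


params = bind_parameters(
    RETRIEVAL_QUERY_SKETCH,
    {"topChunks": 3, "topCommunities": 2, "unused": "ignored"},
)
```

Passing the limits as query parameters instead of formatting them into the string keeps the query text constant, which lets Neo4j reuse its cached plan and avoids injection through configuration values.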

For integration with standard LangChain chains, the as_retriever method (lines 140‑150) exposes a retriever interface that returns documents directly without LLM synthesis, useful for downstream agents that need raw context rather than generated answers.

The Local Search Execution Flow

The search method (lines 170‑190) orchestrates the complete retrieval and generation pipeline through five distinct phases:

  1. Cypher preparation – Injects top‑k limits from settings into the retrieval query template
  2. Vector similarity – Calls Neo4jVector to fetch the most similar chunk IDs for the input query
  3. Graph traversal – Executes the assembled Cypher query to retrieve chunk texts, community summaries, and relationship snippets connecting the chunks
  4. Context formatting – Populates LOCAL_SEARCH_CONTEXT_PROMPT from graphrag_agent/config/prompts.py with the retrieved graph context
  5. LLM generation – Processes the formatted prompt through self.llm and returns the natural language response via StrOutputParser()

The method constructs a LangChain Expression Language (LCEL) chain, prompt | self.llm | StrOutputParser(), which pipes the formatted prompt into the model and reduces the model's message to a plain string.
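The shape of this flow can be sketched without LangChain or Neo4j installed. The Step class below is a tiny stand-in that imitates pipe composition; it is not LangChain's Runnable. The prompt text, the stub model, and the retrieve callable are likewise assumptions for the example.

```python
class Step:
    """Minimal stand-in for a pipeable unit: wraps a function and supports `|`."""

    def __init__(self, fn):
        self.fn = fn

    def __or__(self, other):
        # Compose left to right: the output of self feeds the input of other.
        return Step(lambda value: other.fn(self.fn(value)))

    def invoke(self, value):
        return self.fn(value)


PROMPT_TEMPLATE = "Context:\n{context}\n\nQuestion: {input}"  # placeholder prompt

prompt = Step(lambda inputs: PROMPT_TEMPLATE.format(**inputs))
stub_llm = Step(lambda text: {"content": f"stub answer ({len(text)} prompt chars)"})
to_string = Step(lambda message: message["content"])

chain = prompt | stub_llm | to_string


def search(query, retrieve):
    """Retrieve context (phases 1-3), format it (phase 4), generate (phase 5)."""
    documents = retrieve(query)
    context = "\n".join(documents)
    return chain.invoke({"context": context, "input": query})


answer = search(
    "Who maintains the cluster?",
    lambda query: ["chunk text", "community summary", "relationship snippet"],
)
```

Keeping retrieval outside the chain, as here, means the same prompt and parser can be reused with any source of context.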

Integration with Agents and Tools

The LocalSearchTool class in graphrag_agent/search/tool/local_search_tool.py wraps the core functionality as a LangChain-compatible tool, registering it under the name "local_search" in tool_registry.py. This registration allows multi-agent executors in retrieval_executor.py to dispatch search tasks dynamically.

Higher-level agents such as GraphAgent (graphrag_agent/agents/graph_agent.py) and HybridAgent (graphrag_agent/agents/hybrid_agent.py) invoke self.local_tool.search(query) when requiring precise, graph-grounded answers rather than generic LLM knowledge. The tool handles driver lifecycle management automatically, though manual instantiation requires explicit cleanup via the close method (lines 220‑228) or context manager protocols (__enter__/__exit__).
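The cleanup contract can be illustrated with stand-in classes. FakeDriver and SearchResource below are hypothetical and only mirror the close and __enter__/__exit__ pattern described above; they are not the repository's classes.

```python
class FakeDriver:
    """Stand-in for a database driver that records whether it was closed."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class SearchResource:
    """Stand-in for a search object that owns a driver."""

    def __init__(self, driver):
        self.driver = driver

    def close(self):
        # Safe to call more than once: the driver is released only the first time.
        if self.driver is not None:
            self.driver.close()
            self.driver = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, traceback):
        self.close()
        return False  # do not swallow exceptions raised inside the with-block


driver = FakeDriver()
try:
    with SearchResource(driver) as resource:
        raise RuntimeError("simulated failure mid-search")
except RuntimeError:
    pass

print(driver.closed)  # True: the driver was released despite the exception
```

The with-block form is the safer default because the driver is closed even when a query raises.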

Practical Implementation Examples

Direct LocalSearch Usage

For standalone applications, instantiate LocalSearch directly and invoke the search pipeline:

from graphrag_agent.search.local_search import LocalSearch
from graphrag_agent.llm import get_llm
from graphrag_agent.embeddings import get_embeddings

llm = get_llm()
embeddings = get_embeddings()
local = LocalSearch(llm, embeddings)

answer = local.search("What are the responsibilities of a Neo4j administrator?")
print(answer)
local.close()  # Always close the driver to release connections

Using the Registered Tool Interface

When working within the agent framework, use the tool wrapper for automatic registry integration:

from graphrag_agent.search.tool.local_search_tool import LocalSearchTool

local_tool = LocalSearchTool()
result = local_tool.search("Explain the relationship between entities A and B.")
print(result)
local_tool.close()

Exposing a LangChain Retriever

To use local search as a document retriever in other chains:

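local = LocalSearch(llm, embeddings)  # llm and embeddings created as in the first example
retriever = local.as_retriever()
# Newer LangChain releases prefer retriever.invoke(query) over get_relevant_documents
docs = retriever.get_relevant_documents("graph traversal algorithms")

for doc in docs:
    print(doc.page_content)
local.close()  # release the Neo4j driver once the documents are fetched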

Summary

  • Local search combines Neo4j vector similarity with community-weighted Cypher queries to retrieve precise context from the knowledge graph.
  • The _init_community_weights method preprocesses community nodes to prioritize densely referenced knowledge clusters during retrieval.
  • LangChain's Neo4jVector provides the vector store interface, while the search method orchestrates retrieval, context assembly, and LLM generation.
  • Configuration parameters including top_chunks and top_communities are managed through LOCAL_SEARCH_SETTINGS in the settings module.
  • The component exposes both a high-level search interface for end-to-end generation and an as_retriever method for raw document retrieval.
  • Resource cleanup is handled through close or context manager protocols to prevent Neo4j driver connection leaks.

Frequently Asked Questions

What is the difference between local search and global search in GraphRAG?

Local search retrieves specific chunks, communities, and relationships directly connected to the query vector, making it ideal for precise, fact-based questions. Global search, by contrast, typically operates across the entire graph structure or uses community summaries at higher hierarchical levels, better suited for broad thematic questions requiring synthesis across disconnected graph regions.

How does community weighting improve search results?

The _init_community_weights method calculates how many chunks reference each community node, storing this as a weight property. During retrieval, the Cypher query uses these weights to prioritize communities with richer contextual coverage, ensuring that answers draw from the most substantiated knowledge clusters rather than sparsely connected peripheral nodes.

Can I use LocalSearch without the full agent framework?

Yes. The LocalSearch class functions independently of the agent orchestration layer. You can instantiate it directly with an LLM and embeddings, call search() for generated answers, or use as_retriever() to obtain raw documents for custom processing. Just ensure you call close() or use the context manager to release Neo4j connections.

What Neo4j version is required for the vector index functionality?

The implementation requires a Neo4j 5.x release with native vector index support (vector indexes were introduced in Neo4j 5.11), as it relies on Neo4jVector.from_existing_index() to connect to vector indexes created during the graph ingestion phase. The GDS (Graph Data Science) library is not needed for the vector index itself. The vector dimension of the index must match the output dimension of the embedding model configured in your settings.
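A dimension mismatch is easy to catch before the first query. The sketch below assumes an embeddings object exposing embed_query(text), as LangChain embedding classes do; FakeEmbeddings, the check_dimension helper, and the dimension of 8 are invented for the example.

```python
class FakeEmbeddings:
    """Stand-in embedding model that returns fixed-length vectors."""

    def __init__(self, dimension):
        self.dimension = dimension

    def embed_query(self, text):
        return [0.0] * self.dimension


def check_dimension(embeddings, index_dimension):
    """Embed a probe string and compare its length with the index dimension."""
    actual = len(embeddings.embed_query("dimension probe"))
    if actual != index_dimension:
        raise ValueError(
            f"embedding model yields {actual} dimensions, "
            f"but the index expects {index_dimension}"
        )
    return actual


dimension = check_dimension(FakeEmbeddings(8), index_dimension=8)
```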
