
How Deep Research Search Mode Functions in GraphRAG Agent

February 23, 2026 | 1517005260/graph-rag-agent

The deep research search mode implements an iterative think-search-reason loop that combines vector knowledge base retrieval with Neo4j graph traversal to answer complex queries through multi-step reasoning.

The deep research search mode is the reasoning core of the DeepResearchAgent class in the 1517005260/graph-rag-agent repository. It breaks complex research questions into iterative sub-queries, retrieves evidence from both structured graph data and unstructured text corpora, and synthesizes an answer while a thinking engine keeps track of state across iterations.

Architecture Overview

The deep research mode is built on a modular tool-registry architecture with distinct components handling orchestration, retrieval, and reasoning:

| Component | Role | Key Implementation |
| --- | --- | --- |
| DeepResearchAgent | Orchestrates the workflow and exposes public APIs (`ask`, `ask_stream`) | `graphrag_agent/agents/deep_research_agent.py` |
| DeepResearchTool | Performs thinking and search steps with hybrid KB/graph retrieval | `graphrag_agent/search/tool/deep_research_tool.py` |
| DeeperResearchTool | Enhanced wrapper adding exploration and streaming capabilities | `graphrag_agent/search/tool/deeper_research_tool.py` |
| ThinkingEngine | Maintains conversational state, stores executed queries and reasoning steps | `graphrag_agent/search/tool/reasoning/thinking.py` |
| DualPathSearcher | Executes parallel retrieval from vector store and Neo4j graph | `graphrag_agent/search/tool/reasoning/search.py` |
| QueryGenerator | Creates sub-queries and follow-up queries from LLM output | Instantiated in `DeepResearchTool.__init__` |
| AnswerValidator | Checks whether generated answers satisfy the original question | Attached in `DeepResearchTool.__init__` |

Execution Flow

The deep research mode operates through a structured iterative loop defined in DeepResearchTool.thinking():

1. Initialization – The ThinkingEngine receives the user query and generates an initial reasoning prompt (initial_thinking at line 469).
2. Iterative Loop – While iteration < max_iterations (line 75), the system executes:
   - Query Generation – QueryGenerator produces sub-queries based on current reasoning state.
   - Dual-Path Search – DualPathSearcher.search() executes _kb_retrieve (vector store) and _kg_retrieve (Neo4j graph) in parallel (line 617).
   - Result Integration – Retrieved texts append to self.all_retrieved_info (line 730).
   - Reasoning Update – The engine records steps via add_reasoning_step and determines if follow-up queries are needed (lines 556-567).
3. Answer Synthesis – _generate_final_answer (line 940) produces the comprehensive response after sufficient information collection.
4. Caching – The CacheManager checks for existing results using keys like deep:{query} (line 1026) to avoid redundant research cycles.
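
The loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the repository's code: `research_loop`, `toy_generate`, and `toy_search` are hypothetical stand-ins for `DeepResearchTool.thinking()`, `QueryGenerator`, and `DualPathSearcher.search()`, and the stopping rule (stop when no new sub-queries appear) is assumed for the example.

```python
from typing import Callable, List


def research_loop(
    query: str,
    generate_queries: Callable[[str, List[str]], List[str]],
    search: Callable[[str], List[str]],
    max_iterations: int = 3,
) -> dict:
    """Hypothetical think-search-reason loop; all names are illustrative."""
    executed_queries: List[str] = []    # sub-queries already run
    all_retrieved_info: List[str] = []  # evidence gathered so far
    reasoning_steps: List[str] = []     # human-readable trace

    iteration = 0
    while iteration < max_iterations:
        # Think: ask for sub-queries given the evidence collected so far.
        candidates = generate_queries(query, all_retrieved_info)
        new_queries = [q for q in candidates if q not in executed_queries]
        if not new_queries:
            reasoning_steps.append(f"iteration {iteration}: no new queries, stopping")
            break

        # Search: run each new sub-query and integrate the results.
        for sub_query in new_queries:
            executed_queries.append(sub_query)
            results = search(sub_query)
            all_retrieved_info.extend(results)
            reasoning_steps.append(
                f"iteration {iteration}: '{sub_query}' returned {len(results)} result(s)"
            )
        iteration += 1

    return {
        "executed_queries": executed_queries,
        "retrieved": all_retrieved_info,
        "steps": reasoning_steps,
    }


# Toy stand-ins so the sketch runs without an LLM, vector store, or Neo4j.
def toy_generate(query: str, evidence: List[str]) -> List[str]:
    return [f"{query} (background)"] if not evidence else [f"{query} (details)"]


def toy_search(sub_query: str) -> List[str]:
    return [f"evidence for {sub_query}"]


result = research_loop("graph indexes", toy_generate, toy_search, max_iterations=3)
print(result["executed_queries"])
```

In the sketch, the list of executed queries plays the role the article assigns to the ThinkingEngine: it lets the loop recognize a repeated sub-query and stop before reaching max_iterations.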

Deep vs. Deeper Mode

The repository provides two operational tiers:

Standard Mode (DeepResearchTool) provides the base multi-step workflow with think-search-reason loops and dual-path retrieval.

Enhanced Mode (DeeperResearchTool) wraps the base tool and adds:

- Exploration Tool – get_exploration_tool() for community-aware knowledge graph expansion (line 54).
- Reasoning Analysis Tool – get_reasoning_analysis_tool() for advanced logic validation (line 55).
- Streaming Interface – get_stream_tool() enabling real-time token streaming (line 56).

Toggle between modes at runtime using agent.is_deeper_tool(True/False) (line 580).
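
The wrap-and-toggle relationship between the two tiers can be illustrated with a small sketch. The classes `BaseTool`, `EnhancedTool`, and `Agent` below are hypothetical and only mirror the shape described here (an enhanced tool that wraps a base tool, and an agent method that selects between them); the real classes do considerably more.

```python
class BaseTool:
    """Stand-in for the standard tier (hypothetical, not DeepResearchTool itself)."""

    def search(self, query: str) -> str:
        return f"base answer for: {query}"


class EnhancedTool:
    """Stand-in for the enhanced tier: wraps a base tool and adds an extra step."""

    def __init__(self, base: BaseTool) -> None:
        self.base = base

    def search(self, query: str) -> str:
        # Delegate to the wrapped tool, then layer a toy extra step on top.
        return self.base.search(query) + " [+ exploration]"


class Agent:
    """Toy agent that swaps its active tool at runtime."""

    def __init__(self, use_deeper_tool: bool = True) -> None:
        self._base = BaseTool()
        self._enhanced = EnhancedTool(self._base)
        self.is_deeper_tool(use_deeper_tool)

    def is_deeper_tool(self, enabled: bool) -> None:
        # Select which tool subsequent ask() calls will use.
        self.research_tool = self._enhanced if enabled else self._base

    def ask(self, query: str) -> str:
        return self.research_tool.search(query)


agent = Agent(use_deeper_tool=True)
deeper_answer = agent.ask("What is RAG?")
agent.is_deeper_tool(False)
standard_answer = agent.ask("What is RAG?")
print(deeper_answer)
print(standard_answer)
```

Because the enhanced tool delegates to the base tool rather than duplicating it, switching tiers changes only which object handles the call.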

Usage Examples

Basic Synchronous Query

from graphrag_agent.agents.deep_research_agent import DeepResearchAgent

# Initialize with enhanced tooling
agent = DeepResearchAgent(use_deeper_tool=True)

# Execute deep research loop
answer = agent.ask("What are the latest research trends in graph neural networks?")
print(answer)

Inspecting Reasoning Steps

answer, thinking = agent.ask_with_thinking(
    "Explain how Neo4j indexes improve query performance.",
    community_aware=False
)

print("Answer:", answer)
print("\n--- Reasoning Trace ---")
print(thinking)

Streaming Real-time Results

import asyncio
from graphrag_agent.agents.deep_research_agent import DeepResearchAgent

async def stream_research():
    agent = DeepResearchAgent(use_deeper_tool=True)
    async for chunk in agent.ask_stream("Summarize the impact of LLMs on software engineering."):
        print(chunk, end='', flush=True)

asyncio.run(stream_research())

Runtime Mode Switching


# Reuses the agent created in the basic example above.
# Switch to standard mode
agent.is_deeper_tool(False)
answer = agent.ask("What is reinforcement learning?")
print(answer)

Summary

- The deep research search mode implements an iterative think-search-reason loop that breaks complex queries into manageable sub-queries.
- Dual-path retrieval queries the vector knowledge base and the Neo4j graph database in parallel to gather evidence from both.
- The ThinkingEngine, implemented in graphrag_agent/search/tool/reasoning/thinking.py, maintains conversational state across iterations by storing executed queries and reasoning steps.
- DeeperResearchTool extends the base functionality with community-aware graph exploration, reasoning analysis, and real-time streaming capabilities.
- The system caches results under keys like deep:{query}, so a repeated query skips the research loop and returns faster.

Frequently Asked Questions

What is the difference between deep research and deeper research modes?

Deep research mode uses the standard DeepResearchTool which provides the core iterative workflow of thinking, searching, and reasoning. Deeper research mode activates DeeperResearchTool, which wraps the base tool and adds community-aware knowledge graph exploration, advanced reasoning analysis, and real-time streaming interfaces. You can toggle between these modes at runtime using agent.is_deeper_tool().

How does the dual-path retrieval system work?

The DualPathSearcher class executes parallel retrieval operations through _kb_retrieve (vector store search) and _kg_retrieve (Neo4j graph traversal). When the tool generates a sub-query, both paths execute simultaneously, and the results are integrated into self.all_retrieved_info. This ensures the reasoning engine has access to both unstructured text corpora and structured relationship data from the knowledge graph.
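
As a rough illustration of dual-path retrieval, the sketch below runs two retrievers concurrently and merges their results. It assumes a thread pool as the concurrency mechanism, which the article does not specify, and `kb_retrieve`, `kg_retrieve`, and `dual_path_search` are hypothetical toy functions rather than the repository's `_kb_retrieve` and `_kg_retrieve`.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def kb_retrieve(query: str) -> List[str]:
    """Toy vector-store lookup (a real one would embed the query and rank chunks)."""
    return [f"[kb] passage about {query}"]


def kg_retrieve(query: str) -> List[str]:
    """Toy graph lookup (a real one would run a Cypher query against Neo4j)."""
    return [f"[kg] relationship involving {query}"]


def dual_path_search(query: str) -> List[str]:
    """Run both retrievers concurrently and merge results, dropping duplicates."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        kb_future = pool.submit(kb_retrieve, query)
        kg_future = pool.submit(kg_retrieve, query)
        combined = kb_future.result() + kg_future.result()

    # Preserve order while removing exact duplicates.
    seen = set()
    merged = []
    for item in combined:
        if item not in seen:
            seen.add(item)
            merged.append(item)
    return merged


all_retrieved_info: List[str] = []
all_retrieved_info.extend(dual_path_search("Neo4j indexes"))
print(all_retrieved_info)
```

Threads suit this kind of work when both paths are I/O-bound (network calls to a vector store and a database), since each call spends most of its time waiting.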

Can I stream the research progress in real-time?

Yes, when using DeeperResearchTool, you can access real-time streaming through the ask_stream method. This method returns an async generator that yields partial results as the LLM generates reasoning steps and searches the knowledge bases. The streaming interface is implemented in graphrag_agent/search/tool/deeper_research_tool.py and accessed via self.research_tool.get_stream_tool().
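
For readers unfamiliar with async generators, the following sketch shows the producer side of such an interface in isolation. It is an assumption-laden toy: `stream_answer` and `collect` are hypothetical names, and the text is split into fixed-size chunks instead of being produced by an LLM.

```python
import asyncio
from typing import AsyncIterator, List


async def stream_answer(text: str, chunk_size: int = 8) -> AsyncIterator[str]:
    """Yield the text piece by piece, as a streaming tool might yield tokens."""
    for start in range(0, len(text), chunk_size):
        await asyncio.sleep(0)  # hand control back to the event loop between chunks
        yield text[start:start + chunk_size]


async def collect(text: str) -> List[str]:
    # A consumer iterates with `async for`, exactly as with agent.ask_stream(...).
    return [chunk async for chunk in stream_answer(text)]


chunks = asyncio.run(collect("Streaming partial results."))
print(chunks)
```

The consumer sees each chunk as soon as it is yielded, which is what allows partial output to be printed before the full answer exists.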

How does the system avoid redundant research cycles?

The deep research tool caches results through the CacheManager. Before executing research, the tool checks for existing results using cache keys formatted as deep:{query} or deeper:{query}. If a cached result exists, the system returns it immediately instead of re-running the iterative think-search-reason loop, which shortens response times for repeated queries.
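
The check-then-compute pattern behind those keys can be sketched as follows. `SimpleCache` and `cached_research` are hypothetical; the sketch assumes an in-memory dictionary and exact-match keys, and says nothing about how the repository's CacheManager stores or expires entries.

```python
from typing import Callable, Dict, List, Optional


class SimpleCache:
    """In-memory stand-in for a cache manager (not the repository's CacheManager)."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._store.get(key)

    def set(self, key: str, value: str) -> None:
        self._store[key] = value


def cached_research(
    query: str,
    run: Callable[[str], str],
    cache: SimpleCache,
    deeper: bool = False,
) -> str:
    """Return a cached answer when present; otherwise run research and store it."""
    key = f"{'deeper' if deeper else 'deep'}:{query}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    answer = run(query)
    cache.set(key, answer)
    return answer


calls: List[str] = []  # records how often the expensive path actually runs


def expensive_run(query: str) -> str:
    calls.append(query)
    return f"answer to {query}"


cache = SimpleCache()
first = cached_research("What is RAG?", expensive_run, cache)
second = cached_research("What is RAG?", expensive_run, cache)             # cache hit
third = cached_research("What is RAG?", expensive_run, cache, deeper=True)  # separate key
print(len(calls))
```

Prefixing the key with the mode keeps standard and enhanced results apart, so switching modes never returns an answer produced by the other tier.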
