# What Is the Role of Neo4j in the GraphRAG Agent Project?

> Discover how Neo4j powers the GraphRAG Agent project by acting as its graph database and vector store for knowledge graph construction, hybrid search, and more.

- Repository: [GLK/graph-rag-agent](https://github.com/1517005260/graph-rag-agent)
- Tags: how-to-guide
- Published: 2026-02-22

---

**Neo4j serves as the centralized graph database and vector store for the GraphRAG Agent, providing a singleton driver that powers knowledge graph construction, hybrid semantic search, community detection, and evaluation metrics across the entire system.**

The GraphRAG Agent (1517005260/graph-rag-agent) relies on Neo4j as its primary data backbone, storing structured entities, relationships, and vector embeddings in a single unified graph. By abstracting connection management behind a singleton pattern, the project ensures consistent, high-performance access to both raw Cypher queries and LangChain-compatible graph operations.

## Centralized Connection Management via Singleton Pattern

At the core of the integration lies the `DBConnectionManager` class defined in [`graphrag_agent/config/neo4jdb.py`](https://github.com/1517005260/graph-rag-agent/blob/main/graphrag_agent/config/neo4jdb.py). This singleton manages both a native Neo4j driver (`self.driver`) for raw Cypher execution and a LangChain `Neo4jGraph` instance (`self.graph`) for high-level LLM integrations.

Every component, from graph writers to FastAPI routers, accesses Neo4j through the `get_db_manager()` factory function. Because all callers share one manager, they also share one driver, its connection pool, and a single configuration for all database operations.

```python
from graphrag_agent.config.neo4jdb import get_db_manager

# Obtain the manager (singleton)
db_manager = get_db_manager()

# Native Neo4j driver (for raw Cypher)
driver = db_manager.get_driver()

# LangChain-compatible Neo4jGraph (for vector / LLM integration)
graph = db_manager.get_graph()
```

*Source:* [`graphrag_agent/config/neo4jdb.py`](https://github.com/1517005260/graph-rag-agent/blob/main/graphrag_agent/config/neo4jdb.py) implements `get_driver()` and `get_graph()` methods【/graphrag_agent/config/neo4jdb.py#L51-L59】.
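
To make the pattern concrete, here is a minimal, standalone sketch of a lazily created, lock-protected singleton manager. It is not the project's `DBConnectionManager`: the names `ConnectionManager`, `get_manager`, and `fake_driver_factory` are hypothetical, and the fake factory stands in for a real driver constructor so the example runs without a database.

```python
import threading


class ConnectionManager:
    """Holds one shared driver-like object (hypothetical stand-in)."""

    def __init__(self, driver_factory):
        self._driver = driver_factory()

    def get_driver(self):
        return self._driver


_manager = None
_manager_lock = threading.Lock()
factory_calls = 0


def fake_driver_factory():
    # Stand-in for a real driver constructor; counts how often it runs.
    global factory_calls
    factory_calls += 1
    return object()


def get_manager():
    """Return the process-wide manager, creating it on first use."""
    global _manager
    if _manager is None:
        with _manager_lock:
            # Re-check inside the lock so concurrent first calls build one instance.
            if _manager is None:
                _manager = ConnectionManager(fake_driver_factory)
    return _manager


first = get_manager()
second = get_manager()
print(first is second)  # True
print(factory_calls)    # 1
```

The second check inside the lock is what keeps two threads that race on the first call from each constructing a driver.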

## Knowledge Graph Construction and Persistence

Neo4j acts as the persistent store for extracted entities and relationships. The `GraphWriter` class in [`graphrag_agent/graph/extraction/graph_writer.py`](https://github.com/1517005260/graph-rag-agent/blob/main/graphrag_agent/graph/extraction/graph_writer.py) uses the singleton driver to execute Cypher `CREATE` statements, turning parsed documents into a traversable knowledge graph.

```python
from graphrag_agent.config.neo4jdb import get_db_manager
from graphrag_agent.graph.extraction.graph_writer import GraphWriter

# Attach the shared driver to the writer
db = get_db_manager()
writer = GraphWriter()
writer.set_connection(db.get_driver())

# Illustrative input: a list of dicts with keys `id`, `name`, `type`
entities = [
    {"id": "e1", "name": "Apple", "type": "Organization"},
    {"id": "e2", "name": "Tim Cook", "type": "Person"},
]
writer.write_entities(entities)
```

*Source:* `GraphWriter` uses the driver to execute CREATE statements【/graphrag_agent/graph/extraction/graph_writer.py#L31-L78】.
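
For large extractions, sending one statement per entity is slow. A common alternative is to group rows and send each group as a single parameterized `UNWIND` statement. The sketch below only builds the `(query, parameters)` pairs and does not touch a database. It is an assumption-laden illustration rather than the `GraphWriter` implementation: the function name `batched_entity_writes`, the `Entity` label, and the batch size are hypothetical, and it uses `MERGE` instead of the `CREATE` described above so that re-running an import does not duplicate nodes.

```python
from typing import Dict, Iterator, List, Tuple

# One statement writes a whole batch: UNWIND expands $rows into one row per entity.
ENTITY_QUERY = """
UNWIND $rows AS row
MERGE (e:Entity {id: row.id})
SET e.name = row.name, e.type = row.type
"""


def batched_entity_writes(
    entities: List[Dict[str, str]], batch_size: int = 500
) -> Iterator[Tuple[str, Dict[str, list]]]:
    """Yield (query, parameters) pairs, one per batch of entities."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(entities), batch_size):
        yield ENTITY_QUERY, {"rows": entities[start:start + batch_size]}


entities = [
    {"id": "e1", "name": "Apple", "type": "Organization"},
    {"id": "e2", "name": "Tim Cook", "type": "Person"},
    {"id": "e3", "name": "Cupertino", "type": "Location"},
]

batches = list(batched_entity_writes(entities, batch_size=2))
for query, params in batches:
    # With a real driver this would be: driver.execute_query(query, params)
    print(len(params["rows"]))
```

Passing the rows as a parameter, rather than formatting them into the query string, keeps the statement text constant so the database can reuse its query plan.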

## Hybrid Vector and Graph Retrieval

Neo4j enables **hybrid search** that combines semantic vector similarity with graph traversal. The system stores embeddings as node properties and leverages `Neo4jVector.from_existing_index` (LangChain) alongside custom Cypher queries. The `HybridTool` class in [`graphrag_agent/search/tool/hybrid_tool.py`](https://github.com/1517005260/graph-rag-agent/blob/main/graphrag_agent/search/tool/hybrid_tool.py) demonstrates this by querying vector indexes and then traversing relationships to find connected entities.

```python
from graphrag_agent.config.neo4jdb import get_db_manager

db = get_db_manager()
driver = db.get_driver()

# Vector search first, then traverse from the matched nodes to their entities
query = """
CALL db.index.vector.queryNodes('embedding-index', $k, $vector) YIELD node AS n, score
MATCH (n)-[:MENTIONS]->(e:Entity)
RETURN e.id AS entity_id, score
ORDER BY score DESC LIMIT $k
"""

# `query_embedding` is assumed to be a list of floats produced by the same
# embedding model that was used to build the index
params = {"k": 5, "vector": query_embedding}

# execute_query returns an EagerResult; the rows are in its `records` attribute
result = driver.execute_query(query, params)

for record in result.records:
    print(record["entity_id"], record["score"])
```

*Source:* Hybrid search tool uses the driver in `HybridTool`【/graphrag_agent/search/tool/hybrid_tool.py#L5-L12】.
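
The two-stage logic (rank by embedding similarity, then follow relationships from the top hits) can be illustrated without a database. The following sketch runs the same idea over small in-memory dictionaries. Everything in it is hypothetical toy data and naming (`hybrid_search`, `chunk_vectors`, `mentions`); in the real system Neo4j performs both stages inside a single Cypher query.

```python
import math
from typing import Dict, List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(
    query_vec: List[float],
    chunk_vectors: Dict[str, List[float]],
    mentions: Dict[str, List[str]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    # Stage 1: vector search, keeping the k most similar chunks.
    ranked = sorted(
        ((cosine(query_vec, vec), chunk_id) for chunk_id, vec in chunk_vectors.items()),
        reverse=True,
    )[:k]
    # Stage 2: graph expansion, following MENTIONS-style edges to entities
    # and keeping the best chunk score seen for each entity.
    best: Dict[str, float] = {}
    for score, chunk_id in ranked:
        for entity in mentions.get(chunk_id, []):
            best[entity] = max(best.get(entity, score), score)
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


chunk_vectors = {"c1": [1.0, 0.0], "c2": [0.8, 0.6], "c3": [0.0, 1.0]}
mentions = {"c1": ["Apple", "Tim Cook"], "c2": ["Apple"], "c3": ["Banana"]}

results = hybrid_search([1.0, 0.0], chunk_vectors, mentions, k=2)
for entity, score in results:
    print(entity, round(score, 3))
```

Here the chunk `c3` falls outside the top two by similarity, so the entity it mentions never enters the result even though it exists in the graph.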

## Community Detection and Graph Analytics

The project uses Neo4j's Graph Data Science (GDS) library for community detection and summarization. Utilities such as `create_projection` and `persist_summary` issue Cypher statements via the same singleton driver to project subgraphs, run clustering algorithms, and store community summaries back into the database.

*Source:* Community pipeline description【/graphrag_agent/community/readme.md#L24-L58】.
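
As a rough illustration of the project, cluster, and clean-up sequence, the sketch below assembles three GDS procedure calls as `(query, parameters)` pairs. It assumes GDS 2.x is installed and picks Louvain as the algorithm. The function name `community_detection_steps`, the graph name, the `Entity` label, and the `communityId` property are all illustrative assumptions, and the project's own `create_projection` helper may build its projection differently.

```python
from typing import Dict, List, Tuple


def community_detection_steps(
    graph_name: str = "entity_graph",
    node_label: str = "Entity",
    write_property: str = "communityId",
) -> List[Tuple[str, Dict[str, str]]]:
    """Return the Cypher statements for one community detection run, in order."""
    # 1. Project the nodes with the given label and all relationship types
    #    into an in-memory graph.
    project = "CALL gds.graph.project($graph_name, $node_label, '*')"
    # 2. Run Louvain and write each node's community id back as a property.
    detect = (
        "CALL gds.louvain.write($graph_name, {writeProperty: $write_property}) "
        "YIELD communityCount, modularity"
    )
    # 3. Drop the projection to free memory.
    drop = "CALL gds.graph.drop($graph_name)"
    return [
        (project, {"graph_name": graph_name, "node_label": node_label}),
        (detect, {"graph_name": graph_name, "write_property": write_property}),
        (drop, {"graph_name": graph_name}),
    ]


steps = community_detection_steps()
for query, params in steps:
    # With a real driver this would be: driver.execute_query(query, params)
    print(query)
```

Once the community ids are written, a summarization step can group nodes by that property and store one summary per community.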

## Backend API and Evaluation Integration

FastAPI endpoints in [`server/routers/source.py`](https://github.com/1517005260/graph-rag-agent/blob/main/server/routers/source.py) import `get_db_manager()` to serve knowledge retrieval and visualization requests directly from Neo4j. Similarly, evaluation classes like `GraphMetrics` and `RetrievalMetrics` in [`graphrag_agent/evaluation/metrics/graph_metrics.py`](https://github.com/1517005260/graph-rag-agent/blob/main/graphrag_agent/evaluation/metrics/graph_metrics.py) fetch ground-truth data via `self.neo4j_client.execute_query()` to compute retrieval and graph quality scores.

```python
from graphrag_agent.config.neo4jdb import get_db_manager
from graphrag_agent.evaluation.metrics.graph_metrics import GraphMetrics

# Constructor arguments and the method name below follow this guide's description;
# check graph_metrics.py for the exact signatures
neo4j_client = get_db_manager().get_driver()
metrics = GraphMetrics(config={"neo4j_client": neo4j_client})

# Example: evaluate community cohesion for a given query
score = metrics.community_cohesion(query="What is the relation between Apple and Tim Cook?")
print("Community cohesion:", score)
```

*Source:* Metric class executes queries via `self.neo4j_client.execute_query`【/graphrag_agent/evaluation/metrics/graph_metrics.py#L13-L66】.
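
To show what a retrieval quality score looks like once ground-truth IDs have been fetched from Neo4j, here is a small standalone sketch of set-based precision, recall, and F1. It is a generic illustration, not the scoring logic of `GraphMetrics` or `RetrievalMetrics`; the function name `retrieval_scores` and the entity IDs are hypothetical.

```python
from typing import Dict, Iterable


def retrieval_scores(retrieved: Iterable[str], relevant: Iterable[str]) -> Dict[str, float]:
    """Compare retrieved entity IDs with ground-truth IDs."""
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = len(retrieved_set & relevant_set)
    # Guard against empty inputs to avoid division by zero.
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical IDs: what the agent retrieved vs. what the graph says is relevant.
scores = retrieval_scores(
    retrieved=["e1", "e2", "e3", "e4"],
    relevant=["e1", "e2", "e5"],
)
print(scores["precision"])  # 0.5
```

Two of the four retrieved IDs are relevant, which gives a precision of 0.5, and two of the three relevant IDs were found, which gives a recall of two thirds.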

## Key Implementation Files

Understanding the following files is essential for working with the Neo4j integration:

- [`graphrag_agent/config/neo4jdb.py`](https://github.com/1517005260/graph-rag-agent/blob/main/graphrag_agent/config/neo4jdb.py) – Singleton connection manager providing both native driver and LangChain graph instances.
- [`graphrag_agent/graph/extraction/graph_writer.py`](https://github.com/1517005260/graph-rag-agent/blob/main/graphrag_agent/graph/extraction/graph_writer.py) – Handles persistence of extracted entities and relationships.
- [`graphrag_agent/search/tool/hybrid_tool.py`](https://github.com/1517005260/graph-rag-agent/blob/main/graphrag_agent/search/tool/hybrid_tool.py) – Implements vector-based retrieval using the Neo4j driver.
- [`graphrag_agent/search/readme.md`](https://github.com/1517005260/graph-rag-agent/blob/main/graphrag_agent/search/readme.md) – Documents the hybrid search architecture.
- [`server/routers/source.py`](https://github.com/1517005260/graph-rag-agent/blob/main/server/routers/source.py) – FastAPI endpoints that query Neo4j for frontend consumption.
- [`graphrag_agent/evaluation/metrics/graph_metrics.py`](https://github.com/1517005260/graph-rag-agent/blob/main/graphrag_agent/evaluation/metrics/graph_metrics.py) – Evaluation metrics that rely on Neo4j for ground-truth verification.
- [`graphrag_agent/community/readme.md`](https://github.com/1517005260/graph-rag-agent/blob/main/graphrag_agent/community/readme.md) – Describes the GDS-based community detection pipeline.
- [`server/utils/neo4j_batch.py`](https://github.com/1517005260/graph-rag-agent/blob/main/server/utils/neo4j_batch.py) – Batch processing utilities for large result sets.

## Summary

- **Neo4j is the single source of truth** for all structured, relational, and vector-based data in the GraphRAG Agent.
- **Singleton driver pattern** in [`graphrag_agent/config/neo4jdb.py`](https://github.com/1517005260/graph-rag-agent/blob/main/graphrag_agent/config/neo4jdb.py) ensures consistent, pooled connections across all components.
- **Hybrid retrieval** combines vector similarity search with graph traversals using Cypher queries.
- **Graph construction** persists LLM-extracted entities and relationships via the `GraphWriter` class.
- **Evaluation and analytics** leverage Neo4j for ground-truth retrieval and GDS community detection.

## Frequently Asked Questions

### How does the GraphRAG Agent manage Neo4j connections across components?

The project implements a singleton `DBConnectionManager` in [`graphrag_agent/config/neo4jdb.py`](https://github.com/1517005260/graph-rag-agent/blob/main/graphrag_agent/config/neo4jdb.py) that exposes `get_db_manager()` to provide a single, reusable driver instance. Graph writers, search tools, and API endpoints therefore share one driver and its connection pool instead of each opening their own connections.

### What type of search does Neo4j enable in this architecture?

Neo4j powers **hybrid search** that merges semantic vector similarity (via `db.index.vector.queryNodes`) with graph traversal (via `MATCH` clauses). This allows the agent to retrieve relevant nodes by embedding similarity and then explore their relationships to gather contextual evidence.

### How is Neo4j used during the graph construction phase?

During construction, the `GraphWriter` class streams extracted entities and relationships into Neo4j using Cypher `CREATE` statements. The writer obtains the driver from the singleton connection manager and batches write operations to populate the knowledge graph from raw documents.

### Can Neo4j handle both structured and vector data in this system?

Yes. Neo4j stores structured graph data (entities and relations) alongside vector embeddings as node properties. This dual capability enables the GraphRAG Agent to perform both symbolic reasoning over relationships and semantic similarity matching within the same database instance.