How to Deploy a Knowledge Graph Constructed by HugeGraph AI: Complete Deployment Guide
You can deploy a HugeGraph AI knowledge graph in three ways: with Docker-Compose for immediate production use, from source for development workflows, or through the Scheduler API for CI/CD pipelines. All three run the same three-stage pipeline of document chunking, graph extraction, and vertex vectorization.
Deploying a knowledge graph constructed by HugeGraph AI transforms unstructured documents into a queryable property graph enhanced with vector search capabilities. The apache/incubator-hugegraph-ai repository provides modular deployment options that support both interactive prototyping through a Gradio interface and headless automation via Python APIs.
Understanding the Knowledge Graph Construction Pipeline
Before deploying, understand the three-stage pipeline implemented in the HugeGraph AI source code:
- Stage 1: Text → Vector → Chunk Index: Raw documents are split into chunks and embedded using read_documents in hugegraph_llm/utils/vector_index_utils.py, storing vectors in Milvus or Qdrant for semantic search.
- Stage 2: Text → Graph Extraction: An LLM extracts vertices and edges according to a user-defined schema via extract_graph in hugegraph_llm/utils/graph_index_utils.py, constructing a property graph structure.
- Stage 3: Graph → Vertex-Vector Index: Every graph vertex is embedded and indexed via update_vid_embedding in hugegraph_llm/utils/graph_index_utils.py to enable graph-enhanced RAG retrieval.
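To make Stage 1 more concrete, here is a minimal sketch of fixed-size chunking with overlap. It is an illustration only: chunk_text and its size and overlap parameters are hypothetical names, not the splitter that hugegraph_llm uses, which may well split on sentences or tokens instead.

```python
def chunk_text(text: str, size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into windows of at most `size` characters.

    Consecutive windows share `overlap` characters so that a sentence cut
    at a boundary still appears whole in at least one chunk.
    NOTE: hypothetical helper for illustration, not part of hugegraph_llm.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")

    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        # Stop once the current window already reaches the end of the text.
        if start + size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", size=4, overlap=1):
        print(piece)
```

Each chunk produced this way would then be embedded and written to the chunk index; the overlap trades a little extra storage for better recall at chunk boundaries.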
Deployment Methods
Docker-Compose Deployment (Recommended)
The fastest way to deploy a complete stack uses the Docker Compose configuration defined in docker/docker-compose-network.yml.
- Clone the repository and configure environment variables:
git clone https://github.com/apache/incubator-hugegraph-ai.git
cd incubator-hugegraph-ai
cp docker/env.template docker/.env
# Edit docker/.env to set PROJECT_PATH to your local checkout path
- Start the services:
cd docker
docker compose -f docker-compose-network.yml up -d
This command launches two containers:
- hugegraph-server (image hugegraph/hugegraph), exposing port 8080 for the REST API and Hubble UI
- hugegraph-llm (built from docker/Dockerfile.llm), exposing port 8001 for the Gradio-based KG construction interface
- Verify the deployment:
docker compose -f docker-compose-network.yml ps
- Access the services:
  - HugeGraph Server: http://localhost:8080
  - RAG/KG Service: http://localhost:8001
In the Gradio UI, click Import into Vector, Extract Graph Data, Load into GraphDB, and Update Vid Embedding, in that order, to run the three-stage pipeline and persist the extracted graph.
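If you want a scripted check in addition to docker compose ps, a small TCP probe can confirm that both published ports accept connections. The sketch below uses only the Python standard library; the function names port_open and check_services are hypothetical, and the ports are the ones stated above. A successful connection only shows that something is listening, not that the service behind it is healthy.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


def check_services(services: dict[str, int], host: str = "localhost") -> dict[str, bool]:
    """Map each service name to whether its port is reachable on `host`."""
    return {name: port_open(host, port) for name, port in services.items()}


if __name__ == "__main__":
    # Ports as published by the compose file described in this guide.
    status = check_services({"hugegraph-server": 8080, "hugegraph-llm": 8001})
    for name, ok in status.items():
        print(f"{name}: {'reachable' if ok else 'NOT reachable'}")
```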
Build and Run from Source
For development or customization requiring code modifications:
- Start the graph server:
docker run -itd --name=server -p 8080:8080 hugegraph/hugegraph
- Install Python dependencies using uv:
git clone https://github.com/apache/incubator-hugegraph-ai.git
cd incubator-hugegraph-ai
curl -LsSf https://astral.sh/uv/install.sh | sh
uv sync --extra llm
The uv sync --extra llm command installs LLM-related packages including litellm, gradio, and vector database clients. For Milvus or Qdrant support, run uv sync --extra vectordb.
- Launch the demo application:
source .venv/bin/activate
python -m hugegraph_llm.demo.rag_demo.app
The Gradio interface starts on 127.0.0.1:8001, allowing you to construct the knowledge graph through the web UI.
Programmatic Deployment for CI/CD
Automate knowledge graph construction without manual UI interaction using the SchedulerSingleton API from hugegraph_llm/flows/scheduler.py and flow definitions from hugegraph_llm/flows/__init__.py:
from hugegraph_llm.flows.scheduler import SchedulerSingleton
from hugegraph_llm.flows import FlowName
scheduler = SchedulerSingleton.get_instance()
# Stage 1: Index document chunks
scheduler.schedule_flow(FlowName.IMPORT_CHUNKS, docs=["document.txt"])
# Stage 2: Extract graph structure
schema = {
    "vertex_labels": [{"name": "Person", "properties": ["name", "birth"]}],
    "edge_labels": [{"name": "knows", "source_label": "Person", "target_label": "Person"}],
}
scheduler.schedule_flow(
    FlowName.GRAPH_EXTRACT,
    schema=schema,
    texts=["...raw text content..."],
    example_prompt="Extract entities and relationships.",
    graph_type="property_graph",
)
# Stage 3: Persist to HugeGraph and index vertices
scheduler.schedule_flow(FlowName.IMPORT_GRAPH_DATA, data="auto", schema=schema)
scheduler.schedule_flow(FlowName.UPDATE_VID_EMBEDDINGS)
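Because an unattended pipeline has no UI to surface a malformed schema, it can help to sanity-check the schema dict before spending LLM calls on extraction. The following is a sketch under assumptions: validate_schema is a hypothetical helper, and it checks only the dict shape used in the example above (vertex_labels, edge_labels, source_label, target_label), not the full schema rules that HugeGraph enforces.

```python
def validate_schema(schema: dict) -> list[str]:
    """Return a list of problems found in a schema dict; an empty list means none were found.

    NOTE: hypothetical helper. It only checks that every edge label points
    at vertex labels that are actually defined in the same schema.
    """
    problems = []
    vertex_names = {v["name"] for v in schema.get("vertex_labels", []) if "name" in v}
    if not vertex_names:
        problems.append("no named vertex_labels defined")

    for edge in schema.get("edge_labels", []):
        edge_name = edge.get("name", "<unnamed>")
        for key in ("source_label", "target_label"):
            label = edge.get(key)
            if label not in vertex_names:
                problems.append(
                    f"edge '{edge_name}': {key} '{label}' is not a defined vertex label"
                )
    return problems


if __name__ == "__main__":
    schema = {
        "vertex_labels": [{"name": "Person", "properties": ["name", "birth"]}],
        "edge_labels": [{"name": "knows", "source_label": "Person", "target_label": "Person"}],
    }
    issues = validate_schema(schema)
    print("schema OK" if not issues else "\n".join(issues))
```

Running a check like this as the first step of a CI job lets the build fail in seconds on a typo in a label name instead of after the extraction stage.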
Step-by-Step Deployment Examples
Complete Docker-Compose Workflow
After executing docker compose -f docker-compose-network.yml up -d:
- Navigate to http://localhost:8001 in your browser.
- Paste source documents into the text area and click Import into Vector (executes the import_chunks flow via vector_index_utils.py).
- Define a JSON schema specifying vertex and edge labels, then click Extract Graph Data (triggers the graph_extract flow via extract_graph).
- Click Load into GraphDB to persist the property graph to the HugeGraph server at localhost:8080.
- Click Update Vid Embedding to populate the graph-vid vector index using update_vid_embedding in graph_index_utils.py.
Python CI/CD Deployment Script
For automated deployments in production pipelines:
# deploy_kg.py
from hugegraph_llm.flows.scheduler import SchedulerSingleton
from hugegraph_llm.flows import FlowName
scheduler = SchedulerSingleton.get_instance()
# Load source documents
with open("data/corpus.txt", "r", encoding="utf-8") as f:
    raw_text = f.read()
# Define the schema once so the extract and import stages share it
schema = {
    "vertex_labels": [{"name": "Article", "properties": ["title"]}],
    "edge_labels": [{"name": "cites", "source_label": "Article", "target_label": "Article"}],
}
# Execute complete pipeline
scheduler.schedule_flow(FlowName.IMPORT_CHUNKS, raw_text)
scheduler.schedule_flow(
    FlowName.GRAPH_EXTRACT,
    schema=schema,
    texts=[raw_text],
)
scheduler.schedule_flow(FlowName.IMPORT_GRAPH_DATA, data="auto", schema=schema)
scheduler.schedule_flow(FlowName.UPDATE_VID_EMBEDDINGS)
print("✅ Knowledge graph deployed and ready for queries")
Execute the script after activating the virtual environment:
source .venv/bin/activate
python deploy_kg.py
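In a CI/CD job, the script's exit code is what marks the build as passed or failed, so it is worth stopping at the first stage that fails rather than continuing with a half-built graph. Below is a minimal sketch of a fail-fast runner. The name run_stages is hypothetical, and it assumes that a failing stage raises an exception; whether schedule_flow raises or returns an error value is not confirmed here, so check its return value as well if needed.

```python
import sys
from typing import Callable


def run_stages(stages: list[tuple[str, Callable[[], object]]]) -> int:
    """Run (name, callable) stages in order.

    Returns 0 if every stage completes, or 1 as soon as one raises,
    without running the remaining stages.
    NOTE: hypothetical helper, not part of hugegraph_llm.
    """
    for name, stage in stages:
        print(f"[kg-deploy] starting: {name}")
        try:
            stage()
        except Exception as exc:  # report the failure and stop the pipeline
            print(f"[kg-deploy] FAILED: {name}: {exc}", file=sys.stderr)
            return 1
        print(f"[kg-deploy] finished: {name}")
    return 0


if __name__ == "__main__":
    # Placeholder stages; in deploy_kg.py each lambda would wrap one
    # scheduler.schedule_flow(...) call.
    demo = [
        ("import_chunks", lambda: None),
        ("graph_extract", lambda: None),
    ]
    sys.exit(run_stages(demo))
```

Ending the script with sys.exit(run_stages(...)) lets the CI system see a non-zero status when any stage fails.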
Summary
- Three-stage architecture: Document chunking via vector_index_utils.py, graph extraction via graph_index_utils.py, and vertex vectorization via update_vid_embedding.
- Docker-Compose deployment: Use docker/docker-compose-network.yml to launch the complete stack with HugeGraph server on port 8080 and the RAG service on port 8001.
- Source-based deployment: Install with uv sync --extra llm and run python -m hugegraph_llm.demo.rag_demo.app for development environments.
- Programmatic automation: Use SchedulerSingleton.schedule_flow() with FlowName enums to build and deploy knowledge graphs in CI/CD pipelines.
- Core utilities: The hugegraph_llm/utils/graph_index_utils.py module handles graph data import and embedding updates, while hugegraph_llm/utils/vector_index_utils.py manages document processing.
Frequently Asked Questions
What are the system requirements for deploying HugeGraph AI?
For Docker-Compose deployments, allocate at least 4GB RAM for the HugeGraph server container and 2GB for the LLM service. The server requires port 8080 for the REST API and Hubble UI, while the RAG interface uses port 8001. Production deployments with large vector indexes require additional disk space for Milvus or Qdrant data volumes configured in docker/env.template.
Can I deploy HugeGraph AI without Docker?
Yes. The Python components do not need Docker: install them from the apache/incubator-hugegraph-ai repository using uv sync --extra llm, activate the virtual environment, and launch the web interface with python -m hugegraph_llm.demo.rag_demo.app to access the Gradio UI on localhost:8001. The HugeGraph server must still be running separately, either from a native installation or, if Docker is acceptable for that one component, from the hugegraph/hugegraph image.
How do I update an existing knowledge graph with new data?
Use the Scheduler API to incrementally process additional documents. Call scheduler.schedule_flow(FlowName.IMPORT_CHUNKS, new_documents) followed by FlowName.GRAPH_EXTRACT with your existing schema definition. Then execute FlowName.IMPORT_GRAPH_DATA to merge new vertices and edges into the existing graph, and FlowName.UPDATE_VID_EMBEDDINGS to refresh the vector indexes for graph-RAG retrieval.
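To avoid re-processing documents that have not changed, an incremental job can keep a small manifest of content hashes and send only new or modified documents through the flows. This is a sketch under assumptions: the helper names and the JSON manifest file are hypothetical, and nothing here is provided by hugegraph_llm. How IMPORT_GRAPH_DATA merges vertices from a re-extracted document is also outside the scope of this sketch.

```python
import hashlib
import json
from pathlib import Path


def content_digest(text: str) -> str:
    """SHA-256 hex digest of a document's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def load_manifest(path: Path) -> dict[str, str]:
    """Load the name -> digest manifest, or an empty one if the file does not exist."""
    return json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}


def select_new(docs: dict[str, str], manifest: dict[str, str]) -> dict[str, str]:
    """Return only the documents that are new or whose content changed."""
    return {
        name: text
        for name, text in docs.items()
        if manifest.get(name) != content_digest(text)
    }


def mark_processed(docs: dict[str, str], manifest: dict[str, str], path: Path) -> None:
    """Record digests for `docs` and write the manifest back to disk."""
    manifest.update({name: content_digest(text) for name, text in docs.items()})
    path.write_text(json.dumps(manifest, indent=2, sort_keys=True), encoding="utf-8")
```

Calling mark_processed only after all flows have finished means a failed run leaves the manifest untouched, so the same documents are picked up again on the next attempt.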
Where is the knowledge graph data physically stored?
The property graph persists in the HugeGraph server (port 8080), which uses RocksDB or other configurable backends. Vector embeddings for document chunks and graph vertices are stored separately in your configured vector database (Milvus, Qdrant, or a built-in store). Chunk vectors are managed through vector_index_utils.py, and vertex vectors are indexed by the update_vid_embedding function in graph_index_utils.py.