How to Deploy a Knowledge Graph Constructed by HugeGraph AI: Complete Deployment Guide

You can deploy a HugeGraph AI knowledge graph in three ways: with Docker Compose for immediate production use, from source for development workflows, or through the Scheduler API for CI/CD pipelines. Each route runs the same three-stage pipeline: document chunking, graph extraction, and vertex vectorization.

Deploying a knowledge graph constructed by HugeGraph AI transforms unstructured documents into a queryable property graph enhanced with vector search capabilities. The apache/incubator-hugegraph-ai repository provides modular deployment options that support both interactive prototyping through a Gradio interface and headless automation via Python APIs.

Understanding the Knowledge Graph Construction Pipeline

Before deploying, understand the three-stage pipeline implemented in the HugeGraph AI source code:

  • Stage 1: Text → Vector → Chunk Index: Raw documents are split into chunks and embedded using read_documents in hugegraph_llm/utils/vector_index_utils.py, storing vectors in Milvus or Qdrant for semantic search.

  • Stage 2: Text → Graph Extraction: An LLM extracts vertices and edges according to a user-defined schema via extract_graph in hugegraph_llm/utils/graph_index_utils.py, constructing a property graph structure.

  • Stage 3: Graph → Vertex-Vector Index: Every graph vertex is embedded and indexed via update_vid_embedding in hugegraph_llm/utils/graph_index_utils.py to enable graph-enhanced RAG retrieval.
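To make Stage 1 more concrete, here is a minimal sketch of fixed-size chunking with overlap. It is an illustration only: chunk_text and its parameters are hypothetical names, and the actual splitting logic in hugegraph_llm may differ (for example, it may split on sentences or paragraphs).

```python
from typing import List


def chunk_text(text: str, max_chars: int = 200, overlap: int = 20) -> List[str]:
    """Split text into windows of at most max_chars, sharing `overlap` characters.

    Hypothetical helper for illustration; not part of hugegraph_llm.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    if not 0 <= overlap < max_chars:
        raise ValueError("overlap must satisfy 0 <= overlap < max_chars")

    chunks: List[str] = []
    step = max_chars - overlap  # how far the window advances each iteration
    for start in range(0, len(text), step):
        chunks.append(text[start:start + max_chars])
        # Stop once the current window already reaches the end of the text.
        if start + max_chars >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", max_chars=4, overlap=1):
        print(piece)
```

The overlap keeps a little shared context between neighboring chunks, which tends to help when a sentence straddles a chunk boundary.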

Deployment Methods

Docker Compose Deployment

The fastest way to deploy a complete stack uses the Docker Compose configuration defined in docker/docker-compose-network.yml.

  1. Clone the repository and configure environment variables:
git clone https://github.com/apache/incubator-hugegraph-ai.git
cd incubator-hugegraph-ai
cp docker/env.template docker/.env

# Edit docker/.env to set PROJECT_PATH to your local checkout path

  2. Start the services:
cd docker
docker compose -f docker-compose-network.yml up -d

This command launches two containers:

  • hugegraph-server (image hugegraph/hugegraph) exposing port 8080 for the REST API and Hubble UI
  • hugegraph-llm (built from docker/Dockerfile.llm) exposing port 8001 for the Gradio-based KG construction interface

  3. Verify the deployment:
docker compose -f docker-compose-network.yml ps

  4. Access the services:
  • HugeGraph Server: http://localhost:8080
  • RAG/KG Service: http://localhost:8001
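The first step of this list edits docker/.env by hand. In a scripted setup, the same change could be made with a small helper like the sketch below. set_env_value is a hypothetical function written for this guide, and it assumes the file uses plain KEY=VALUE lines; only the PROJECT_PATH key comes from the steps above.

```python
def set_env_value(env_text: str, key: str, value: str) -> str:
    """Return env_text with KEY=VALUE set, replacing an existing entry or appending one.

    Hypothetical helper; assumes simple KEY=VALUE lines without quoting rules.
    """
    new_line = f"{key}={value}"
    lines = env_text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        # Compare only the part before the first "=" so values are ignored.
        if line.split("=", 1)[0].strip() == key:
            lines[i] = new_line
            replaced = True
    if not replaced:
        lines.append(new_line)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # In a real script you would read docker/.env, transform it, and write it back.
    print(set_env_value("PROJECT_PATH=\n", "PROJECT_PATH", "/srv/incubator-hugegraph-ai"), end="")
```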

In the Gradio UI, click Import into Vector, Extract Graph Data, Load into GraphDB, and Update Vid Embedding, in that order, to run the three-stage pipeline and persist the result to the HugeGraph server.
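Containers can take a while to accept connections after docker compose reports them as started. The sketch below shows one way to wait for both ports before opening the UI or running automation. wait_for_port is a hypothetical helper, and it only checks that a TCP connection succeeds, not that the service behind it is healthy.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False after timeout seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


if __name__ == "__main__":
    # Ports taken from the deployment steps above.
    for name, port in (("HugeGraph Server", 8080), ("RAG/KG Service", 8001)):
        state = "up" if wait_for_port("localhost", port, timeout=5.0) else "not reachable"
        print(f"{name} on port {port}: {state}")
```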

Build and Run from Source

For development or customization requiring code modifications:

  1. Start the graph server:
docker run -itd --name=server -p 8080:8080 hugegraph/hugegraph

  2. Install Python dependencies using uv:
git clone https://github.com/apache/incubator-hugegraph-ai.git
cd incubator-hugegraph-ai
curl -LsSf https://astral.sh/uv/install.sh | sh
uv sync --extra llm

The uv sync --extra llm command installs LLM-related packages, including litellm, gradio, and vector database clients. For Milvus or Qdrant support, run uv sync --extra vectordb.

  3. Launch the demo application:
source .venv/bin/activate
python -m hugegraph_llm.demo.rag_demo.app

The Gradio interface starts on 127.0.0.1:8001, allowing you to construct the knowledge graph through the web UI.

Programmatic Deployment for CI/CD

Automate knowledge graph construction without manual UI interaction using the SchedulerSingleton API from hugegraph_llm/flows/scheduler.py and flow definitions from hugegraph_llm/flows/__init__.py:

from hugegraph_llm.flows.scheduler import SchedulerSingleton
from hugegraph_llm.flows import FlowName

scheduler = SchedulerSingleton.get_instance()

# Stage 1: Index document chunks
scheduler.schedule_flow(FlowName.IMPORT_CHUNKS, docs=["document.txt"])

# Stage 2: Extract graph structure
schema = {
    "vertex_labels": [{"name": "Person", "properties": ["name", "birth"]}],
    "edge_labels": [{"name": "knows", "source_label": "Person", "target_label": "Person"}]
}
scheduler.schedule_flow(
    FlowName.GRAPH_EXTRACT,
    schema=schema,
    texts=["...raw text content..."],
    example_prompt="Extract entities and relationships.",
    graph_type="property_graph"
)

# Stage 3: Persist to HugeGraph and index vertices
scheduler.schedule_flow(FlowName.IMPORT_GRAPH_DATA, data="auto", schema=schema)
scheduler.schedule_flow(FlowName.UPDATE_VID_EMBEDDINGS)
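Because the same schema dictionary drives both extraction and import, it can be worth checking it before any LLM calls are made. The following sketch validates the dictionary shape used in the example above (vertex_labels, edge_labels, source_label, target_label). validate_schema and its rules are assumptions made for illustration, not checks performed by HugeGraph AI itself.

```python
from typing import Any, Dict, List


def validate_schema(schema: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means the schema looks consistent."""
    errors: List[str] = []
    vertex_names = set()

    for vertex in schema.get("vertex_labels", []):
        name = vertex.get("name")
        if not name:
            errors.append("vertex label without a name")
        elif name in vertex_names:
            errors.append(f"duplicate vertex label: {name}")
        else:
            vertex_names.add(name)
    if not vertex_names:
        errors.append("schema defines no vertex labels")

    # Every edge endpoint must refer to a vertex label defined above.
    for edge in schema.get("edge_labels", []):
        edge_name = edge.get("name", "<unnamed>")
        for key in ("source_label", "target_label"):
            label = edge.get(key)
            if label not in vertex_names:
                errors.append(f"edge '{edge_name}': {key} '{label}' is not a defined vertex label")
    return errors


if __name__ == "__main__":
    schema = {
        "vertex_labels": [{"name": "Person", "properties": ["name", "birth"]}],
        "edge_labels": [{"name": "knows", "source_label": "Person", "target_label": "Person"}],
    }
    print(validate_schema(schema) or "schema OK")
```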

Step-by-Step Deployment Examples

Complete Docker-Compose Workflow

After executing docker compose -f docker-compose-network.yml up -d:

  1. Navigate to http://localhost:8001 in your browser
  2. Paste source documents into the text area and click Import into Vector (executes the import_chunks flow via vector_index_utils.py)
  3. Define a JSON schema specifying vertex and edge labels, then click Extract Graph Data (triggers graph_extract flow via extract_graph)
  4. Click Load into GraphDB to persist the property graph to the HugeGraph server at localhost:8080
  5. Click Update Vid Embedding to populate the graph-vid vector index using update_vid_embedding in graph_index_utils.py

Python CI/CD Deployment Script

For automated deployments in production pipelines:


# deploy_kg.py
from hugegraph_llm.flows.scheduler import SchedulerSingleton
from hugegraph_llm.flows import FlowName

scheduler = SchedulerSingleton.get_instance()

# Load source documents
with open("data/corpus.txt", "r", encoding="utf-8") as f:
    raw_text = f.read()

# Define the schema once so the extraction and import stages share it
schema = {
    "vertex_labels": [{"name": "Article", "properties": ["title"]}],
    "edge_labels": [{"name": "cites", "source_label": "Article", "target_label": "Article"}],
}

# Stage 1: Index document chunks
scheduler.schedule_flow(FlowName.IMPORT_CHUNKS, raw_text)

# Stage 2: Extract graph structure
scheduler.schedule_flow(FlowName.GRAPH_EXTRACT, schema=schema, texts=[raw_text])

# Stage 3: Persist to HugeGraph and index vertices
scheduler.schedule_flow(FlowName.IMPORT_GRAPH_DATA, data="auto", schema=schema)
scheduler.schedule_flow(FlowName.UPDATE_VID_EMBEDDINGS)

print("✅ Knowledge graph deployed and ready for queries")

Execute the script after activating the virtual environment:

source .venv/bin/activate
python deploy_kg.py
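In a CI/CD job, the pipeline should stop at the first failing stage and report a non-zero exit code so the job is marked as failed. The sketch below shows that pattern with a generic stage runner. run_stages is a hypothetical helper, and it assumes each stage signals failure by raising an exception; whether schedule_flow does so should be confirmed against the source.

```python
import sys
from typing import Callable, List, Tuple

Stage = Tuple[str, Callable[[], object]]


def run_stages(stages: List[Stage]) -> int:
    """Run stages in order; return 0 on success or 1 as soon as one stage raises."""
    for name, action in stages:
        print(f"[deploy] running stage: {name}")
        try:
            action()
        except Exception as exc:  # fail fast: later stages depend on earlier ones
            print(f"[deploy] stage '{name}' failed: {exc}", file=sys.stderr)
            return 1
    print("[deploy] all stages completed")
    return 0


if __name__ == "__main__":
    # Placeholder actions; in deploy_kg.py each lambda would wrap a
    # scheduler.schedule_flow(...) call for the corresponding stage.
    demo_stages: List[Stage] = [
        ("import chunks", lambda: None),
        ("extract graph", lambda: None),
        ("import graph data", lambda: None),
        ("update vid embeddings", lambda: None),
    ]
    sys.exit(run_stages(demo_stages))
```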

Summary

  • Three-stage architecture: Document chunking via vector_index_utils.py, graph extraction via graph_index_utils.py, and vertex vectorization via update_vid_embedding.
  • Docker-Compose deployment: Use docker/docker-compose-network.yml to launch the complete stack with HugeGraph server on port 8080 and the RAG service on port 8001.
  • Source-based deployment: Install with uv sync --extra llm and run python -m hugegraph_llm.demo.rag_demo.app for development environments.
  • Programmatic automation: Use SchedulerSingleton.schedule_flow() with FlowName enums to build and deploy knowledge graphs in CI/CD pipelines.
  • Core utilities: The hugegraph_llm/utils/graph_index_utils.py module handles graph data import and embedding updates, while hugegraph_llm/utils/vector_index_utils.py manages document processing.

Frequently Asked Questions

What are the system requirements for deploying HugeGraph AI?

For Docker-Compose deployments, allocate at least 4GB RAM for the HugeGraph server container and 2GB for the LLM service. The server requires port 8080 for the REST API and Hubble UI, while the RAG interface uses port 8001. Production deployments with large vector indexes require additional disk space for Milvus or Qdrant data volumes configured in docker/env.template.

Can I deploy HugeGraph AI without Docker?

Yes. The Python components run without Docker: install them from the apache/incubator-hugegraph-ai repository with uv sync --extra llm, activate the virtual environment, and launch the web interface with python -m hugegraph_llm.demo.rag_demo.app to reach the Gradio UI on localhost:8001. The HugeGraph server must be running separately, either from a native installation or, if you only want to avoid Docker Compose, from the standalone hugegraph/hugegraph image.

How do I update an existing knowledge graph with new data?

Use the Scheduler API to incrementally process additional documents. Call scheduler.schedule_flow(FlowName.IMPORT_CHUNKS, new_documents) followed by FlowName.GRAPH_EXTRACT with your existing schema definition. Then execute FlowName.IMPORT_GRAPH_DATA to merge new vertices and edges into the existing graph, and FlowName.UPDATE_VID_EMBEDDINGS to refresh the vector indexes for graph-RAG retrieval.
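To avoid re-running extraction on documents that have not changed, an incremental job can track a content hash per document and pass only new or modified texts to the flows. The sketch below shows that bookkeeping. The function names and the in-memory manifest are assumptions for illustration and are not a HugeGraph AI feature; a real job would persist the manifest between runs.

```python
import hashlib
from typing import Dict


def content_digest(text: str) -> str:
    """Return a stable SHA-256 fingerprint of a document's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def select_new_or_changed(docs: Dict[str, str], manifest: Dict[str, str]) -> Dict[str, str]:
    """Return the documents whose digest differs from the one recorded in the manifest."""
    return {
        name: text
        for name, text in docs.items()
        if manifest.get(name) != content_digest(text)
    }


def mark_processed(docs: Dict[str, str], manifest: Dict[str, str]) -> None:
    """Record the current digest of each document after it has been processed."""
    for name, text in docs.items():
        manifest[name] = content_digest(text)


if __name__ == "__main__":
    manifest: Dict[str, str] = {}
    docs = {"a.txt": "first document", "b.txt": "second document"}
    pending = select_new_or_changed(docs, manifest)
    print(sorted(pending))  # both documents are new on the first run
    mark_processed(pending, manifest)
    print(sorted(select_new_or_changed(docs, manifest)))  # nothing left to process
```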

Where is the knowledge graph data physically stored?

The property graph persists in the HugeGraph server (port 8080), which uses RocksDB or another configurable backend. Vector embeddings for document chunks and graph vertices are stored separately in your configured vector database: Milvus, Qdrant, or a built-in store. Chunk vectors are managed through vector_index_utils.py, and vertex vectors are indexed by the update_vid_embedding function in graph_index_utils.py.
