Requirements to Run the Apache HugeGraph AI Project: Complete Setup Guide

To run the Apache HugeGraph AI project, you need Python 3.10+, UV package manager 0.7+, HugeGraph Server 1.3+, and optionally Docker 20.10+ for containerized deployment.

The Apache HugeGraph AI repository is a multi-module workspace combining graph database capabilities, large-language-model (LLM) tooling, and graph machine learning utilities. Knowing the requirements up front lets you install only the dependencies your use case needs, whether you are building RAG applications or training graph neural networks.

Core System Requirements

Before installing any Python packages, verify your environment meets these baseline specifications.

Python Runtime

The project requires Python 3.10 or higher. This is the minimum version supported across all sub-modules, including hugegraph-llm and hugegraph-python-client. While some modules may run on older versions, the workspace-wide configuration in pyproject.toml mandates 3.10+ for compatibility with modern async features and type hinting used throughout the codebase.
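A quick pre-flight check can confirm that the interpreter meets the 3.10 floor before any dependencies are installed. The following is a minimal sketch; the helper name python_ok is hypothetical and not part of the project.

```python
import sys

# Minimum version stated in this guide; adjust if pyproject.toml says otherwise.
MIN_PYTHON = (3, 10)


def python_ok(version=None, minimum=MIN_PYTHON):
    """Return True if the (major, minor) part of `version` is at least `minimum`."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if python_ok() else "too old"
    print(f"Python {current}: {status}")
```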

Package Manager

You must install UV 0.7 or higher, the Rust-based Python package manager. The project uses UV's workspace functionality to resolve dependencies across multiple local packages. UV installs the workspace members declared in the [tool.uv.workspace] section of pyproject.toml in editable mode, so changes in hugegraph-llm or hugegraph-python-client take effect immediately without reinstallation.
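To confirm the installed UV is recent enough, the output of uv --version can be parsed and compared against the 0.7 minimum. This is a sketch under the assumption that the output starts with "uv" followed by a dotted version number; the helper names and the sample strings are illustrative.

```python
import re
import shutil
import subprocess

MIN_UV = (0, 7)  # minimum stated in this guide


def parse_uv_version(output):
    """Extract (major, minor, patch) from text such as 'uv 0.7.3 (abc123 2025-05-01)'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def uv_ok(output, minimum=MIN_UV):
    """Return True if the parsed version is at least `minimum`."""
    version = parse_uv_version(output)
    return version is not None and version[:2] >= tuple(minimum)


if __name__ == "__main__":
    if shutil.which("uv") is None:
        print("uv not found on PATH")
    else:
        out = subprocess.run(
            ["uv", "--version"], capture_output=True, text=True
        ).stdout
        print(out.strip(), "->", "OK" if uv_ok(out) else "upgrade needed")
```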

Graph Database Backend

A running HugeGraph Server instance (version 1.3+, 1.5+ recommended) is mandatory. The Python client modules communicate with this server via REST API for graph storage and Gremlin query execution. You can deploy this via Docker using the provided docker-compose-network.yml or install it natively on your host system.
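Before starting the Python services, it can help to confirm that the server answers over REST. The sketch below assumes the server exposes a /versions endpoint and listens on 127.0.0.1:8080 without authentication; both are assumptions to adapt to your deployment, and the helper names are hypothetical.

```python
import urllib.error
import urllib.request


def versions_url(base_url):
    """Build the URL of the version endpoint (assumed here to be /versions)."""
    return base_url.rstrip("/") + "/versions"


def server_is_up(base_url, timeout=2.0):
    """Return True if the server answers the version endpoint with HTTP 200."""
    try:
        with urllib.request.urlopen(versions_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error status, or malformed URL.
        return False


if __name__ == "__main__":
    # Assumed address of a locally running server.
    url = "http://127.0.0.1:8080"
    print(url, "reachable" if server_is_up(url) else "not reachable")
```

A server that requires authentication would answer with an error status, which this sketch reports as not reachable.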

Workspace Structure and Dependencies

The repository organizes code into a UV workspace with mandatory and optional members.

Mandatory Modules

According to the pyproject.toml at the repository root, the workspace includes two required members:

  • hugegraph-llm – Core LLM integration and RAG flows
  • hugegraph-python-client – Low-level client for HugeGraph Server communication

These are defined under [tool.uv.workspace] and are always installed when you run uv sync.

Optional Modules

Additional functionality is available through editable path dependencies:

  • hugegraph-ml – Graph machine learning utilities
  • vermeer-python-client – Integration with the Vermeer compute engine

These are not included in the base install but can be added via specific extras flags.

Optional Dependencies and Feature Extras

The pyproject.toml defines four optional dependency groups under [project.optional-dependencies]. Install only what you need to keep your environment lightweight.

LLM Support (--extra llm)

Installs providers and tools for large language model integration:

  • openai, ollama, litellm – LLM API clients
  • tiktoken – Token counting for OpenAI models
  • gradio – Web interface for demos

Vector Database Support (--extra vectordb)

Required for vector-based RAG retrieval:

  • faiss-cpu – Facebook AI Similarity Search
  • pyarrow, openpyxl – Data handling utilities

Machine Learning Support (--extra ml)

Installs deep learning frameworks for graph neural networks:

  • torch, dgl (Deep Graph Library)
  • ogb (Open Graph Benchmark)
  • catboost – Gradient boosting framework

Vermeer Compute Engine (--extra vermeer)

Adds the vermeer-python-client package for distributed graph computing capabilities.

To install all extras simultaneously, use uv sync --all-extras.
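When setup is scripted, the uv sync invocation can be assembled from the list of extras you need. The following is a small sketch; uv_sync_args is a hypothetical helper, and it uses only the --extra and --all-extras flags described above.

```python
def uv_sync_args(extras=(), all_extras=False):
    """Build the argument list for `uv sync` with the requested extras."""
    args = ["uv", "sync"]
    if all_extras:
        # --all-extras already covers every group, so individual names are skipped.
        args.append("--all-extras")
        return args
    for name in extras:
        args += ["--extra", name]
    return args


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(uv_sync_args(["llm", "vectordb"])))
```

Passing the returned list to subprocess.run(..., check=True) from the repository root would perform the actual install.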

Environment Configuration

The project requires a .env file for runtime configuration, particularly for Docker deployments.

Required Environment Variables

Copy the template from docker/env.template to docker/.env and set:

  • PROJECT_PATH – Absolute path to your cloned repository root
  • LANGUAGE – Interface language selection (e.g., en or zh)
  • Vector database and LLM provider credentials

The application watches this file and reloads prompts automatically when changes are detected.
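A missing or empty variable is easier to catch before launch than after. The sketch below parses simple KEY=VALUE lines and reports which of the variables named above are unset; the parser is deliberately minimal and is not the loader the project itself uses, and the sample path is made up.

```python
REQUIRED = ("PROJECT_PATH", "LANGUAGE")


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # A later duplicate key overrides an earlier one in this sketch.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text, required=REQUIRED):
    """Return the required keys that are absent or empty."""
    values = parse_env(text)
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    sample = "# demo\nPROJECT_PATH=/home/me/incubator-hugegraph-ai\nLANGUAGE=\n"
    print("missing:", missing_keys(sample))
```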

Step-by-Step Installation

Follow these commands to set up the project from source:

  1. Clone the repository

    git clone https://github.com/apache/incubator-hugegraph-ai.git
    cd incubator-hugegraph-ai

  2. Install UV (if not present)

    curl -LsSf https://astral.sh/uv/install.sh | sh

  3. Configure environment

    cp docker/env.template docker/.env
    echo "PROJECT_PATH=$(pwd)" >> docker/.env

  4. Install dependencies (replace --extra llm with --all-extras if needed)

    uv sync --extra llm

  5. Activate virtual environment

    source .venv/bin/activate
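Once the environment is active, a short import check can confirm that the sync installed what you expect. This is a sketch: missing_modules is a hypothetical helper, and hugegraph_llm is the module name used by the demo command in this guide.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that cannot be located for import."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("missing:", missing_modules(["hugegraph_llm"]) or "none")
```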

Running the Application

You have two primary methods to launch the services.

Docker Deployment

For the quickest start, use Docker Compose to spin up both HugeGraph Server and the RAG service:

cd docker
docker compose -f docker-compose-network.yml up -d

This creates a shared network allowing the Python services to communicate with the graph database at http://hugegraph-server:8080.

Source Execution

After activating your virtual environment, launch the web demo directly:

python -m hugegraph_llm.demo.rag_demo.app

Access the interface at http://127.0.0.1:8001.
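The demo can take a moment to start, so scripts that depend on it may want to wait until the port accepts connections. The following is a minimal sketch using only the standard library; the helper names are hypothetical, and the host and port are the ones quoted above.

```python
import socket
import time


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_port(host, port, attempts=30, delay=1.0):
    """Poll until the port accepts connections or the attempts run out."""
    for _ in range(attempts):
        if port_open(host, port):
            return True
        time.sleep(delay)
    return False


if __name__ == "__main__":
    print("demo reachable:", port_open("127.0.0.1", 8001))
```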

Programmatic Usage Examples

Once installed, interact with the system programmatically using the SchedulerSingleton class from hugegraph_llm.flows.scheduler.

Execute a RAG Query

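from hugegraph_llm.flows.scheduler import SchedulerSingleton

# Reuse the shared scheduler instance rather than constructing a new one.
scheduler = SchedulerSingleton.get_instance()

# Run the graph-only RAG flow; the keyword flags select which answer type is produced.
result = scheduler.schedule_flow(
    "rag_graph_only",
    query="What movies starred Al Pacino?",
    graph_only_answer=True,
    vector_only_answer=False,
)

# The result is read like a mapping keyed by answer type.
print("Graph answer:", result.get("graph_only_answer"))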

Build a Vector Index

Required before running vector-only RAG:

from hugegraph_llm.flows.scheduler import SchedulerSingleton

# Example entries, each pairing an identifier with a Gremlin query.
examples = [
    {"id": "question1", "gremlin": "g.V().hasLabel('person').valueMap()"},
    {"id": "question2", "gremlin": "g.V().has('name','Alice').out('knows')"},
]

# Hand the examples to the index-building flow via the shared scheduler.
idx_res = SchedulerSingleton.get_instance().schedule_flow(
    "build_examples_index", examples
)
print("Index build result:", idx_res)

Run Graph ML Training

With the ml extra installed:

uv sync --extra ml
source .venv/bin/activate
python hugegraph-ml/src/examples/node_classify.py

Summary

  • Python 3.10+, UV 0.7+, and HugeGraph Server 1.3+ are mandatory core requirements.
  • Install feature-specific dependencies using uv sync --extra llm, --extra ml, or --extra vectordb to avoid bloat.
  • Copy docker/env.template to docker/.env and set PROJECT_PATH before running.
  • Use Docker Compose (docker-compose-network.yml) for integrated server/client deployment, or run the RAG demo directly via python -m hugegraph_llm.demo.rag_demo.app.
  • Programmatic access flows through SchedulerSingleton in hugegraph_llm/flows/scheduler.py.

Frequently Asked Questions

What is the minimum Python version required for Apache HugeGraph AI?

The project requires Python 3.10 or higher. This requirement is enforced in the root pyproject.toml to ensure compatibility with type hints and async features used across the hugegraph-llm and hugegraph-python-client modules.

Is Docker mandatory for running the project?

No, Docker is optional but recommended for quick starts. You can run the HugeGraph Server natively and execute Python components directly using uv run or an activated virtual environment. However, the docker-compose-network.yml file provides the simplest way to ensure network connectivity between the graph database and AI services.

What is the difference between uv sync --extra llm and uv sync --all-extras?

The --extra llm flag installs only the dependencies needed for large language model integration (OpenAI, Ollama, Gradio, etc.), keeping the installation lightweight. The --all-extras flag installs every optional dependency group, including machine learning frameworks (PyTorch, DGL) and Vermeer client libraries, which is necessary only if you plan to use all features simultaneously.

How do I configure the project path for Docker deployments?

You must set the PROJECT_PATH variable in docker/.env. From the repository root, copy the template file using cp docker/env.template docker/.env, then append your absolute repository path with echo "PROJECT_PATH=$(pwd)" >> docker/.env. Because $(pwd) expands to the current directory, running the command from anywhere else records the wrong path. A correct value allows the Docker containers to mount your local source code.
