
How to Use Docling for OCR and Document Parsing in OpenRAG

March 13, 2026 langflow-ai/openrag

OpenRAG delegates OCR and document parsing to docling-serve, a local HTTP service managed by the DoclingManager and consumed via the DoclingClient utility.

OpenRAG does not bundle its own OCR engine. Instead, it integrates with Docling through a lightweight service architecture that handles document conversion and text extraction. This guide explains how to configure, start, and use Docling for OCR and document parsing in OpenRAG based on the actual implementation in the langflow-ai/openrag repository.

Architecture Overview

The integration relies on three core components that communicate with a running docling-serve instance:

DoclingClient (src/utils/docling_client.py) – Async HTTP wrapper that sends files to the /v1/convert/file endpoint and returns parsed JSON.
DoclingManager (src/tui/managers/docling_manager.py) – Singleton process manager that starts, monitors, and persists the docling-serve subprocess across TUI sessions.
Docling API Proxy (src/api/docling.py) – FastAPI health check endpoint that forwards requests to the running service.

Starting and Managing the Docling Service

Using the CLI Controller

OpenRAG provides a command-line interface for managing the Docling service lifecycle through scripts/docling_ctl.py:


# Start docling-serve on default port 5001
python -m scripts.docling_ctl start

# Start with multiple workers and UI enabled
python -m scripts.docling_ctl start --workers 2 --enable-ui

# Check service status
python -m scripts.docling_ctl status

# Stop the service
python -m scripts.docling_ctl stop

Process Persistence and Lifecycle Management

The DoclingManager class handles process persistence by writing the PID to ~/.openrag/tui/.docling.pid. When starting, it checks for an existing PID and reattaches to running processes rather than spawning duplicates. This allows the OCR service to survive across multiple TUI sessions without reinstalling heavy dependencies.
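The reattach-or-spawn pattern described above can be sketched as follows. This is a minimal illustration under stated assumptions, not DoclingManager's actual code: the helper names `pid_is_running`, `read_live_pid`, and `record_pid` are hypothetical, the PID file path is passed in by the caller, and the liveness check relies on POSIX signal semantics.

```python
import os
from pathlib import Path
from typing import Optional


def pid_is_running(pid: int) -> bool:
    """Return True if a process with this PID exists (POSIX only)."""
    if pid <= 0:
        return False  # 0 and negatives address process groups, not one process
    try:
        os.kill(pid, 0)  # signal 0 checks existence without delivering a signal
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def read_live_pid(pid_file: Path) -> Optional[int]:
    """Return the recorded PID if that process is alive, otherwise None."""
    try:
        pid = int(pid_file.read_text().strip())
    except (FileNotFoundError, ValueError):
        return None  # no PID file, or its content is not a number
    if pid_is_running(pid):
        return pid
    pid_file.unlink(missing_ok=True)  # stale file left by a dead process
    return None


def record_pid(pid_file: Path, pid: int) -> None:
    """Persist the PID so a later session can reattach instead of respawning."""
    pid_file.parent.mkdir(parents=True, exist_ok=True)
    pid_file.write_text(str(pid))
```

A caller would invoke `read_live_pid()` first and only spawn a new subprocess, followed by `record_pid()`, when it returns None.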

For CI pipelines, use the warmup script to block until healthy:

python warm_up_docling.py

This script polls the health endpoint until docling-serve responds or the timeout (controlled by DOCLING_WARMUP_TIMEOUT) expires.
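The polling behavior can be approximated with a small loop like the one below. It is a sketch rather than the contents of warm_up_docling.py: `wait_until_healthy` and `http_probe` are hypothetical names, and the health URL is supplied by the caller because the exact path is not stated in this guide.

```python
import time
import urllib.request
from typing import Callable


def http_probe(url: str) -> bool:
    """Return True if the URL answers with HTTP 200 (raises OSError on failure)."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.status == 200


def wait_until_healthy(
    probe: Callable[[], bool],
    timeout: float = 120.0,   # mirrors the DOCLING_WARMUP_TIMEOUT default
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Call probe() until it returns True or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        try:
            if probe():
                return True
        except OSError:
            pass  # connection refused or HTTP error: service not ready yet
        if clock() >= deadline:
            return False
        sleep(interval)
```

In a CI job this could be called as `wait_until_healthy(lambda: http_probe(health_url))`, exiting non-zero when it returns False.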

Converting Documents with DoclingClient

Converting Local Files

The convert_file() function in src/utils/docling_client.py handles async HTTP POST requests to the Docling service:

import httpx
from utils.docling_client import convert_file, DoclingServeError

async def parse_pdf(file_path: str):
    # The AsyncClient is closed automatically when the block exits.
    async with httpx.AsyncClient() as client:
        try:
            # result contains the parsed DoclingDocument JSON
            result = await convert_file(file_path, httpx_client=client)
            return result
        except DoclingServeError as e:
            # Raised when the Docling service cannot complete the conversion
            print(f"Conversion failed: {e}")
            return None

Processing In-Memory Bytes

For streams or uploaded files already in memory, use convert_bytes():

import httpx
from utils.docling_client import convert_bytes

async def parse_bytes(content: bytes, filename: str):
    async with httpx.AsyncClient() as client:
        # filename is passed alongside the raw bytes
        document = await convert_bytes(
            content,
            filename,
            httpx_client=client
        )
        return document

Both functions post to {DOCLING_SERVE_URL}/v1/convert/file and return the parsed JSON produced by docling-serve.
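To make the request shape concrete, here is a rough sketch of how such an upload could be assembled. It is not the code in docling_client.py: `build_convert_request` is a hypothetical helper, the multipart field name `files` is an assumption, and guessing the content type from the filename is only one plausible approach.

```python
import mimetypes
from typing import Dict, Tuple

CONVERT_PATH = "/v1/convert/file"  # endpoint path named in this guide

# (filename, raw bytes, content type), the tuple form accepted by httpx `files=`
FilePart = Tuple[str, bytes, str]


def build_convert_request(
    base_url: str, filename: str, content: bytes
) -> Tuple[str, Dict[str, FilePart]]:
    """Return the target URL and a multipart mapping for one document."""
    url = base_url.rstrip("/") + CONVERT_PATH
    mime, _encoding = mimetypes.guess_type(filename)
    # Fall back to a generic binary type when the extension is unknown.
    part: FilePart = (filename, content, mime or "application/octet-stream")
    return url, {"files": part}  # "files" field name is an assumption
```

With httpx, the result could then be sent as `await client.post(url, files=files)`.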

Health Monitoring and API Proxy

The OpenRAG frontend checks service availability through a proxy endpoint rather than contacting docling-serve directly:

// Frontend health check
fetch("/api/docling/health")
  .then(r => r.json())
  .then(data => console.log("Service status:", data));

The proxy in src/api/docling.py forwards this request to the underlying service. If the service is unreachable or the request times out, the proxy returns HTTP 503 with {"status": "unhealthy"} instead of propagating the error.
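The mapping from upstream failure to a 503 response can be sketched without any web framework. This is an illustration only, not the handler in src/api/docling.py: `proxy_health` is a hypothetical name, and the `error` and `upstream` keys, along with the shape of the healthy body, are assumptions.

```python
from typing import Any, Callable, Dict, Tuple


def proxy_health(
    fetch_upstream: Callable[[], Dict[str, Any]],
) -> Tuple[int, Dict[str, Any]]:
    """Return (status_code, json_body) for a health-check request."""
    try:
        upstream = fetch_upstream()
    except OSError as exc:
        # OSError covers refused connections and TimeoutError alike.
        return 503, {"status": "unhealthy", "error": str(exc)}
    return 200, {"status": "healthy", "upstream": upstream}
```

Keeping the upstream call behind a callable makes the failure path easy to exercise in tests without a running docling-serve instance.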

Configuration and Environment Variables

Control the Docling integration through these environment variables:

| Variable | Description | Default |
| --- | --- | --- |
| `DOCLING_SERVE_URL` | Base URL for an existing docling-serve instance | Auto-detected |
| `DOCLING_OCR_ENGINE` | OCR engine selection (`tesseract`, `easyocr`, etc.) | None (OCR disabled) |
| `DOCLING_WORKERS` | Concurrent worker processes | `1` |
| `DOCLING_BIND_HOST` | Network interface binding | `0.0.0.0` |
| `DOCLING_WARMUP_TIMEOUT` | Health check wait duration (seconds) | `120` |

Setting DOCLING_SERVE_URL bypasses the local process management and connects to an external service, useful for containerized deployments.
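One way to picture how these variables fit together is a small settings loader. The sketch below is hypothetical (`DoclingSettings` and `load_settings` do not come from the repository); it only applies the variable names and defaults listed above, and treats an empty value as unset.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class DoclingSettings:
    serve_url: Optional[str]   # None means the URL is auto-detected
    ocr_engine: Optional[str]  # None means OCR is disabled
    workers: int
    bind_host: str
    warmup_timeout: int        # seconds

    @property
    def manages_local_process(self) -> bool:
        """True when no external URL is set and a local service is needed."""
        return self.serve_url is None


def load_settings(env: Mapping[str, str] = os.environ) -> DoclingSettings:
    """Read the Docling variables from env, applying the documented defaults."""
    return DoclingSettings(
        serve_url=env.get("DOCLING_SERVE_URL") or None,
        ocr_engine=env.get("DOCLING_OCR_ENGINE") or None,
        workers=int(env.get("DOCLING_WORKERS", "1")),
        bind_host=env.get("DOCLING_BIND_HOST", "0.0.0.0"),
        warmup_timeout=int(env.get("DOCLING_WARMUP_TIMEOUT", "120")),
    )
```

Passing a plain dict instead of `os.environ` keeps the loader easy to test.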

Summary

OpenRAG uses docling-serve as an external HTTP service rather than embedding OCR directly.
The DoclingManager (src/tui/managers/docling_manager.py) handles process lifecycle and PID persistence across sessions.
The DoclingClient (src/utils/docling_client.py) provides convert_file() and convert_bytes() for async document conversion.
Environment variables like DOCLING_SERVE_URL and DOCLING_OCR_ENGINE control service location and OCR behavior.
Health monitoring flows through the API proxy (src/api/docling.py) to provide frontend visibility into service status.

Frequently Asked Questions

How do I enable OCR when starting the Docling service?

Set the DOCLING_OCR_ENGINE environment variable to your preferred engine before starting the service. For example, export DOCLING_OCR_ENGINE=tesseract enables Tesseract OCR. If this variable is unset, docling-serve runs without OCR capabilities and extracts only embedded text.

Can I use an existing Docling server instead of letting OpenRAG manage the process?

Yes. Set the DOCLING_SERVE_URL environment variable to the base URL of your running instance (e.g., http://docling-service:5001). When this variable is present, OpenRAG skips the auto-start logic in DoclingManager and connects directly to the specified endpoint for all conversion requests.

What happens if the Docling service crashes during a conversion?

The DoclingClient raises a DoclingServeError exception when it encounters connection failures, timeouts, or non-200 HTTP responses. Your application code should catch this exception and handle retries or fallback logic. Before retrying a conversion, you can run the status command (python -m scripts.docling_ctl status) to verify that the process is still running.
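A retry wrapper along these lines is one possible way to handle such failures. It is a sketch, not part of OpenRAG: `convert_with_retry` is a hypothetical helper, and the `DoclingServeError` class defined here is a local stand-in so the example runs on its own. In real code you would import the exception from utils.docling_client instead.

```python
import asyncio
from typing import Awaitable, Callable, Tuple, Type, TypeVar

T = TypeVar("T")


class DoclingServeError(Exception):
    """Local stand-in for the exception exported by utils.docling_client."""


async def convert_with_retry(
    convert: Callable[[], Awaitable[T]],
    attempts: int = 3,
    base_delay: float = 0.5,
    retry_on: Tuple[Type[BaseException], ...] = (DoclingServeError,),
) -> T:
    """Await convert(), retrying with exponential backoff on retry_on errors."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return await convert()
        except retry_on:
            if attempt == attempts:
                raise  # out of retries: surface the last error to the caller
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

The conversion call is wrapped in a zero-argument callable, for example `lambda: convert_file(path, httpx_client=client)`, so each attempt issues a fresh request.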

How do I process documents already loaded in memory rather than files on disk?

Use the convert_bytes() function from src/utils/docling_client.py instead of convert_file(). Pass the byte content and a filename string (used for content-type detection) along with your httpx.AsyncClient instance. This approach works for uploaded files, generated reports, or any binary stream without writing to the filesystem.
