
How to Set Up the OpenRAG Backend with FastAPI: Complete Configuration Guide

March 13, 2026 langflow-ai/openrag

You can set up the OpenRAG backend by exporting the environment variables that src/config/settings.py reads and then running uvicorn src.main:create_app --factory. This initializes the FastAPI application with lazy-loaded clients, dynamic service wiring, and automated OpenSearch index creation.

The OpenRAG backend is a production-ready FastAPI application that orchestrates document ingestion, semantic search, and chat integrations. This guide walks you through the exact configuration steps and code structure found in the langflow-ai/openrag repository to deploy your own RAG server.

Prerequisites and Environment Configuration

Before starting the server, you must define the runtime environment. In src/config/settings.py (lines 22‑59), the application loads critical variables that control database connections, authentication, and ingestion behavior.

Export the required variables in your shell:

export OPENSEARCH_HOST=localhost
export OPENSEARCH_PORT=9200
export OPENSEARCH_USERNAME=admin
export OPENSEARCH_PASSWORD=admin
export LANGFLOW_URL=http://localhost:7860
export DISABLE_INGEST_WITH_LANGFLOW=false
export SESSION_SECRET=$(openssl rand -hex 16)

These settings drive the AppClients class in src/config/settings.py (lines 31‑43), which lazily initializes the OpenSearch client, the Langflow HTTP client, and the patched OpenAI/LiteLLM clients only when each is first accessed. This pattern avoids opening network connections at import time and allows for proper async lifecycle management.
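
The lazy-initialization idea can be illustrated with a minimal sketch. It is not the repository's AppClients code: the LazyClients and FakeSearchClient names are hypothetical stand-ins, and the standard library's functools.cached_property is just one assumed way to defer construction until first access.

```python
from functools import cached_property


class FakeSearchClient:
    """Hypothetical stand-in for a real OpenSearch client."""

    instances_created = 0

    def __init__(self, host: str, port: int) -> None:
        FakeSearchClient.instances_created += 1
        self.host = host
        self.port = port


class LazyClients:
    """Builds each client on first attribute access, then reuses it."""

    def __init__(self, host: str, port: int) -> None:
        self._host = host
        self._port = port

    @cached_property
    def opensearch(self) -> FakeSearchClient:
        # Runs once; the result is stored on the instance afterwards.
        return FakeSearchClient(self._host, self._port)


clients = LazyClients("localhost", 9200)  # no client has been built yet
```

Creating the container is cheap because nothing connects until clients.opensearch is read, and every later read returns the same object.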

FastAPI Application Factory

The entry point for the backend is src/main.py, which exposes a factory function rather than a module-level instance. This design enables proper dependency injection and testability.

Service Initialization and Client Wiring

When create_app() executes, it immediately calls initialize_services() (lines 95‑130 in src/main.py). This function constructs all domain services—DocumentService, SearchService, ChatService, and TaskService—while injecting the ConnectorRouter that determines whether to use Langflow or the traditional OpenRAG ingestion pipeline based on the DISABLE_INGEST_WITH_LANGFLOW flag.

The service layer relies on the global clients object defined alongside AppClients in src/config/settings.py to share persistent connections across requests. This centralized client management ensures that OpenSearch, Langflow, and LLM provider connections are reused throughout the application lifetime.
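
As a rough illustration of this wiring, the sketch below builds two services around one shared client container. The class bodies, the SharedClients name, and the initialize_services signature are assumptions made for the example; the real function constructs more services and takes different arguments.

```python
from dataclasses import dataclass


@dataclass
class SharedClients:
    """Hypothetical stand-in for the shared client container."""
    opensearch_url: str


@dataclass
class SearchService:
    clients: SharedClients


@dataclass
class DocumentService:
    clients: SharedClients
    use_langflow: bool


def initialize_services(clients: SharedClients, disable_langflow_ingest: bool) -> dict:
    """Construct every service once and hand each the same client container."""
    return {
        "search": SearchService(clients),
        "document": DocumentService(clients, use_langflow=not disable_langflow_ingest),
    }


services = initialize_services(SharedClients("http://localhost:9200"), False)
```

Passing the container in, rather than having each service create its own connections, is what lets the connections be reused and makes the services easy to test with a fake container.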

Dynamic Connector Routing

The src/api/router.py file contains the upload_ingest_router endpoint, which implements dynamic request forwarding. When a document upload arrives, the router checks the DISABLE_INGEST_WITH_LANGFLOW environment variable:

If false, it delegates to _langflow_upload_ingest_task for Langflow-based processing
If true, it routes to the classic OpenRAG upload handler (api.upload.upload)

This pattern extends to chat and knowledge-filter endpoints through FastAPI’s Depends mechanism, so you can switch ingestion modes without code changes. Because the flag is an environment variable, a new value takes effect once the server process is restarted with it.
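
The flag check described above might look roughly like the following. The two handler functions are hypothetical placeholders, and the rule for which strings count as true is an assumption rather than the repository's parsing logic.

```python
import os
from typing import Callable, Mapping


def langflow_upload(filename: str) -> str:
    # Hypothetical placeholder for the Langflow-based ingestion task.
    return f"langflow:{filename}"


def classic_upload(filename: str) -> str:
    # Hypothetical placeholder for the classic OpenRAG upload handler.
    return f"classic:{filename}"


def flag_enabled(env: Mapping[str, str], name: str) -> bool:
    """Treat 'true', '1', and 'yes' (any case) as enabled; an assumed rule."""
    return env.get(name, "false").strip().lower() in {"true", "1", "yes"}


def pick_upload_handler(env: Mapping[str, str] = os.environ) -> Callable[[str], str]:
    """Choose the handler at call time based on the flag."""
    if flag_enabled(env, "DISABLE_INGEST_WITH_LANGFLOW"):
        return classic_upload
    return langflow_upload
```

Taking the environment as a parameter keeps the decision testable: a plain dict can be passed in place of os.environ.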

Startup and Shutdown Lifecycle

The create_app() function attaches critical event handlers to the FastAPI instance. During startup (startup_tasks), the application:

Waits for OpenSearch availability via wait_for_opensearch()
Creates the search index using init_index() based on your configured embedding model
Loads persisted connector connections and reapplies Langflow flow settings if a reset is detected
Initializes the background task scheduler for periodic maintenance

During shutdown (shutdown_event), the backend gracefully:

Cancels active webhook subscriptions
Stops the background task cleanup scheduler
Releases all async clients (OpenSearch, Langflow, docling, OpenAI)
Sends a final telemetry event via TelemetryClient.send_event
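
The ordering of these startup and shutdown steps can be sketched with the standard library alone. This example uses an async context manager instead of the startup_tasks and shutdown_event handlers the guide describes, and each step is a placeholder that only records its name.

```python
import asyncio
from contextlib import asynccontextmanager

events: list[str] = []  # records the order in which steps run


async def step(name: str) -> None:
    # Placeholder for real work such as waiting for OpenSearch.
    await asyncio.sleep(0)
    events.append(name)


@asynccontextmanager
async def lifespan():
    # Startup: dependencies first, then the things that need them.
    await step("wait_for_opensearch")
    await step("init_index")
    await step("load_connections")
    await step("start_scheduler")
    try:
        yield
    finally:
        # Shutdown: runs even if the serving phase raised.
        await step("cancel_webhooks")
        await step("stop_scheduler")
        await step("close_clients")
        await step("send_telemetry")


async def main() -> None:
    async with lifespan():
        events.append("serving")


asyncio.run(main())
```

The try/finally around the yield is the important part: cleanup still runs when the serving phase ends with an exception.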

Running the OpenRAG Server

With environment variables configured, launch the server using uvicorn with the factory pattern:

uvicorn src.main:create_app --factory --host 0.0.0.0 --port 8000

The --factory flag instructs uvicorn to treat create_app as a callable that returns an application instance rather than a pre-instantiated module variable. This ensures that all startup tasks execute in the correct async context.
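
To make the difference concrete, here is a simplified illustration of resolving a "module:attribute" target with and without a factory option. It is not uvicorn's actual loader; load_app and the demo_main module are hypothetical names used only for this sketch.

```python
import importlib
import sys
import types


def load_app(target: str, factory: bool = False):
    """Resolve 'module:attribute'; call the attribute when factory is True."""
    module_name, _, attr = target.partition(":")
    obj = getattr(importlib.import_module(module_name), attr)
    return obj() if factory else obj


# Register a throwaway module so the example runs without a real project.
demo = types.ModuleType("demo_main")
demo.create_app = lambda: {"name": "demo-app"}  # hypothetical factory
sys.modules["demo_main"] = demo

app = load_app("demo_main:create_app", factory=True)  # the built application
not_called = load_app("demo_main:create_app")  # still just the function
```

Without the factory option the loader hands back the function itself, which is why a server given a factory target needs to be told to call it.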

Verifying the Installation

Test the deployment using the built-in health endpoints and sample API calls.

Check liveness:

curl http://localhost:8000/health

Expected response:

{"status":"ok"}

Verify OpenSearch connectivity (readiness probe):

curl http://localhost:8000/search/health

Expected response:

{"status":"ready","dependencies":{"opensearch":"up"}}
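
If you prefer to script these checks, a small Python sketch follows. It assumes the response bodies have exactly the shapes shown above; fetch_json, is_live, and is_ready are hypothetical helper names, not part of OpenRAG.

```python
import json
from urllib.request import urlopen


def fetch_json(url: str, timeout: float = 5.0) -> dict:
    """GET a URL and decode the JSON body."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def is_live(payload: dict) -> bool:
    """True when the liveness body reports status ok."""
    return payload.get("status") == "ok"


def is_ready(payload: dict) -> bool:
    """True when the readiness body reports ready and OpenSearch is up."""
    deps = payload.get("dependencies", {})
    return payload.get("status") == "ready" and deps.get("opensearch") == "up"
```

With the server running, is_live(fetch_json("http://localhost:8000/health")) and is_ready(fetch_json("http://localhost:8000/search/health")) should both be true if the endpoints respond as shown.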

Upload a document through the dynamic router:

curl -X POST "http://localhost:8000/router/upload_ingest" \
  -F "file=@/path/to/report.pdf" \
  -F "session_id=mysession" \
  -F "delete_after_ingest=true" \
  -F "replace_duplicates=true"

Test the search API:

curl -X POST http://localhost:8000/v1/search \
  -H "Authorization: Bearer <API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{"query":"What is Retrieval-Augmented Generation?"}'
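
The same search call can be prepared from Python with the standard library. This is a sketch that mirrors the curl command above; build_search_request is a hypothetical helper, and the request is only constructed here, not sent.

```python
import json
from urllib.request import Request


def build_search_request(base_url: str, api_key: str, query: str) -> Request:
    """Build a POST request for /v1/search with a JSON body and bearer token."""
    body = json.dumps({"query": query}).encode("utf-8")
    return Request(
        url=f"{base_url.rstrip('/')}/v1/search",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


request = build_search_request("http://localhost:8000", "<API_KEY>", "What is RAG?")
```

To send it, pass the object to urllib.request.urlopen and decode the response body as JSON.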

Summary

Configuration is environment-driven: All critical settings (OpenSearch hosts, Langflow URLs, OAuth credentials) are read from environment variables by src/config/settings.py at startup.
Use the factory pattern: Run uvicorn src.main:create_app --factory to properly initialize the FastAPI application with full service wiring and lifecycle management.
Clients are lazy-loaded: The AppClients class in src/config/settings.py defers expensive connection setup until first use, improving startup times.
Routing is dynamic: The upload_ingest_router in src/api/router.py switches between Langflow and native ingestion based on the DISABLE_INGEST_WITH_LANGFLOW flag.
Lifecycle management is automatic: Startup tasks handle OpenSearch index creation and connector initialization, while shutdown handlers ensure clean client disconnection.

Frequently Asked Questions

What environment variables are required to start OpenRAG?

At minimum, you must set OPENSEARCH_HOST, OPENSEARCH_PORT, OPENSEARCH_USERNAME, OPENSEARCH_PASSWORD, and SESSION_SECRET. If using Langflow integration, include LANGFLOW_URL and DISABLE_INGEST_WITH_LANGFLOW=false. These are parsed in src/config/settings.py (lines 22‑59) and used to configure the global clients object.
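
A quick pre-flight check can catch a missing variable before the server starts. The sketch below is not part of OpenRAG; missing_variables is a hypothetical helper, and it treats an empty or whitespace-only value as missing, which is an assumption.

```python
import os
from typing import Mapping

REQUIRED = (
    "OPENSEARCH_HOST",
    "OPENSEARCH_PORT",
    "OPENSEARCH_USERNAME",
    "OPENSEARCH_PASSWORD",
    "SESSION_SECRET",
)


def missing_variables(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required names that are unset or empty."""
    return [name for name in REQUIRED if not env.get(name, "").strip()]
```

Calling missing_variables() with no arguments inspects the current process environment and returns an empty list when everything is set.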

How do I switch between Langflow and native document ingestion?

Set the DISABLE_INGEST_WITH_LANGFLOW environment variable. When false (default), the upload_ingest_router in src/api/router.py delegates uploads to the Langflow pipeline. When true, it routes to the native OpenRAG handler. The check occurs at request time, so no code changes are needed; restart the server process with the new value for it to take effect.

How does the backend handle OpenSearch connectivity?

During startup (startup_tasks in src/main.py), the application calls wait_for_opensearch() to verify connectivity before accepting requests. The init_index() function then creates or updates the search index based on your embedding model configuration. If OpenSearch is unavailable, the readiness probe at /search/health returns a non-ready status.
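
A wait-until-available step of this kind is usually a bounded retry loop. The sketch below shows the general technique only: wait_until_ready, its parameters, and fake_probe are hypothetical, and the real wait_for_opensearch() may use different retry counts, delays, and error handling.

```python
import asyncio
from typing import Awaitable, Callable


async def wait_until_ready(
    probe: Callable[[], Awaitable[bool]],
    attempts: int = 5,
    delay: float = 0.01,
) -> bool:
    """Call probe until it returns True or the attempts run out."""
    for attempt in range(attempts):
        try:
            if await probe():
                return True
        except ConnectionError:
            pass  # treat a refused connection like "not ready yet"
        if attempt < attempts - 1:
            await asyncio.sleep(delay)
    return False


state = {"calls": 0}


async def fake_probe() -> bool:
    # Hypothetical probe: fails twice, then reports ready.
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("not up yet")
    return True


ready = asyncio.run(wait_until_ready(fake_probe))
```

Bounding the attempts means an unreachable dependency produces a clear failure instead of a server that hangs during startup.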

Why use the --factory flag with uvicorn?

The --factory flag tells uvicorn that src.main:create_app is a function returning a FastAPI instance, not the instance itself. This is required because create_app() performs initialization work—like calling initialize_services() and attaching startup/shutdown handlers—that must execute within the server's async context rather than at module import time.
