How to Set Up the OpenRAG Backend with FastAPI: Complete Configuration Guide
You can set up the OpenRAG backend by exporting the environment variables that src/config/settings.py reads and then running uvicorn src.main:create_app --factory. This initializes the FastAPI application with lazy-loaded clients, dynamic service wiring, and automated OpenSearch index creation.
The OpenRAG backend is a production-ready FastAPI application that orchestrates document ingestion, semantic search, and chat integrations. This guide walks you through the exact configuration steps and code structure found in the langflow-ai/openrag repository to deploy your own RAG server.
Prerequisites and Environment Configuration
Before starting the server, you must define the runtime environment. In src/config/settings.py (lines 22‑59), the application loads critical variables that control database connections, authentication, and ingestion behavior.
Export the required variables in your shell:
export OPENSEARCH_HOST=localhost
export OPENSEARCH_PORT=9200
export OPENSEARCH_USERNAME=admin
export OPENSEARCH_PASSWORD=admin
export LANGFLOW_URL=http://localhost:7860
export DISABLE_INGEST_WITH_LANGFLOW=false
export SESSION_SECRET=$(openssl rand -hex 16)
These settings drive the AppClients class in src/config/settings.py (lines 31‑43), which lazily initializes the OpenSearch client, Langflow HTTP client, and patched OpenAI/Litellm clients only when first accessed. This pattern prevents unnecessary network connections during import time and allows for proper async lifecycle management.
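The lazy-initialization idea can be sketched with the standard library alone. This is a minimal illustration, not the repository's AppClients code: LazyClients and FakeSearchClient are hypothetical names, and the default values are assumptions chosen to match the exports above.

```python
import os
from functools import cached_property


class FakeSearchClient:
    """Stand-in for a real OpenSearch client (hypothetical, for illustration)."""

    def __init__(self, host: str, port: int) -> None:
        self.host = host
        self.port = port


class LazyClients:
    """Builds each client on first attribute access, not at import time."""

    @cached_property
    def opensearch(self) -> FakeSearchClient:
        # The environment is read here, at first use, so importing this
        # module never opens a connection.
        host = os.environ.get("OPENSEARCH_HOST", "localhost")
        port = int(os.environ.get("OPENSEARCH_PORT", "9200"))
        return FakeSearchClient(host, port)


clients = LazyClients()  # cheap: nothing has been constructed yet
```

The first access to clients.opensearch constructs the client, and later accesses return the same cached object, which is what lets one connection be shared across requests.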
FastAPI Application Factory
The entry point for the backend is src/main.py, which exposes a factory function rather than a module-level instance. This design enables proper dependency injection and testability.
Service Initialization and Client Wiring
When create_app() executes, it immediately calls initialize_services() (lines 95‑130 in src/main.py). This function constructs all domain services—DocumentService, SearchService, ChatService, and TaskService—while injecting the ConnectorRouter that determines whether to use Langflow or the traditional OpenRAG ingestion pipeline based on the DISABLE_INGEST_WITH_LANGFLOW flag.
The service layer relies on the global clients object from src/utils/app_clients.py to share persistent connections across requests. This centralized client management ensures that OpenSearch, Langflow, and LLM provider connections are reused efficiently throughout the application lifetime.
Dynamic Connector Routing
The src/api/router.py file contains the upload_ingest_router endpoint, which implements dynamic request forwarding. When a document upload arrives, the router checks the DISABLE_INGEST_WITH_LANGFLOW environment variable:
- If false, it delegates to _langflow_upload_ingest_task for Langflow-based processing
- If true, it routes to the classic OpenRAG upload handler (api.upload.upload)
This pattern extends to chat and knowledge-filter endpoints through FastAPI’s Depends mechanism, allowing you to switch ingestion modes without code changes or redeployment.
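The flag check described above might look roughly like the following. This is a sketch under stated assumptions: the two handler functions are hypothetical stand-ins, and the set of strings treated as true is an assumption rather than the repository's parsing rule.

```python
import os
from typing import Callable


def langflow_ingest(filename: str) -> str:
    """Hypothetical stand-in for the Langflow-based ingestion path."""
    return f"langflow:{filename}"


def native_ingest(filename: str) -> str:
    """Hypothetical stand-in for the classic OpenRAG upload handler."""
    return f"native:{filename}"


def langflow_ingest_disabled() -> bool:
    # Read the flag on every call so the decision is made at request time.
    raw = os.environ.get("DISABLE_INGEST_WITH_LANGFLOW", "false")
    return raw.strip().lower() in {"1", "true", "yes"}  # assumed truthy values


def route_upload(filename: str) -> str:
    handler: Callable[[str], str] = (
        native_ingest if langflow_ingest_disabled() else langflow_ingest
    )
    return handler(filename)
```

Keeping the check inside the request path, rather than at import time, is what makes the selected handler depend on the current value of the flag.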
Startup and Shutdown Lifecycle
The create_app() function attaches critical event handlers to the FastAPI instance. During startup (startup_tasks), the application:
- Waits for OpenSearch availability via wait_for_opensearch()
- Creates the search index using init_index() based on your configured embedding model
- Loads persisted connector connections and reapplies Langflow flow settings if a reset is detected
- Initializes the background task scheduler for periodic maintenance
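Waiting for a dependency at startup usually comes down to a bounded retry loop. The sketch below shows one way to write such a loop; it is not the repository's wait_for_opensearch(). The wait_until_ready name, the ping callable, and the attempt and delay defaults are all assumptions.

```python
import asyncio
from typing import Awaitable, Callable


async def wait_until_ready(
    ping: Callable[[], Awaitable[bool]],
    attempts: int = 5,
    delay: float = 0.5,
) -> bool:
    """Call ping() until it returns True or the attempts run out."""
    for attempt in range(1, attempts + 1):
        try:
            if await ping():
                return True
        except ConnectionError:
            pass  # dependency not reachable yet; retry below
        if attempt < attempts:
            await asyncio.sleep(delay)
    return False
```

A startup handler could await this helper with a coroutine that pings the search cluster and refuse to continue if it returns False.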
During shutdown (shutdown_event), the backend gracefully:
- Cancels active webhook subscriptions
- Stops the background task cleanup scheduler
- Releases all async clients (OpenSearch, Langflow, docling, OpenAI)
- Sends a final telemetry event via TelemetryClient.send_event
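One detail worth getting right when releasing several async clients is that a failure in one close() call should not prevent the others from closing. The following sketch illustrates that idea with asyncio.gather; DemoClient and close_all are hypothetical names and do not reflect the repository's shutdown code.

```python
import asyncio
from typing import List


class DemoClient:
    """Hypothetical async client exposing a close() coroutine."""

    def __init__(self, name: str, fail_on_close: bool = False) -> None:
        self.name = name
        self.fail_on_close = fail_on_close
        self.closed = False

    async def close(self) -> None:
        if self.fail_on_close:
            raise RuntimeError(f"{self.name} failed to close")
        self.closed = True


async def close_all(clients: List[DemoClient]) -> List[str]:
    """Close every client and return the names of those that raised."""
    results = await asyncio.gather(
        *(client.close() for client in clients),
        return_exceptions=True,  # collect errors instead of aborting early
    )
    return [c.name for c, r in zip(clients, results) if isinstance(r, Exception)]
```

The returned list gives the shutdown handler something concrete to log before the process exits.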
Running the OpenRAG Server
With environment variables configured, launch the server using uvicorn with the factory pattern:
uvicorn src.main:create_app --factory --host 0.0.0.0 --port 8000
The --factory flag instructs uvicorn to call create_app and serve the application it returns, rather than importing a pre-instantiated module variable. The application is therefore built when the server loads it, not at import time, and the startup tasks run afterward inside uvicorn's event loop.
Verifying the Installation
Test the deployment using the built-in health endpoints and sample API calls.
Check liveness:
curl http://localhost:8000/health
Expected response:
{"status":"ok"}
Verify OpenSearch connectivity (readiness probe):
curl http://localhost:8000/search/health
Expected response:
{"status":"ready","dependencies":{"opensearch":"up"}}
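For scripted deployments, the readiness payload can be checked programmatically. This is a small sketch that assumes the response always has the shape shown above; the is_ready helper is hypothetical, and treating any dependency value other than "up" as not ready is an assumption.

```python
import json


def is_ready(payload: str) -> bool:
    """Return True when status is "ready" and every dependency reports "up"."""
    data = json.loads(payload)
    dependencies = data.get("dependencies", {})
    return data.get("status") == "ready" and all(
        state == "up" for state in dependencies.values()
    )
```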
Upload a document through the dynamic router:
curl -X POST "http://localhost:8000/router/upload_ingest" \
-F "file=@/path/to/report.pdf" \
-F "session_id=mysession" \
-F "delete_after_ingest=true" \
-F "replace_duplicates=true"
Test the search API:
curl -X POST http://localhost:8000/v1/search \
-H "Authorization: Bearer <API_KEY>" \
-H "Content-Type: application/json" \
-d '{"query":"What is Retrieval-Augmented Generation?"}'
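The same search call can be issued from Python with the standard library. This is a sketch that mirrors the curl command above; the helper names are hypothetical, and it assumes the endpoint returns a JSON object, which the commands above do not confirm.

```python
import json
import urllib.request


def build_search_request(
    base_url: str, api_key: str, query: str
) -> urllib.request.Request:
    """Build the POST request for /v1/search without sending it."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/search",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def search(base_url: str, api_key: str, query: str, timeout: float = 10.0) -> dict:
    """Send the request and decode the response (assumed to be JSON)."""
    request = build_search_request(base_url, api_key, query)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Separating request construction from sending keeps the request shape easy to inspect or unit-test without a running server.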
Summary
- Configuration is environment-driven: All critical settings (OpenSearch hosts, Langflow URLs, OAuth credentials) load from src/config/settings.py at startup.
- Use the factory pattern: Run uvicorn src.main:create_app --factory to properly initialize the FastAPI application with full service wiring and lifecycle management.
- Clients are lazy-loaded: The AppClients class in src/utils/app_clients.py defers expensive connection setup until first use, improving startup times.
- Routing is dynamic: The upload_ingest_router in src/api/router.py switches between Langflow and native ingestion based on the DISABLE_INGEST_WITH_LANGFLOW flag.
- Lifecycle management is automatic: Startup tasks handle OpenSearch index creation and connector initialization, while shutdown handlers ensure clean client disconnection.
Frequently Asked Questions
What environment variables are required to start OpenRAG?
At minimum, you must set OPENSEARCH_HOST, OPENSEARCH_PORT, OPENSEARCH_USERNAME, OPENSEARCH_PASSWORD, and SESSION_SECRET. If using Langflow integration, include LANGFLOW_URL and DISABLE_INGEST_WITH_LANGFLOW=false. These are parsed in src/config/settings.py (lines 22‑59) and used to configure the global clients object.
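A quick preflight check can catch a missing variable before the server starts. The helper below is a hypothetical sketch, not part of the repository; it covers only the five variables named above as the minimum.

```python
import os
from typing import List, Mapping, Optional

REQUIRED_VARIABLES = (
    "OPENSEARCH_HOST",
    "OPENSEARCH_PORT",
    "OPENSEARCH_USERNAME",
    "OPENSEARCH_PASSWORD",
    "SESSION_SECRET",
)


def missing_variables(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the required variables that are unset or empty."""
    source = os.environ if env is None else env
    return [name for name in REQUIRED_VARIABLES if not source.get(name)]


if __name__ == "__main__":
    missing = missing_variables()
    if missing:
        raise SystemExit(f"Missing environment variables: {', '.join(missing)}")
```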
How do I switch between Langflow and native document ingestion?
Set the DISABLE_INGEST_WITH_LANGFLOW environment variable. When false (default), the upload_ingest_router in src/api/router.py delegates uploads to the Langflow pipeline. When true, it routes to the native OpenRAG handler. This check occurs at request time, allowing runtime switching without code changes.
How does the backend handle OpenSearch connectivity?
During startup (startup_tasks in src/main.py), the application calls wait_for_opensearch() to verify connectivity before accepting requests. The init_index() function then creates or updates the search index based on your embedding model configuration. If OpenSearch is unavailable, the readiness probe at /search/health returns a non-ready status.
Why use the --factory flag with uvicorn?
The --factory flag tells uvicorn that src.main:create_app is a function returning a FastAPI instance, not the instance itself. Because create_app() performs initialization work (calling initialize_services() and attaching startup/shutdown handlers), the flag makes it explicit that this work runs when uvicorn loads the application rather than at module import time. The startup handlers themselves then run inside the server's event loop.