# How to Configure OpenRAG Using Environment Variables: A Complete Guide

> Learn how to configure OpenRAG using environment variables. Easily control vector stores, LLM providers, and more via a single .env file for seamless integration. Get the complete guide.

- Repository: [langflow-ai/openrag](https://github.com/langflow-ai/openrag)
- Tags: how-to-guide
- Published: 2026-03-13

---

**OpenRAG reads all configuration from environment variables loaded via python-dotenv, allowing you to control vector stores, LLM providers, OAuth connectors, and timeouts through a single `.env` file placed in the project root.**

The `langflow-ai/openrag` repository implements an environment-driven configuration architecture. Runtime parameters, from OpenSearch connection strings to LLM API keys, are resolved from environment variables parsed at startup, which makes the application portable across development, Docker, and production environments without code changes.

## Configuration Architecture

### Environment Loading Mechanism

In [`src/config/settings.py`](https://github.com/langflow-ai/openrag/blob/main/src/config/settings.py), the application initializes by calling `load_dotenv(override=False)` twice: first for the current working directory and then for the repository root (lines 17-19). A `.env` file in either location is therefore loaded into the process environment before any configuration constants are defined. Because `override=False` leaves already-set variables untouched, values exported in the shell or injected by Docker take precedence over `.env` entries, and the file loaded first wins when both files define the same key.
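
The precedence that `override=False` implies can be illustrated without the library. The sketch below is a simplified stdlib stand-in for `load_dotenv`, not python-dotenv itself and not OpenRAG's code; the helper name `load_env_file` and the two file names are hypothetical, and only plain `KEY=VALUE` lines are handled.

```python
import os
import tempfile
from pathlib import Path


def load_env_file(path: Path, override: bool = False) -> None:
    """Simplified stand-in for load_dotenv: reads plain KEY=VALUE lines only."""
    if not path.is_file():
        return  # a missing file is skipped silently
    for raw in path.read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        # With override=False, anything already in the environment is kept.
        if override or key not in os.environ:
            os.environ[key] = value


with tempfile.TemporaryDirectory() as tmp:
    cwd_env = Path(tmp) / "cwd.env"    # stands in for ./.env
    root_env = Path(tmp) / "root.env"  # stands in for <repo root>/.env
    cwd_env.write_text("OPENSEARCH_HOST=from-cwd\nOPENSEARCH_PORT=9201\n")
    root_env.write_text("OPENSEARCH_HOST=from-root\nLOG_LEVEL=DEBUG\n")

    os.environ["OPENSEARCH_PORT"] = "9200"  # pretend the shell exported this
    os.environ.pop("OPENSEARCH_HOST", None)
    os.environ.pop("LOG_LEVEL", None)

    load_env_file(cwd_env)   # first load: current directory
    load_env_file(root_env)  # second load: repository root

loaded = {k: os.environ[k] for k in ("OPENSEARCH_HOST", "OPENSEARCH_PORT", "LOG_LEVEL")}
print(loaded)
```

Here the shell-exported port survives both loads, the host comes from the first file, and `LOG_LEVEL` is filled in by the second file because nothing defined it earlier.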

### Type-Safe Variable Parsing

Rather than parsing raw strings manually, the codebase uses helper functions defined in [`src/utils/env_utils.py`](https://github.com/langflow-ai/openrag/blob/main/src/utils/env_utils.py). The `get_env_int()` and `get_env_float()` functions safely cast environment values to numeric types while supplying defaults when variables are missing or malformed, preventing runtime type errors.
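
To make the fallback behavior concrete, here is a minimal sketch of what such helpers might look like. The function names match those mentioned above, but the signatures and bodies are assumptions for illustration and may differ from the actual code in `src/utils/env_utils.py`.

```python
import os


def get_env_int(name: str, default: int) -> int:
    """Return the variable as an int, or `default` if unset or malformed."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    try:
        return int(raw.strip())
    except ValueError:
        return default


def get_env_float(name: str, default: float) -> float:
    """Return the variable as a float, or `default` if unset or malformed."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    try:
        return float(raw.strip())
    except ValueError:
        return default


os.environ["LANGFLOW_TIMEOUT"] = "1200"            # valid override
os.environ["LANGFLOW_CONNECT_TIMEOUT"] = "thirty"  # malformed on purpose
os.environ.pop("INGESTION_TIMEOUT", None)          # unset

timeout = get_env_int("LANGFLOW_TIMEOUT", 2400)
connect_timeout = get_env_float("LANGFLOW_CONNECT_TIMEOUT", 30.0)
ingestion_timeout = get_env_int("INGESTION_TIMEOUT", 3600)
print(timeout, connect_timeout, ingestion_timeout)  # 1200 30.0 3600
```

A silent fallback keeps startup robust, but it can also hide typos in a `.env` file, so logging a warning in the `except` branch would be a reasonable variation.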

### Configuration Manager Integration

The [`src/config/config_manager.py`](https://github.com/langflow-ai/openrag/blob/main/src/config/config_manager.py) module merges YAML-based configuration from [`openrag.yaml`](https://github.com/langflow-ai/openrag/blob/main/openrag.yaml) (if present) with environment values. Functions like `get_openrag_config()` (lines 42-48 in [`settings.py`](https://github.com/langflow-ai/openrag/blob/main/src/config/settings.py)) expose the final parsed configuration to the rest of the application, providing a single source of truth for constants used across OpenSearch clients, Langflow HTTP clients, and Docling services.
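
The layering can be pictured as three dictionaries merged in order. The sketch below is an illustration only: `merge_config` is a hypothetical helper, the YAML file is represented by an already-parsed dict, and the flat key names are an assumption (the real `openrag.yaml` schema may be nested or named differently). It assumes environment values win over YAML values.

```python
from typing import Mapping


def merge_config(
    defaults: Mapping[str, str],
    yaml_values: Mapping[str, str],
    environ: Mapping[str, str],
) -> dict[str, str]:
    """Layer settings so that: defaults < YAML file < environment variables."""
    merged = dict(defaults)
    merged.update(yaml_values)
    # Only known settings are taken from the environment, so unrelated
    # variables such as PATH never leak into the configuration.
    merged.update({k: v for k, v in environ.items() if k in merged})
    return merged


defaults = {"LLM_PROVIDER": "openai", "LLM_MODEL": "gpt-4o-mini"}
yaml_values = {"LLM_MODEL": "model-from-yaml", "EMBEDDING_MODEL": "text-embedding-3-small"}
environ = {"LLM_PROVIDER": "ollama", "PATH": "/usr/bin"}

config = merge_config(defaults, yaml_values, environ)
print(config)
```

In this example the provider comes from the environment, the LLM model from the YAML layer, and `PATH` is ignored.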

## Essential Environment Variables

### OpenSearch Vector Store Configuration

Control your vector database connection using these variables:

- `OPENSEARCH_HOST`: Target hostname (default: `localhost`)
- `OPENSEARCH_PORT`: Service port (default: `9200`)
- `OPENSEARCH_USERNAME`: Authentication user (default: `admin`)
- `OPENSEARCH_PASSWORD`: Credentials for basic auth
- `OPENSEARCH_INDEX_NAME`: Target index for document storage
- `OPENSEARCH_DATA_PATH`: Filesystem path for OpenSearch data
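
As a rough illustration of how these variables and their listed defaults could be gathered into one object, consider the sketch below. The `OpenSearchSettings` class is hypothetical and is not how OpenRAG structures its settings; variables for which no default is listed above are left as `None`.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class OpenSearchSettings:
    host: str
    port: int
    username: str
    password: Optional[str]
    index_name: Optional[str]

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "OpenSearchSettings":
        return cls(
            host=env.get("OPENSEARCH_HOST", "localhost"),
            port=int(env.get("OPENSEARCH_PORT", "9200")),  # raises on a non-numeric port
            username=env.get("OPENSEARCH_USERNAME", "admin"),
            password=env.get("OPENSEARCH_PASSWORD"),       # no default listed
            index_name=env.get("OPENSEARCH_INDEX_NAME"),   # no default listed
        )


local = OpenSearchSettings.from_env({})
docker = OpenSearchSettings.from_env(
    {"OPENSEARCH_HOST": "opensearch", "OPENSEARCH_INDEX_NAME": "documents"}
)
print(local)
print(docker)
```

Passing an explicit mapping instead of reading `os.environ` directly makes the settings easy to exercise in tests.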

### Langflow Integration Settings

Configure the Langflow orchestration layer:

- `LANGFLOW_URL`: Base URL for the Langflow instance (e.g., `http://localhost:7860`)
- `LANGFLOW_CHAT_FLOW_ID`: UUID for the chat processing flow
- `LANGFLOW_INGEST_FLOW_ID`: UUID for document ingestion flows
- `LANGFLOW_AUTO_LOGIN`: Boolean flag (default: `False`) enabling automatic authentication with default credentials
- `LANGFLOW_SUPERUSER` and `LANGFLOW_SUPERUSER_PASSWORD`: Credentials for automatic API key generation when `LANGFLOW_KEY` is not explicitly provided
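
The relationship between `LANGFLOW_KEY` and the superuser credentials can be read as a simple decision. The helper below is a hypothetical sketch of that decision only; the function name and the returned labels are invented for illustration, `LANGFLOW_AUTO_LOGIN` is deliberately not modeled, and the real key-generation logic in OpenRAG may differ.

```python
from typing import Mapping


def langflow_auth_strategy(env: Mapping[str, str]) -> str:
    """Describe how a client would obtain Langflow credentials (illustrative)."""
    if env.get("LANGFLOW_KEY"):
        # An explicit key is used as-is.
        return "explicit-api-key"
    if env.get("LANGFLOW_SUPERUSER") and env.get("LANGFLOW_SUPERUSER_PASSWORD"):
        # No key given: superuser credentials allow one to be generated.
        return "generate-key-from-superuser"
    return "no-credentials"


with_key = langflow_auth_strategy({"LANGFLOW_KEY": "example-key"})
with_superuser = langflow_auth_strategy(
    {"LANGFLOW_SUPERUSER": "admin", "LANGFLOW_SUPERUSER_PASSWORD": "example-password"}
)
with_nothing = langflow_auth_strategy({})
print(with_key, with_superuser, with_nothing)
```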

### LLM and Embedding Provider Keys

A provider is not initialized unless the credentials or endpoint it needs are set:

- `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, `WATSONX_API_KEY`: Provider-specific API tokens
- `OLLAMA_ENDPOINT`: Local Ollama server URL
- `WATSONX_ENDPOINT` and `WATSONX_PROJECT_ID`: IBM Watsonx configuration
- `LLM_PROVIDER` and `EMBEDDING_PROVIDER`: Selection keys (e.g., `openai`, `anthropic`, `ollama`)
- `LLM_MODEL` and `EMBEDDING_MODEL`: Specific model identifiers (e.g., `gpt-4o-mini`, `text-embedding-3-small`)
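
A pre-flight check along these lines can catch a missing key before the application starts. This is a sketch under assumptions: the mapping from provider name to required variables is inferred from the variable names above (in particular, `watsonx` as a provider name is a guess), and `missing_provider_vars` is a hypothetical helper rather than part of OpenRAG.

```python
from typing import Mapping

# Assumed mapping of provider name -> variables that provider needs.
REQUIRED_VARS = {
    "openai": ["OPENAI_API_KEY"],
    "anthropic": ["ANTHROPIC_API_KEY"],
    "watsonx": ["WATSONX_API_KEY", "WATSONX_ENDPOINT", "WATSONX_PROJECT_ID"],
    "ollama": ["OLLAMA_ENDPOINT"],
}


def missing_provider_vars(env: Mapping[str, str]) -> dict[str, list[str]]:
    """Return {selection variable: [problems]} for the selected providers."""
    problems: dict[str, list[str]] = {}
    for setting in ("LLM_PROVIDER", "EMBEDDING_PROVIDER"):
        provider = env.get(setting, "").strip().lower()
        if not provider:
            continue  # nothing selected, nothing to check
        if provider not in REQUIRED_VARS:
            problems[setting] = [f"unknown provider {provider!r}"]
            continue
        missing = [v for v in REQUIRED_VARS[provider] if not env.get(v)]
        if missing:
            problems[setting] = missing
    return problems


env = {
    "LLM_PROVIDER": "openai",            # OPENAI_API_KEY is not set below
    "EMBEDDING_PROVIDER": "ollama",
    "OLLAMA_ENDPOINT": "http://localhost:11434",
}
report = missing_provider_vars(env)
print(report)  # {'LLM_PROVIDER': ['OPENAI_API_KEY']}
```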

### OAuth Connectors for Cloud Storage

Enable Google Drive and Microsoft SharePoint connectors:

- `GOOGLE_OAUTH_CLIENT_ID` and `GOOGLE_OAUTH_CLIENT_SECRET`: Google Drive integration
- `MICROSOFT_GRAPH_OAUTH_CLIENT_ID` and `MICROSOFT_GRAPH_OAUTH_CLIENT_SECRET`: OneDrive/SharePoint access

Absence of these variables disables the respective connector entirely.
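
The "present or disabled" rule can be sketched as a lookup over the variable pairs. The connector labels and the `enabled_connectors` helper below are hypothetical names chosen for illustration; the sketch also assumes that both the client ID and the secret must be set for a connector to count as enabled.

```python
from typing import Mapping

# Connector label (hypothetical) -> the pair of variables it depends on.
CONNECTOR_VARS = {
    "google_drive": ("GOOGLE_OAUTH_CLIENT_ID", "GOOGLE_OAUTH_CLIENT_SECRET"),
    "microsoft_graph": (
        "MICROSOFT_GRAPH_OAUTH_CLIENT_ID",
        "MICROSOFT_GRAPH_OAUTH_CLIENT_SECRET",
    ),
}


def enabled_connectors(env: Mapping[str, str]) -> list[str]:
    """List connectors whose variables are all set to non-empty values."""
    return [
        name
        for name, needed in CONNECTOR_VARS.items()
        if all(env.get(var) for var in needed)
    ]


env = {
    "GOOGLE_OAUTH_CLIENT_ID": "example-id",
    "GOOGLE_OAUTH_CLIENT_SECRET": "example-secret",
    "MICROSOFT_GRAPH_OAUTH_CLIENT_ID": "example-id",  # secret missing
}
active = enabled_connectors(env)
print(active)  # ['google_drive']
```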

### Timeouts and Performance Tuning

Adjust processing limits for large documents:

- `LANGFLOW_TIMEOUT`: Total HTTP timeout in seconds (default: `2400`, i.e., 40 minutes)
- `LANGFLOW_CONNECT_TIMEOUT`: Initial connection timeout (default: `30`)
- `INGESTION_TIMEOUT`: Per-file processing limit (default: `3600`, i.e., 1 hour)
- `UPLOAD_BATCH_SIZE`: Bulk upload chunk size
- `MAX_WORKERS`: Concurrency level for parallel processing
- `DOCLING_WORKERS`: Parallel workers for PDF OCR processing

### Feature Flags and Debug Options

Toggle functionality without code changes:

- `DISABLE_INGEST_WITH_LANGFLOW`: Set to `true` to bypass Langflow for ingestion (default: `false`)
- `INGEST_SAMPLE_DATA`: Seed the database with sample documents on startup (default: `true`)
- `WEBHOOK_BASE_URL`: Enable continuous ingestion callbacks (disabled if unset)
- `LOG_LEVEL`: Verbosity for application logging (e.g., `INFO`, `DEBUG`)
- `SERVICE_NAME`: Application identifier in logs (default: `openrag`)
- `NO_COLOR`: Disable colored terminal output
- `ACCESS_LOG`: Toggle HTTP request logging
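
Boolean flags arrive as strings, and this guide itself uses both `True` and `false` spellings, so a case-insensitive parser is a sensible way to read them. The helper below is a sketch: `get_env_bool` is a hypothetical name, and the exact set of strings OpenRAG accepts as true is an assumption.

```python
import os

# Assumed set of truthy spellings; compared case-insensitively.
TRUTHY = {"1", "true", "yes", "on"}


def get_env_bool(name: str, default: bool) -> bool:
    """Parse a boolean flag, falling back to `default` if unset or empty."""
    raw = os.environ.get(name)
    if raw is None or not raw.strip():
        return default
    return raw.strip().lower() in TRUTHY


os.environ["LANGFLOW_AUTO_LOGIN"] = "True"
os.environ["DISABLE_INGEST_WITH_LANGFLOW"] = "false"
os.environ.pop("INGEST_SAMPLE_DATA", None)  # unset, so the default applies

auto_login = get_env_bool("LANGFLOW_AUTO_LOGIN", False)
bypass_langflow = get_env_bool("DISABLE_INGEST_WITH_LANGFLOW", False)
seed_samples = get_env_bool("INGEST_SAMPLE_DATA", True)
print(auto_login, bypass_langflow, seed_samples)  # True False True
```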

## Practical Configuration Example

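Create a `.env` file in the project root, alongside `src/`. Below is a template covering the major configuration categories. The credentials, flow IDs, and URLs are placeholders: replace them, in particular the `admin` superuser password and the `LANGFLOW_AUTO_LOGIN=True` setting, before deploying to production.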

```dotenv
# Core services ---------------------------------------------------------

OPENSEARCH_HOST=opensearch
OPENSEARCH_PORT=9200
OPENSEARCH_USERNAME=admin
OPENSEARCH_PASSWORD=MyStrong!Passw0rd
OPENSEARCH_INDEX_NAME=documents

# Langflow -------------------------------------------------------------

LANGFLOW_URL=http://localhost:7860
LANGFLOW_CHAT_FLOW_ID=1098eea1-6649-4e1d-aed1-b77249fb8dd0
LANGFLOW_INGEST_FLOW_ID=5488df7c-b93f-4f87-a446-b67028bc0813
LANGFLOW_AUTO_LOGIN=True
LANGFLOW_SUPERUSER=admin
LANGFLOW_SUPERUSER_PASSWORD=admin

# OAuth connectors -------------------------------------------------------

GOOGLE_OAUTH_CLIENT_ID=YOUR_GOOGLE_CLIENT_ID
GOOGLE_OAUTH_CLIENT_SECRET=YOUR_GOOGLE_CLIENT_SECRET
MICROSOFT_GRAPH_OAUTH_CLIENT_ID=YOUR_MS_CLIENT_ID
MICROSOFT_GRAPH_OAUTH_CLIENT_SECRET=YOUR_MS_CLIENT_SECRET

# Provider configuration -------------------------------------------------

OPENAI_API_KEY=sk-...
LLM_PROVIDER=openai
LLM_MODEL=gpt-4o-mini
EMBEDDING_PROVIDER=openai
EMBEDDING_MODEL=text-embedding-3-small

# Timeouts --------------------------------------------------------------

LANGFLOW_TIMEOUT=2400
LANGFLOW_CONNECT_TIMEOUT=30
INGESTION_TIMEOUT=3600
MAX_WORKERS=4

# Optional features ------------------------------------------------------

DISABLE_INGEST_WITH_LANGFLOW=false
INGEST_SAMPLE_DATA=true
WEBHOOK_BASE_URL=https://my-ngrok.io/webhook
LOG_LEVEL=INFO
```

When you run `docker compose up` or `make run`, OpenRAG loads these values automatically through the [`settings.py`](https://github.com/langflow-ai/openrag/blob/main/src/config/settings.py) initialization sequence.

## Verifying Your Configuration at Runtime

Inspect active environment variables through the Terminal User Interface (TUI). The [`src/tui/managers/env_manager.py`](https://github.com/langflow-ai/openrag/blob/main/src/tui/managers/env_manager.py) module provides a runtime view of parsed configuration, confirming that variables from your `.env` file were correctly loaded and applied according to the logic in [`src/config/settings.py`](https://github.com/langflow-ai/openrag/blob/main/src/config/settings.py).
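
Outside the TUI, a quick way to see what a process actually received is to print the relevant variables with secrets masked. The script below is an independent sketch and is not how `env_manager.py` works; the prefix list, the masking rule, and the `redacted_view` helper are all assumptions chosen for illustration.

```python
from typing import Mapping

# Name fragments that mark a value as secret (assumed heuristic).
SENSITIVE_MARKERS = ("KEY", "SECRET", "PASSWORD", "TOKEN")
# Prefixes of the variables worth showing (assumed, not exhaustive).
PREFIXES = ("OPENSEARCH_", "LANGFLOW_", "OPENAI_", "LLM_", "EMBEDDING_")


def redacted_view(env: Mapping[str, str]) -> dict[str, str]:
    """Return matching variables, sorted by name, with secret values masked."""
    view: dict[str, str] = {}
    for name in sorted(env):
        if not name.startswith(PREFIXES):
            continue
        is_secret = any(marker in name for marker in SENSITIVE_MARKERS)
        view[name] = "***" if is_secret else env[name]
    return view


sample_env = {
    "OPENSEARCH_HOST": "opensearch",
    "OPENSEARCH_PASSWORD": "example-password",
    "OPENAI_API_KEY": "example-key",
    "LLM_MODEL": "gpt-4o-mini",
    "HOME": "/root",
}
for name, value in redacted_view(sample_env).items():
    print(f"{name}={value}")
```

In a real session you would pass `os.environ` instead of `sample_env`.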

## Summary

- OpenRAG configuration is environment-driven, with variables defined in [`src/config/settings.py`](https://github.com/langflow-ai/openrag/blob/main/src/config/settings.py) (lines 22-78)
- The application loads `.env` files automatically using `python-dotenv` with fallback to repository root (lines 17-19)
- Type-safe parsing occurs through [`src/utils/env_utils.py`](https://github.com/langflow-ai/openrag/blob/main/src/utils/env_utils.py) helpers (`get_env_int`, `get_env_float`)
- [`src/config/config_manager.py`](https://github.com/langflow-ai/openrag/blob/main/src/config/config_manager.py) merges YAML files with environment values for hybrid configuration
- All sensitive credentials, LLM providers, and vector store connections are controlled through environment variables with no hardcoded defaults for security-critical settings
- Reference `.env.example` in the repository root for the canonical list of supported variables

## Frequently Asked Questions

### Does OpenRAG support file-based configuration instead of environment variables?

Yes. While environment variables are the primary mechanism, [`src/config/config_manager.py`](https://github.com/langflow-ai/openrag/blob/main/src/config/config_manager.py) loads an optional [`openrag.yaml`](https://github.com/langflow-ai/openrag/blob/main/openrag.yaml) file and merges it with environment values. Environment variables take precedence over YAML settings, so you can mix both approaches and use environment variables as the override layer.

### What happens if I omit required API keys like OPENAI_API_KEY?

The application will refuse to initialize the respective provider. The constants defined in [`src/config/settings.py`](https://github.com/langflow-ai/openrag/blob/main/src/config/settings.py) are imported directly by client factories in [`src/main.py`](https://github.com/langflow-ai/openrag/blob/main/src/main.py), so a missing required key causes provider instantiation to fail with a clear error message rather than starting with invalid credentials.

### How do I change configuration without restarting the container?

You cannot change configuration without a restart. OpenRAG reads all environment variables at startup in [`settings.py`](https://github.com/langflow-ai/openrag/blob/main/src/config/settings.py) (lines 17-19) and stores them as module-level constants. Changes to the `.env` file require a container restart or process reload to take effect, as the values are not re-parsed at runtime.
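
The effect of module-level constants is easy to reproduce in isolation. The snippet below is a generic Python illustration, not OpenRAG code: a value captured once from the environment does not follow later changes to it.

```python
import os

os.environ["LOG_LEVEL"] = "INFO"

# Evaluated once, the way a module-level constant in a settings module is
# evaluated when the module is first imported.
LOG_LEVEL = os.environ.get("LOG_LEVEL", "INFO")

# Simulate the environment changing after startup.
os.environ["LOG_LEVEL"] = "DEBUG"

print(LOG_LEVEL)                # INFO  (the captured value is unchanged)
print(os.environ["LOG_LEVEL"])  # DEBUG (only the live environment changed)
```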

### Where can I find the complete list of supported environment variables?

The `.env.example` file in the repository root contains the canonical documentation of every supported variable, its purpose, and suggested defaults. This file serves as the authoritative reference for the entire configuration surface area implemented in [`src/config/settings.py`](https://github.com/langflow-ai/openrag/blob/main/src/config/settings.py).