# How to Set Up OpenSearch Connection Details for OpenRAG: Environment Variables and Client Configuration

> Configure OpenSearch connection details for OpenRAG using environment variables. Learn how OpenRAG initializes an AsyncOpenSearch client with HTTPS and basic auth for seamless integration.

- Repository: [langflow-ai/openrag](https://github.com/langflow-ai/openrag)
- Tags: how-to-guide
- Published: 2026-03-13

---

**OpenRAG configures its OpenSearch connection exclusively through environment variables read in [`src/config/settings.py`](https://github.com/langflow-ai/openrag/blob/main/src/config/settings.py). The `AppClients` singleton uses those values during its startup sequence to initialize an `AsyncOpenSearch` client with HTTPS and basic authentication.**

OpenRAG, the open-source RAG framework maintained at `langflow-ai/openrag`, uses OpenSearch as its primary vector store backend. Setting up OpenSearch connection details requires configuring specific environment variables that the application reads at startup to establish secure, asynchronous connections. This guide walks through the exact configuration files, required variables, and initialization flow based on the current source code implementation.

## Required Environment Variables

OpenRAG centralizes all OpenSearch configuration in [`src/config/settings.py`](https://github.com/langflow-ai/openrag/blob/main/src/config/settings.py). The application expects six key environment variables, with four controlling basic connectivity:

- **`OPENSEARCH_HOST`**: Hostname of the OpenSearch node. Defaults to `localhost` if not set (line 22).
- **`OPENSEARCH_PORT`**: TCP port as an integer. Defaults to `9200` (line 23, parsed via `get_env_int`).
- **`OPENSEARCH_USERNAME`**: Basic authentication username. Defaults to `admin` (line 24).
- **`OPENSEARCH_PASSWORD`**: Basic authentication password. No default is provided for security reasons, so it **must be set explicitly** (line 25).
- **`OPENSEARCH_INDEX_NAME`**: Name of the vector index OpenRAG creates and queries. Referenced in [`src/config/config_manager.py`](https://github.com/langflow-ai/openrag/blob/main/src/config/config_manager.py) (lines 253-254) and defined in [`src/config/settings.py`](https://github.com/langflow-ai/openrag/blob/main/src/config/settings.py) (lines 90-92).
- **`OPENSEARCH_DATA_PATH`**: Filesystem path for local OpenSearch data storage when using the bundled Docker development setup (line 196).

The core connection parameters are loaded at module initialization in [`src/config/settings.py`](https://github.com/langflow-ai/openrag/blob/main/src/config/settings.py):

```python
# src/config/settings.py (lines 22-27)
OPENSEARCH_HOST = os.getenv("OPENSEARCH_HOST", "localhost")
OPENSEARCH_PORT = get_env_int("OPENSEARCH_PORT", 9200)
OPENSEARCH_USERNAME = os.getenv("OPENSEARCH_USERNAME", "admin")
OPENSEARCH_PASSWORD = os.getenv("OPENSEARCH_PASSWORD")
```
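The body of `get_env_int` is not shown in this guide. As a rough sketch of what such a helper usually does, the hypothetical `read_env_int` below parses an integer variable and falls back to the default when the variable is unset or malformed. The fallback on invalid input is an assumption; the project's helper may raise instead.

```python
import os


def read_env_int(name: str, default: int) -> int:
    """Hypothetical stand-in for get_env_int: read an integer env var.

    Returns `default` when the variable is unset, blank, or not a valid
    integer (assumed behavior, not confirmed from the OpenRAG source).
    """
    raw = os.environ.get(name)
    if raw is None or not raw.strip():
        return default
    try:
        return int(raw.strip())
    except ValueError:
        return default
```

With this sketch, `read_env_int("OPENSEARCH_PORT", 9200)` yields `9200` when the variable is absent and the parsed integer when it is set to a numeric string.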

## Client Initialization Flow

The `AppClients` singleton constructs the OpenSearch client during its `initialize` coroutine. This process consumes the environment variables defined above and configures an `AsyncOpenSearch` instance with HTTPS and compression enabled.

The client creation logic in [`src/config/settings.py`](https://github.com/langflow-ai/openrag/blob/main/src/config/settings.py) implements the following pattern:

```python
# src/config/settings.py – client initialization (lines 11-20 of initialize method)
self.opensearch = AsyncOpenSearch(
    hosts=[{"host": OPENSEARCH_HOST, "port": OPENSEARCH_PORT}],
    connection_class=AIOHttpConnection,
    scheme="https",
    use_ssl=True,
    verify_certs=False,
    ssl_assert_fingerprint=None,
    http_auth=(OPENSEARCH_USERNAME, OPENSEARCH_PASSWORD),
    http_compress=True,
)
```

This configuration uses the **AIOHttpConnection** class for asynchronous I/O, forces HTTPS with certificate verification disabled (`verify_certs=False`, suitable for development clusters with self-signed certificates), and enables HTTP compression to reduce vector payload transfer sizes.

## Health Verification and Retry Logic

Before the application accepts traffic, [`src/utils/opensearch_utils.py`](https://github.com/langflow-ai/openrag/blob/main/src/utils/opensearch_utils.py) implements an exponential backoff strategy to verify cluster health. The `wait_for_opensearch` function polls the cluster status until it reports `green` or `yellow` health states.

```python
# src/utils/opensearch_utils.py (lines 11-30)
async def wait_for_opensearch(opensearch_client, max_retries=15, base_delay=2.0, max_delay=30.0):
    ...
    if await opensearch_client.ping():
        health = await opensearch_client.cluster.health()
        if health.get("status") in ["green", "yellow"]:
            return
```

The function retries failed connections up to 15 times, starting with a 2-second delay and capping at 30 seconds. If the cluster fails to reach a healthy state within these constraints, the application startup sequence halts, preventing operations against an unready vector store.
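To make the retry timing concrete, here is a minimal sketch of a capped exponential backoff loop using the same parameters. It is not the project's implementation: the doubling formula, the names `backoff_delay` and `wait_until_ready`, and the injectable `sleep` argument are assumptions chosen for illustration. The real function may compute delays differently, for example by adding jitter.

```python
import asyncio


def backoff_delay(attempt: int, base_delay: float = 2.0, max_delay: float = 30.0) -> float:
    """Delay before the next try after 0-based `attempt`: doubles, then caps."""
    return min(base_delay * (2 ** attempt), max_delay)


async def wait_until_ready(client, max_retries=15, base_delay=2.0,
                           max_delay=30.0, sleep=asyncio.sleep):
    """Poll until the cluster reports green or yellow, otherwise raise.

    `client` only needs async `ping()` and `cluster.health()` methods.
    `sleep` is injectable so the loop can be exercised without waiting.
    Returns the number of attempts that were needed.
    """
    for attempt in range(max_retries):
        try:
            if await client.ping():
                health = await client.cluster.health()
                if health.get("status") in ("green", "yellow"):
                    return attempt + 1
        except Exception:
            pass  # treat connection errors the same as "not ready yet"
        if attempt < max_retries - 1:
            await sleep(backoff_delay(attempt, base_delay, max_delay))
    raise RuntimeError(f"OpenSearch not ready after {max_retries} attempts")
```

Under these assumptions the delays run 2, 4, 8, and 16 seconds, then stay at the 30-second cap for every remaining attempt.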

## Configuration Examples

### Minimal .env File Configuration

Place a `.env` file in the repository root or mount it into your container. The `load_dotenv` invocation in [`src/config/settings.py`](https://github.com/langflow-ai/openrag/blob/main/src/config/settings.py) loads these values automatically at startup:

```dotenv
# .env
OPENSEARCH_HOST=opensearch.mycompany.com
OPENSEARCH_PORT=9200
OPENSEARCH_USERNAME=admin
OPENSEARCH_PASSWORD=SuperSecretPass123
OPENSEARCH_INDEX_NAME=openrag-index
OPENSEARCH_DATA_PATH=/var/lib/opensearch/data
```
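Before starting the application, it can help to confirm that the environment is complete. The sketch below is a hypothetical pre-flight check, not part of OpenRAG: `check_opensearch_env` and its two lookup tables are illustrative names, and the defaults simply restate the ones listed in this guide.

```python
import os

# Per this guide, only the password has no default.
REQUIRED = ("OPENSEARCH_PASSWORD",)
DEFAULTS = {
    "OPENSEARCH_HOST": "localhost",
    "OPENSEARCH_PORT": "9200",
    "OPENSEARCH_USERNAME": "admin",
}


def check_opensearch_env(env=None):
    """Return (resolved, missing) for the OpenSearch connection variables.

    `resolved` maps each defaulted variable to its effective value.
    `missing` lists required variables that are unset or empty.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    resolved = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    return resolved, missing
```

Calling `check_opensearch_env()` with no argument inspects the current process environment; a non-empty `missing` list is a signal to stop before any client is created.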

### Programmatic Client Creation

For custom scripts or external utilities that need to interact with the same OpenSearch cluster, replicate the client's initialization logic:

```python
import os
from opensearchpy import AsyncOpenSearch
from opensearchpy._async.http_aiohttp import AIOHttpConnection

async def create_opensearch_client():
    client = AsyncOpenSearch(
        hosts=[{
            "host": os.getenv("OPENSEARCH_HOST", "localhost"),
            "port": int(os.getenv("OPENSEARCH_PORT", "9200"))
        }],
        connection_class=AIOHttpConnection,
        scheme="https",
        use_ssl=True,
        verify_certs=False,
        http_auth=(
            os.getenv("OPENSEARCH_USERNAME", "admin"),
            os.getenv("OPENSEARCH_PASSWORD")  # Must be set explicitly
        ),
        http_compress=True,
    )
    return client
```

### Waiting for Cluster Readiness

Mirror the built-in health check when writing standalone data migration or maintenance scripts:

```python
from utils.opensearch_utils import wait_for_opensearch

async def init_opensearch():
    client = await create_opensearch_client()
    await wait_for_opensearch(client)  # Retries with exponential back-off
    return client
```
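Standalone scripts should also close the client when they finish, since an async client holds an open HTTP session. One possible pattern is sketched below. The helper name `run_with_client` is hypothetical, and it only assumes that the client exposes an async `close()` method, which `AsyncOpenSearch` does.

```python
import asyncio


async def run_with_client(make_client, job):
    """Create a client, run `job(client)`, and always close the client.

    `make_client` is a coroutine function returning an object with an
    async `close()` method; `job` is a coroutine function taking it.
    """
    client = await make_client()
    try:
        return await job(client)
    finally:
        await client.close()
```

A script could then call `asyncio.run(run_with_client(init_opensearch, migrate))`, where `migrate` stands in for whatever coroutine performs the maintenance work.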

### Runtime Index Name Overrides

Override the vector index name for specific workflows without modifying the global configuration:

```python
import os

# Change the index name for a one-off run. Do this before importing from
# config.settings: a name bound with `from config.settings import ...`
# keeps the value it had at import time.
os.environ["OPENSEARCH_INDEX_NAME"] = "my-custom-index"

# Subsequent calls to get_index_name() return the new value
```
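The timing of the override matters because of how Python binds names. The self-contained sketch below contrasts a value read once, like a module-level constant, with an accessor that re-reads the environment on each call. Both `INDEX_NAME_AT_IMPORT` and `current_index_name` are hypothetical names for illustration and are not taken from the OpenRAG source.

```python
import os

os.environ["OPENSEARCH_INDEX_NAME"] = "openrag-index"

# Read once, the way a module-level constant is evaluated at import time.
INDEX_NAME_AT_IMPORT = os.getenv("OPENSEARCH_INDEX_NAME", "openrag-index")


def current_index_name() -> str:
    """Hypothetical accessor that re-reads the environment on every call."""
    return os.getenv("OPENSEARCH_INDEX_NAME", "openrag-index")


# A later override is visible to the accessor but not to the constant.
os.environ["OPENSEARCH_INDEX_NAME"] = "my-custom-index"
```

After the override, `current_index_name()` returns the new name while `INDEX_NAME_AT_IMPORT` still holds the old one, which is why a runtime override only affects code paths that look the value up at call time.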

## Deployment-Specific Configuration

### Kubernetes and Helm Deployments

For production Kubernetes environments, the Helm chart at [`kubernetes/helm/openrag/templates/secrets/opensearch-secret.yaml`](https://github.com/langflow-ai/openrag/blob/main/kubernetes/helm/openrag/templates/secrets/opensearch-secret.yaml) manages connection secrets. Configure these values in your Helm values file or through sealed secrets rather than plain environment variables.

### Text User Interface (TUI) Configuration

The interactive configuration surface in [`src/tui/config_fields.py`](https://github.com/langflow-ai/openrag/blob/main/src/tui/config_fields.py) maps each OpenSearch environment variable to a form field. This allows administrators to set connection details through the TUI rather than editing configuration files directly.

### Key Integration Files

Understanding these source files helps when debugging connection issues:

- **[`src/config/settings.py`](https://github.com/langflow-ai/openrag/blob/main/src/config/settings.py)**: Central definition of all OpenSearch environment variables and the `AsyncOpenSearch` client factory.
- **[`src/utils/opensearch_utils.py`](https://github.com/langflow-ai/openrag/blob/main/src/utils/opensearch_utils.py)**: Health check utilities and cluster readiness polling.
- **[`src/tui/config_fields.py`](https://github.com/langflow-ai/openrag/blob/main/src/tui/config_fields.py)**: Interactive configuration mapping for the TUI.
- **[`flows/components/opensearch_multimodel.py`](https://github.com/langflow-ai/openrag/blob/main/flows/components/opensearch_multimodel.py)**: Flow component that consumes the initialized client to execute vector queries.

## Summary

- OpenRAG reads OpenSearch connection details exclusively from environment variables defined in [`src/config/settings.py`](https://github.com/langflow-ai/openrag/blob/main/src/config/settings.py). Among the connection variables, only `OPENSEARCH_PASSWORD` has no default and must be set explicitly.
- The `AppClients` singleton initializes an `AsyncOpenSearch` client using `AIOHttpConnection` with HTTPS and basic authentication during application startup.
- Cluster health verification occurs through `wait_for_opensearch` in [`src/utils/opensearch_utils.py`](https://github.com/langflow-ai/openrag/blob/main/src/utils/opensearch_utils.py), implementing exponential backoff until the cluster reaches `green` or `yellow` status.
- Configuration supports both file-based `.env` loading and container orchestration via Kubernetes secrets defined in the Helm templates.

## Frequently Asked Questions

### What happens if OPENSEARCH_PASSWORD is not set?

The application will fail to establish a working OpenSearch connection. According to the source code in [`src/config/settings.py`](https://github.com/langflow-ai/openrag/blob/main/src/config/settings.py) (line 25), this variable has no default for security reasons, so `os.getenv` returns `None` and that value is passed directly into the `http_auth` tuple. Without valid credentials, the `AppClients.initialize` sequence cannot complete: depending on the client library version, the error surfaces either when the `AsyncOpenSearch` client is constructed or when the first authenticated request is sent.

### Can I use HTTP instead of HTTPS for local development?

The current implementation in [`src/config/settings.py`](https://github.com/langflow-ai/openrag/blob/main/src/config/settings.py) hardcodes `scheme="https"` and `use_ssl=True` during client initialization. To use HTTP, you would need to modify the source code where `self.opensearch` is instantiated (lines 11-20 of the `initialize` method), though this is not recommended as it requires maintaining a fork of the repository.

### Where does OpenRAG store vector embeddings?

Vector embeddings are stored in the index specified by `OPENSEARCH_INDEX_NAME`, which defaults to `openrag-index` if not configured. This index is created automatically during initialization and is referenced by the flow components in [`flows/components/opensearch_multimodel.py`](https://github.com/langflow-ai/openrag/blob/main/flows/components/opensearch_multimodel.py) when executing similarity searches.

### How do I clear the OpenSearch data for a fresh start?

Use the utility script at [`scripts/clear_opensearch_data.py`](https://github.com/langflow-ai/openrag/blob/main/scripts/clear_opensearch_data.py), which references `OPENSEARCH_DATA_PATH` to locate and wipe the local data directory. This script is designed specifically for development environments running the bundled Docker OpenSearch instance, not for production clusters.