What Is the Purpose of OpenSearch in OpenRAG?

March 13, 2026, langflow-ai/openrag

OpenSearch serves as the persistent storage layer, hybrid search engine, and security gatekeeper for OpenRAG, enabling both vector-based semantic retrieval and traditional keyword search while managing multi-model embeddings and enforcing access controls.

OpenRAG, the open-source retrieval-augmented generation framework maintained by langflow-ai, uses OpenSearch as its primary backend for document ingestion and retrieval. Every document processed by the system is chunked, converted into vector embeddings, and written to an OpenSearch index that supports dynamic field mapping and approximate nearest neighbor (ANN) search. Understanding how OpenRAG uses OpenSearch reveals the architectural foundation for its scalable, multi-model knowledge base and its security model.

OpenSearch as the Persistent Storage Engine

At its core, OpenRAG relies on OpenSearch to store and index all ingested documents. When users upload files, the system splits content into chunks and enriches each chunk with vector embeddings before writing to the index. According to the source code in src/services/search_service.py, the SearchService class orchestrates this process, ensuring that every chunk maintains metadata links to its source document, MIME type, and owner for filtering purposes.

The storage architecture uses OpenSearch's knn_vector field type with the disk-based ANN (disk_ann) method, configured via HNSW parameters defined in config/settings.py. This design allows OpenRAG to maintain fast similarity search capabilities even as corpora scale to millions of chunks.
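As a rough illustration of the storage layout described above, the following sketch builds an index body with k-NN enabled and a single knn_vector field. The function name build_index_body, the metadata field names, the jvector engine, the l2 space type, and the default parameter values are all assumptions made for this example; the actual definitions live in config/settings.py and may differ.

```python
from typing import Any


def build_index_body(
    dimension: int,
    ef_construction: int = 100,   # assumed default, not taken from the repo
    m: int = 16,                  # assumed default, not taken from the repo
    method_name: str = "disk_ann",
    engine: str = "jvector",      # assumption: engine that provides disk_ann
) -> dict[str, Any]:
    """Return an index body with k-NN enabled and one vector field."""
    return {
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                # Chunk metadata used later for filtering (names are illustrative).
                "document_id": {"type": "keyword"},
                "mimetype": {"type": "keyword"},
                "owner": {"type": "keyword"},
                "text": {"type": "text"},
                # Vector field searched by approximate nearest neighbor queries.
                "chunk_embedding": {
                    "type": "knn_vector",
                    "dimension": dimension,
                    "method": {
                        "name": method_name,
                        "engine": engine,
                        "space_type": "l2",
                        "parameters": {
                            "ef_construction": ef_construction,
                            "m": m,
                        },
                    },
                },
            }
        },
    }


body = build_index_body(dimension=1536)
```

A body like this would typically be passed to the client's indices.create call when the index is first created.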

Hybrid Semantic and Keyword Search

OpenRAG implements a hybrid retrieval strategy that combines semantic similarity with keyword matching. In src/services/search_service.py (lines 68-84), the search_tool method constructs a compound query that merges:

KNN (vector) queries for semantic similarity using the chunk_embedding_{model} fields
Multi-match (text) queries for keyword hits against the raw text content

OpenSearch evaluates both components simultaneously, returning results that score highly on either semantic relevance or keyword precision. This hybrid approach ensures robust retrieval even when queries contain specific terminology that might not align perfectly with the semantic embedding space.
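To make the shape of such a compound query concrete, here is a minimal sketch that pairs a knn clause with a multi_match clause inside a bool query. The helper name build_hybrid_query, the text and filename fields, and the choice of a bool/should wrapper are assumptions for illustration; the query that search_tool actually builds may be structured differently.

```python
from typing import Any, Optional


def build_hybrid_query(
    query_text: str,
    query_vector: list[float],
    vector_field: str,
    k: int = 10,
    filters: Optional[list[dict[str, Any]]] = None,
) -> dict[str, Any]:
    """Combine a vector (knn) clause and a keyword (multi_match) clause."""
    knn_clause = {"knn": {vector_field: {"vector": query_vector, "k": k}}}
    text_clause = {
        "multi_match": {
            "query": query_text,
            "fields": ["text", "filename"],  # illustrative field names
        }
    }
    return {
        "size": k,
        "query": {
            "bool": {
                # A chunk qualifies if it matches either clause.
                "should": [knn_clause, text_clause],
                "minimum_should_match": 1,
                # Filters restrict the candidate set without affecting scores.
                "filter": filters or [],
            }
        },
    }


query_body = build_hybrid_query(
    "vector index tuning",
    [0.1, 0.2, 0.3],
    vector_field="chunk_embedding_example_model",
)
```

With a should clause, a chunk that matches both the vector and the keyword component accumulates score from each, which is one simple way to favor results that are strong on both signals.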

Dynamic Multi-Model Vector Embeddings

A distinctive feature of OpenRAG is its support for multiple embedding models within a single index. The utility ensure_embedding_field_exists in src/utils/embedding_fields.py (lines 49-66) dynamically creates dedicated knn_vector fields for each model using a naming convention of chunk_embedding_{model}.

These vector fields are configured with tunable HNSW parameters (ef_construction and m) specified in lines 31-39 of the same file. This allows the system to store embeddings from different models (e.g., OpenAI, HuggingFace, or custom embeddings) concurrently, so users can select the most suitable embedding model for a given query without reindexing the entire corpus.
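The check-then-create flow described above can be sketched as follows. This is not the repository's ensure_embedding_field_exists; the helpers embedding_field_name and ensure_field, the name-normalization rule, and the omission of method parameters are assumptions made to keep the example short. The client only needs indices.get_mapping and indices.put_mapping, which the AsyncOpenSearch client provides.

```python
import re
from typing import Any


def embedding_field_name(model_name: str) -> str:
    """Build a mapping-safe field name from a model name (hypothetical rule)."""
    safe = re.sub(r"[^a-z0-9]+", "_", model_name.lower()).strip("_")
    return f"chunk_embedding_{safe}"


async def ensure_field(client: Any, index: str, model_name: str, dimension: int) -> str:
    """Create the model's knn_vector field if the index mapping lacks it."""
    field = embedding_field_name(model_name)
    mapping = await client.indices.get_mapping(index=index)
    properties = mapping.get(index, {}).get("mappings", {}).get("properties", {})
    if field not in properties:
        # Adding a new field to an existing mapping does not touch stored documents,
        # which is why no reindex is needed. Method parameters are omitted for brevity.
        await client.indices.put_mapping(
            index=index,
            body={"properties": {field: {"type": "knn_vector", "dimension": dimension}}},
        )
    return field
```

Because each model writes to its own field, chunks embedded with an older model remain searchable through their original field while new chunks populate the new one.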

Security, Filtering, and Access Control

OpenSearch acts as the security enforcement layer for all data retrieval in OpenRAG. The system implements two levels of protection:

Document-level filtering: The SearchService applies term and terms clauses (lines 71-85 in src/services/search_service.py) to restrict results by data source, MIME type, or document owner before executing the vector search.
Authentication and authorization: The session_manager.py module generates per-user OpenSearch clients that attach JWT or OIDC tokens to every request. OpenSearch's built-in security features then validate these tokens and enforce role-based access control (RBAC) at the cluster level, ensuring users can only access documents they own or have permission to view.
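The document-level filtering step can be illustrated with a small helper that turns a dictionary of filter values into term and terms clauses. The function name build_filter_clauses and the field names in the usage line are hypothetical; the sketch only shows the general pattern, not the exact logic in SearchService.

```python
from typing import Any, Mapping, Sequence, Union

FilterValue = Union[str, Sequence[str]]


def build_filter_clauses(filters: Mapping[str, FilterValue]) -> list[dict[str, Any]]:
    """Translate field/value pairs into OpenSearch term and terms clauses."""
    clauses: list[dict[str, Any]] = []
    for field, value in filters.items():
        if isinstance(value, str):
            clauses.append({"term": {field: value}})
        elif len(value) == 1:
            # A single-element list is equivalent to a plain term clause.
            clauses.append({"term": {field: value[0]}})
        elif len(value) > 1:
            clauses.append({"terms": {field: list(value)}})
        # Empty sequences are skipped, meaning "no restriction on this field".
    return clauses


clauses = build_filter_clauses(
    {"owner": "user-123", "mimetype": ["application/pdf", "text/markdown"]}
)
```

Clauses like these would normally go into the filter section of a bool query, where they narrow the candidate set without contributing to relevance scores.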

Health Monitoring and Connection Management

Before any indexing or search operations occur, OpenRAG verifies cluster readiness through the wait_for_opensearch utility in src/utils/opensearch_utils.py (lines 11-45). This function pings the cluster, retries with exponential backoff when the ping fails, checks the health status (green/yellow/red), and logs readiness states.
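A readiness loop of this kind might look like the sketch below. It is not the repository's wait_for_opensearch: the function name wait_until_ready, the ClusterNotReadyError exception, the retry limits, and the decision to accept yellow as ready are assumptions for illustration. The client is expected to expose awaitable ping() and cluster.health() methods, as AsyncOpenSearch does.

```python
import asyncio
from typing import Any


class ClusterNotReadyError(RuntimeError):
    """Hypothetical error raised when the cluster never becomes ready."""


async def wait_until_ready(
    client: Any,
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
) -> str:
    """Poll the cluster with exponential backoff and return its health status."""
    for attempt in range(max_attempts):
        try:
            if await client.ping():
                health = await client.cluster.health()
                status = health.get("status")
                if status in ("green", "yellow"):
                    return status
        except Exception:
            # Connection errors are expected while the cluster is still starting.
            pass
        if attempt < max_attempts - 1:
            # Delay doubles on each retry: base, 2x base, 4x base, capped at max_delay.
            await asyncio.sleep(min(base_delay * 2 ** attempt, max_delay))
    raise ClusterNotReadyError(f"cluster not ready after {max_attempts} attempts")
```

Capping the delay keeps startup from stalling for minutes on a slow cluster, while the doubling avoids hammering a node that is still initializing.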

The global client configuration in config/settings.py centralizes connection parameters, including hosts, ports, and KNN tuning variables (KNN_EF_CONSTRUCTION, KNN_M), so that the AsyncOpenSearch clients used throughout the codebase are initialized consistently.

Code Implementation Examples

The following examples demonstrate how OpenRAG interacts with OpenSearch for common operations.

Initializing the OpenSearch Client

from opensearchpy import AsyncOpenSearch
from config.settings import OPENSEARCH_HOST, OPENSEARCH_PORT, OPENSEARCH_USERNAME, OPENSEARCH_PASSWORD

async def get_client():
    client = AsyncOpenSearch(
        hosts=[{"host": OPENSEARCH_HOST, "port": OPENSEARCH_PORT}],
        http_auth=(OPENSEARCH_USERNAME, OPENSEARCH_PASSWORD),
        use_ssl=False,                 # Change to True for TLS
        verify_certs=False,            # Set True in production
    )
    return client

This basic-auth initialization pattern matches the global clients object. For user-scoped access, the session manager builds clients in a similar way but attaches the user's JWT (issued through OIDC) to each request.
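For the user-scoped case, one plausible approach is to pass the token as a default Authorization header when constructing the client. The helper build_user_client_kwargs below is hypothetical and only assembles constructor arguments; it assumes the token is sent as a Bearer header, which is how the OpenSearch security plugin conventionally receives JWTs. The session manager's real implementation may differ.

```python
from typing import Any


def build_user_client_kwargs(
    host: str,
    port: int,
    jwt_token: str,
    use_ssl: bool = True,
    verify_certs: bool = True,
) -> dict[str, Any]:
    """Assemble constructor arguments for a per-user OpenSearch client."""
    return {
        "hosts": [{"host": host, "port": port}],
        "use_ssl": use_ssl,
        "verify_certs": verify_certs,
        # Sent with every request so the cluster can authenticate the user.
        "headers": {"Authorization": f"Bearer {jwt_token}"},
    }


kwargs = build_user_client_kwargs("localhost", 9200, "example-token")

# Usage (requires opensearch-py):
# from opensearchpy import AsyncOpenSearch
# client = AsyncOpenSearch(**kwargs)
```

Keeping the token in the client's default headers means individual search and index calls do not need to pass credentials explicitly.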

Ensuring Dynamic Embedding Fields Exist

from utils.embedding_fields import ensure_embedding_field_exists

async def prepare_index(opensearch_client, model_name):
    # Guarantees a knn_vector field named `chunk_embedding_<model>`
    field_name = await ensure_embedding_field_exists(
        opensearch_client,
        model_name=model_name,
    )
    return field_name

Source: src/utils/embedding_fields.py (lines 69-78)

Executing Hybrid Search

from services.search_service import SearchService
from auth_context import set_auth_context

async def hybrid_search(query, user_id, jwt_token):
    service = SearchService()
    # Set auth context for user-scoped client
    set_auth_context(user_id, jwt_token)

    # Run the hybrid search tool
    result = await service.search_tool(query)
    return result

The search_tool method builds the hybrid query combining KNN and multi-match logic, as implemented in lines 68-84 and 389-429 of src/services/search_service.py.

Waiting for Cluster Readiness

from utils.opensearch_utils import wait_for_opensearch

async def startup():
    client = await get_client()
    await wait_for_opensearch(client)   # raises OpenSearchNotReadyError on failure

Source: src/utils/opensearch_utils.py (lines 11-45)

Summary

OpenSearch provides the persistent storage layer for all chunked documents and their vector embeddings in OpenRAG.
Hybrid search capabilities combine KNN vector similarity with traditional keyword matching via multi_match queries in src/services/search_service.py.
Multi-model support is achieved through dynamically created knn_vector fields (chunk_embedding_{model}) managed by src/utils/embedding_fields.py.
Security enforcement happens at the OpenSearch layer through JWT/OIDC authentication and term-based filtering.
Health monitoring via wait_for_opensearch ensures system stability before processing begins.

Frequently Asked Questions

How does OpenRAG handle multiple embedding models in a single OpenSearch index?

OpenRAG dynamically creates separate knn_vector fields for each embedding model using the pattern chunk_embedding_{model}. The ensure_embedding_field_exists function in src/utils/embedding_fields.py checks the index mapping and creates the field with appropriate HNSW parameters if it does not exist, allowing multiple models to coexist without reindexing existing data.

What type of search does OpenRAG perform using OpenSearch?

OpenRAG performs hybrid search that combines approximate nearest neighbor (ANN) vector search with full-text keyword matching. The system constructs queries that execute KNN searches for semantic similarity while simultaneously running multi_match queries for keyword precision, merging results to maximize recall and relevance.

How does OpenRAG secure OpenSearch queries?

Security is implemented through JWT/OIDC token propagation and document-level filtering. The session_manager.py module creates per-user clients that attach authentication tokens to requests, while src/services/search_service.py injects term filters (by owner, MIME type, or data source) into queries so that results are restricted before they return to the user.

What happens if the OpenSearch cluster becomes unavailable?

The wait_for_opensearch utility in src/utils/opensearch_utils.py implements health checks with exponential backoff during application startup. If the cluster fails its readiness checks, the utility raises an OpenSearchNotReadyError, so the application does not start serving requests or indexing documents against an unavailable storage backend.
