What Is the Purpose of OpenSearch in OpenRAG?
OpenSearch serves as the persistent storage layer, hybrid search engine, and security gatekeeper for OpenRAG, enabling both vector-based semantic retrieval and traditional keyword search while managing multi-model embeddings and enforcing access controls.
OpenRAG, the open-source retrieval-augmented generation framework maintained by langflow-ai, uses OpenSearch as its primary backend for document ingestion and retrieval. Every document processed by the system is chunked, converted into vector embeddings, and written to an OpenSearch index that supports dynamic field mapping and approximate nearest neighbor (ANN) search. Understanding how OpenRAG leverages OpenSearch reveals the architectural foundation for its scalable, multi-model knowledge base and security model.
OpenSearch as the Persistent Storage Engine
At its core, OpenRAG relies on OpenSearch to store and index all ingested documents. When users upload files, the system splits content into chunks and enriches each chunk with vector embeddings before writing to the index. According to the source code in src/services/search_service.py, the SearchService class orchestrates this process, ensuring that every chunk maintains metadata links to its source document, MIME type, and owner for filtering purposes.
The storage architecture uses OpenSearch's knn_vector field type with the disk-based ANN (disk_ann) method, configured via HNSW parameters defined in config/settings.py. This design allows OpenRAG to maintain fast similarity search capabilities even as corpora scale to millions of chunks.
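To make the storage design concrete, here is a minimal sketch of what a knn_vector field mapping with these parameters might look like. The engine name, space type, and default parameter values are assumptions chosen for illustration, not values taken from the OpenRAG source; the actual values live in config/settings.py.

```python
def build_knn_field_mapping(dimension: int,
                            ef_construction: int = 100,
                            m: int = 16) -> dict:
    """Return a knn_vector field mapping as a plain dict.

    The defaults for ef_construction and m are placeholders, and the
    engine and space_type below are assumptions for this sketch.
    """
    return {
        "type": "knn_vector",
        "dimension": dimension,  # must match the embedding model's output size
        "method": {
            "name": "disk_ann",
            "engine": "jvector",   # assumption: engine providing disk_ann
            "space_type": "l2",    # assumption: Euclidean distance
            "parameters": {
                "ef_construction": ef_construction,  # build-time candidate list size
                "m": m,                              # graph connectivity per node
            },
        },
    }


mapping = build_knn_field_mapping(dimension=1536)
```

Larger ef_construction and m values generally trade slower indexing and more memory for better recall, which is why they are exposed as tunables.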
Hybrid Semantic and Keyword Search
OpenRAG implements a hybrid retrieval strategy that combines semantic similarity with exact keyword matching. In src/services/search_service.py (lines 68-84), the search_tool method constructs a compound query that merges:
- KNN (vector) queries for semantic similarity using the chunk_embedding_{model} fields
- Multi-match (text) queries for exact keyword hits against the raw text content
OpenSearch evaluates both components simultaneously, returning results that score highly on either semantic relevance or keyword precision. This hybrid approach ensures robust retrieval even when queries contain specific terminology that might not align perfectly with the semantic embedding space.
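The compound query described above can be sketched as a bool query with two should clauses. This is an illustrative shape only: the helper name build_hybrid_query, the text field name "text", and the use of bool/should are assumptions, and the query built by search_tool may be structured differently.

```python
from typing import Sequence


def build_hybrid_query(query_text: str,
                       query_vector: Sequence[float],
                       embedding_field: str,
                       k: int = 10,
                       text_fields: Sequence[str] = ("text",)) -> dict:
    """Build an OpenSearch request body combining k-NN and multi_match.

    A document matches if it satisfies either clause; documents that
    satisfy both receive the sum of both scores.
    """
    return {
        "size": k,
        "query": {
            "bool": {
                "should": [
                    # Semantic branch: approximate nearest neighbors on the vector field
                    {"knn": {embedding_field: {"vector": list(query_vector), "k": k}}},
                    # Keyword branch: lexical match over the raw text fields
                    {"multi_match": {"query": query_text, "fields": list(text_fields)}},
                ],
                "minimum_should_match": 1,
            }
        },
    }


body = build_hybrid_query("reset password", [0.1, 0.2, 0.3], "chunk_embedding_demo")
```

The resulting dict can be passed as the body argument of a search call on an OpenSearch client.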
Dynamic Multi-Model Vector Embeddings
A distinctive feature of OpenRAG is its support for multiple embedding models within a single index. The utility ensure_embedding_field_exists in src/utils/embedding_fields.py (lines 49-66) dynamically creates dedicated knn_vector fields for each model using a naming convention of chunk_embedding_{model}.
These vector fields are configured with two tunable HNSW parameters, ef_construction and m, specified in lines 31-39 of the same file. This allows the system to store embeddings from different models (e.g., OpenAI, HuggingFace, or custom embeddings) concurrently, enabling users to select the optimal embedding model for specific queries without reindexing the entire corpus.
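A rough sketch of the naming convention follows. The normalization rule (lowercasing and replacing runs of non-alphanumeric characters with underscores) is an assumption made for illustration; the real ensure_embedding_field_exists may sanitize model names differently. Both helper names are hypothetical.

```python
import re


def embedding_field_name(model_name: str) -> str:
    """Derive a per-model field name following chunk_embedding_{model}.

    Assumption: model names are lowercased and any run of characters
    outside [a-z0-9] becomes a single underscore.
    """
    slug = re.sub(r"[^a-z0-9]+", "_", model_name.lower()).strip("_")
    return f"chunk_embedding_{slug}"


def is_field_missing(mapping_properties: dict, model_name: str) -> bool:
    """Return True if the index mapping has no field for this model yet."""
    return embedding_field_name(model_name) not in mapping_properties


print(embedding_field_name("text-embedding-3-small"))
# chunk_embedding_text_embedding_3_small
```

Because each model writes to its own field, adding a new model only requires adding a new mapping property; existing vectors stay untouched.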
Security, Filtering, and Access Control
OpenSearch acts as the security enforcement layer for all data retrieval in OpenRAG. The system implements two levels of protection:
- Document-level filtering: The SearchService applies term and terms clauses (lines 71-85 in src/services/search_service.py) to restrict results by data source, MIME type, or document owner before executing the vector search.
- Authentication and authorization: The session_manager.py module generates per-user OpenSearch clients that attach JWT or OIDC tokens to every request. OpenSearch's built-in security features then validate these tokens and enforce role-based access control (RBAC) at the cluster level, ensuring users can only access documents they own or have permission to view.
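The document-level filtering step can be illustrated with a small sketch that turns optional constraints into term and terms clauses. The field names owner, mimetype, and data_source, as well as both helper names, are hypothetical and chosen only for this example.

```python
from typing import Optional, Sequence


def build_filter_clauses(owner: Optional[str] = None,
                         mime_types: Optional[Sequence[str]] = None,
                         data_sources: Optional[Sequence[str]] = None) -> list:
    """Translate optional constraints into OpenSearch filter clauses."""
    clauses = []
    if owner is not None:
        # term: exact match on a single value
        clauses.append({"term": {"owner": owner}})
    if mime_types:
        # terms: match any value in the list
        clauses.append({"terms": {"mimetype": list(mime_types)}})
    if data_sources:
        clauses.append({"terms": {"data_source": list(data_sources)}})
    return clauses


def with_filters(query: dict, clauses: list) -> dict:
    """Wrap a scoring query so that filters restrict the candidate set."""
    return {"bool": {"must": [query], "filter": clauses}}


filters = build_filter_clauses(owner="user-123", mime_types=["application/pdf"])
filtered = with_filters({"match_all": {}}, filters)
```

Clauses in the filter context do not affect relevance scores; they only narrow which documents are eligible.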
Health Monitoring and Connection Management
Before any indexing or search operations occur, OpenRAG verifies cluster readiness through the wait_for_opensearch utility in src/utils/opensearch_utils.py (lines 11-45). This function pings the cluster with exponential backoff between attempts, checks the health status (green/yellow/red), and logs readiness states.
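The retry pattern can be sketched generically as follows. This is not the OpenRAG implementation: the function name wait_until_ready, its parameters, and the RuntimeError it raises are assumptions for illustration, and the ping callable stands in for whatever readiness probe the real utility uses.

```python
import asyncio
from typing import Awaitable, Callable


async def wait_until_ready(ping: Callable[[], Awaitable[bool]],
                           max_retries: int = 5,
                           base_delay: float = 1.0,
                           max_delay: float = 30.0,
                           sleep: Callable[[float], Awaitable[None]] = asyncio.sleep) -> int:
    """Call ping() until it returns True, doubling the wait after each failure.

    Returns the number of attempts used; raises RuntimeError if every
    attempt fails.
    """
    for attempt in range(max_retries):
        try:
            if await ping():
                return attempt + 1
        except Exception:
            # Treat connection errors like a failed ping and retry.
            pass
        if attempt < max_retries - 1:
            # Delays grow as base_delay * 1, 2, 4, ... capped at max_delay.
            await sleep(min(base_delay * 2 ** attempt, max_delay))
    raise RuntimeError(f"cluster not ready after {max_retries} attempts")
```

Injecting the sleep function keeps the helper easy to test without real delays.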
The global client configuration in config/settings.py centralizes connection parameters, including hosts, ports, and KNN tuning variables (KNN_EF_CONSTRUCTION, KNN_M), so that every AsyncOpenSearch client in the codebase is initialized consistently.
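One common way to centralize such parameters is to read them once from the environment into an immutable settings object. The sketch below assumes the variables are supplied as environment variables and uses placeholder defaults; the class name, function name, and default values are hypothetical and do not reflect the actual contents of config/settings.py.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class OpenSearchSettings:
    host: str
    port: int
    knn_ef_construction: int
    knn_m: int


def load_settings(env: Mapping[str, str] = os.environ) -> OpenSearchSettings:
    """Read connection and KNN tuning values, falling back to placeholder defaults."""
    return OpenSearchSettings(
        host=env.get("OPENSEARCH_HOST", "localhost"),
        port=int(env.get("OPENSEARCH_PORT", "9200")),
        knn_ef_construction=int(env.get("KNN_EF_CONSTRUCTION", "100")),
        knn_m=int(env.get("KNN_M", "16")),
    )


settings = load_settings({"OPENSEARCH_PORT": "9201", "KNN_M": "32"})
```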
Code Implementation Examples
The following examples demonstrate how OpenRAG interacts with OpenSearch for common operations.
Initializing the OpenSearch Client
from opensearchpy import AsyncOpenSearch
from config.settings import OPENSEARCH_HOST, OPENSEARCH_PORT, OPENSEARCH_USERNAME, OPENSEARCH_PASSWORD

async def get_client():
    client = AsyncOpenSearch(
        hosts=[{"host": OPENSEARCH_HOST, "port": OPENSEARCH_PORT}],
        http_auth=(OPENSEARCH_USERNAME, OPENSEARCH_PASSWORD),
        use_ssl=False,       # Change to True for TLS
        verify_certs=False,  # Set True in production
    )
    return client
This client initialization pattern appears in the global clients object and is used by the session manager to attach JWT tokens for OIDC authentication.
Ensuring Dynamic Embedding Fields Exist
from utils.embedding_fields import ensure_embedding_field_exists

async def prepare_index(opensearch_client, model_name):
    # Guarantees a knn_vector field named `chunk_embedding_<model>`
    field_name = await ensure_embedding_field_exists(
        opensearch_client,
        model_name=model_name,
    )
    return field_name
Source: src/utils/embedding_fields.py (lines 69-78)
Executing Hybrid Search
from services.search_service import SearchService
from auth_context import set_auth_context

async def hybrid_search(query, user_id, jwt_token):
    service = SearchService()
    # Set auth context for user-scoped client
    set_auth_context(user_id, jwt_token)
    # Run the hybrid search tool
    result = await service.search_tool(query)
    return result
The search_tool method builds the hybrid query combining KNN and multi-match logic, as implemented in lines 68-84 and 389-429 of src/services/search_service.py.
Waiting for Cluster Readiness
from utils.opensearch_utils import wait_for_opensearch

async def startup():
    # get_client is the helper defined under "Initializing the OpenSearch Client"
    client = await get_client()
    await wait_for_opensearch(client)  # raises OpenSearchNotReadyError on failure
Source: src/utils/opensearch_utils.py (lines 11-45)
Summary
- OpenSearch provides the persistent storage layer for all chunked documents and their vector embeddings in OpenRAG.
- Hybrid search capabilities combine KNN vector similarity with traditional keyword matching via multi_match queries in src/services/search_service.py.
- Multi-model support is achieved through dynamically created knn_vector fields (chunk_embedding_{model}) managed by src/utils/embedding_fields.py.
- Security enforcement happens at the OpenSearch layer through JWT/OIDC authentication and term-based filtering.
- Health monitoring via wait_for_opensearch ensures system stability before processing begins.
Frequently Asked Questions
How does OpenRAG handle multiple embedding models in a single OpenSearch index?
OpenRAG dynamically creates separate knn_vector fields for each embedding model using the pattern chunk_embedding_{model}. The ensure_embedding_field_exists function in src/utils/embedding_fields.py checks the index mapping and creates the field with appropriate HNSW parameters if it does not exist, allowing multiple models to coexist without reindexing existing data.
What type of search does OpenRAG perform using OpenSearch?
OpenRAG performs hybrid search that combines approximate nearest neighbor (ANN) vector search with keyword-based text matching. The system constructs queries that execute KNN searches for semantic similarity while simultaneously running multi_match queries for keyword precision, merging results to maximize recall and relevance.
How does OpenRAG secure OpenSearch queries?
Security is implemented through JWT/OIDC token propagation and document-level filtering. The session_manager.py module creates per-user clients that attach authentication tokens to requests, while src/services/search_service.py injects term filters (by owner, MIME type, or data source) into every query to enforce document-level access control before results return to the user.
What happens if the OpenSearch cluster becomes unavailable?
The wait_for_opensearch utility in src/utils/opensearch_utils.py implements health checks with exponential backoff during application startup. If the cluster fails readiness checks, it raises an OpenSearchNotReadyError, preventing the application from serving requests until the storage backend recovers, ensuring data consistency and preventing failed indexing operations.