OpenRAG API Endpoints for Chat Operations: Complete Developer Guide

March 13, 2026, langflow-ai/openrag

OpenRAG exposes nine HTTP endpoints for chat operations across two authentication tiers: a public v1 API (/v1/chat/*) protected by API keys and internal routes (/chat, /langflow) used by the UI, all implemented in src/api/v1/chat.py and src/api/chat.py.

The langflow-ai/openrag repository implements a dual-layer chat system built on FastAPI. The endpoints support both programmatic integration via API keys and internal communication from the UI, with streaming over Server-Sent Events (SSE), conversation persistence, and per-request retrieval-augmented generation (RAG) controls (filters, limit, and score threshold).

Public v1 Chat API (API-Key Authenticated)

The version-1 API in src/api/v1/chat.py provides the primary external interface for chat operations. All four endpoints require API-key authentication via the get_api_key_user_async dependency. They return structured JSON responses, except that POST /v1/chat returns a Server-Sent Events stream when stream=true.

Create Chat Request (POST /v1/chat)

The chat_create_endpoint function handles new conversation initiation. It accepts a JSON payload containing the user message, optional streaming flags, and search filters, then returns the assistant's response along with source citations.

When stream=true, the endpoint invokes _transform_stream_to_sse to normalize Langflow chunks into Server-Sent Events. When stream=false, it returns a complete JSON payload containing the response, chat_id, and sources array.

This endpoint sets per-request authentication context using auth_context.set_search_filters and auth_context.set_user_jwt before calling chat_service.langflow_chat.

List Conversations (GET /v1/chat)

The chat_list_endpoint retrieves metadata for all conversations belonging to the authenticated user. It calls chat_service.get_langflow_history and returns conversation titles, creation timestamps, and message counts without exposing full message content.

Retrieve Single Conversation (GET /v1/chat/{chat_id})

The chat_get_endpoint fetches complete message history for a specific chat_id, including token usage statistics for each assistant message. It searches the user's history and formats messages with roles, timestamps, and usage data.

Delete Conversation (DELETE /v1/chat/{chat_id})

The chat_delete_endpoint removes a specific conversation by delegating to chat_service.delete_session. It returns a success flag confirming deletion.

Internal Chat Routes (UI and Service Integration)

The internal routes in src/api/chat.py support the OpenRAG web interface and internal microservices. These endpoints use get_current_user for authentication and provide both Langflow-specific and generic chat capabilities.

General Chat Endpoint (POST /chat)

The chat_endpoint function serves as the primary internal chat interface. It validates incoming payloads, applies search filters (retrieval limit, score threshold), and determines whether to return a JSON response or stream raw chunks directly based on the stream parameter.

Unlike the v1 endpoint, this route streams raw Langflow chunks without SSE transformation when streaming is enabled.

Langflow-Specific Endpoint (POST /langflow)

The langflow_endpoint explicitly routes requests to chat_service.langflow_chat, mirroring the v1 chat creation logic but for internal consumers. This endpoint ensures UI components interact with the Langflow backend consistently.

History and Session Management

Two endpoints provide raw history retrieval:

chat_history_endpoint (GET /chat/history): Returns internal-format chat history via chat_service.get_chat_history
langflow_history_endpoint (GET /langflow/history): Returns Langflow-formatted history via chat_service.get_langflow_history

The delete_session_endpoint (DELETE /session/{session_id}) handles session cleanup internally, calling the same chat_service.delete_session method used by the v1 API.

Architecture and Implementation Details

Dependency Injection Pattern

All endpoints use FastAPI's Depends system for dependency management. The get_chat_service factory injects the chat service instance, while get_api_key_user_async (v1) and get_current_user (internal) extract authentication credentials from request headers.

Authentication Context Propagation

Before invoking chat logic, endpoints establish a thread-local authentication context through auth_context.py. This stores per-request search parameters (filters, limit, score_threshold) and the user JWT, ensuring downstream retrieval operations respect caller-specific constraints without passing parameters through every function call.
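The propagation pattern described above can be sketched as follows. This is a minimal illustration, not the contents of auth_context.py: it uses Python's contextvars module as an assumption (the article describes the context as thread-local), and the function signatures, the get_search_filters helper, and the retrieve and handle_request functions are hypothetical.

```python
import asyncio
import contextvars

# Hypothetical per-request state; the real auth_context.py may differ.
_search_filters = contextvars.ContextVar("search_filters", default=None)
_user_jwt = contextvars.ContextVar("user_jwt", default=None)


def set_search_filters(filters, limit, score_threshold):
    """Store the caller's retrieval constraints for the current request."""
    _search_filters.set(
        {"filters": filters, "limit": limit, "score_threshold": score_threshold}
    )


def set_user_jwt(token):
    """Store the caller's JWT for the current request."""
    _user_jwt.set(token)


def get_search_filters():
    return _search_filters.get()


async def retrieve():
    # Downstream code reads the context instead of taking extra parameters.
    await asyncio.sleep(0)
    return {"search": get_search_filters(), "jwt": _user_jwt.get()}


async def handle_request(jwt, filters):
    set_user_jwt(jwt)
    set_search_filters(filters, limit=5, score_threshold=0.7)
    return await retrieve()


async def demo():
    # gather() runs each coroutine as a task with its own copy of the
    # context, so one request's values do not leak into another.
    return await asyncio.gather(
        handle_request("jwt-a", {"category": "ml"}),
        handle_request("jwt-b", {"category": "ops"}),
    )


results = asyncio.run(demo())
```

Each simulated request sees only its own JWT and filters, and the values set inside the tasks are not visible to the caller once the requests finish.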

Streaming Response Handling

The v1 API normalizes streaming responses through _transform_stream_to_sse in src/api/v1/chat.py, converting Langflow chunks into standardized SSE events with content deltas and source citations. Internal endpoints stream raw chunks directly for lower latency in UI consumption.
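To make the normalization step concrete, here is a rough sketch of how a chunk stream might be turned into SSE events. It is a stand-in for _transform_stream_to_sse, not its actual code: the chunk fields ("text", "sources") and the fake_langflow_stream generator are assumptions, while the event types (content, sources, done) follow the example output shown later in this guide.

```python
import asyncio
import json


def _sse(payload):
    # An SSE event is a "data:" line terminated by a blank line.
    return f"data: {json.dumps(payload)}\n\n"


async def transform_stream_to_sse(chunks, chat_id):
    """Hypothetical stand-in: convert raw chunks into SSE-formatted strings."""
    sources = []
    async for chunk in chunks:
        text = chunk.get("text")  # assumed chunk field
        if text:
            yield _sse({"type": "content", "delta": text})
        sources.extend(chunk.get("sources", []))  # assumed chunk field
    if sources:
        yield _sse({"type": "sources", "sources": sources})
    yield _sse({"type": "done", "chat_id": chat_id})


async def fake_langflow_stream():
    # Stand-in for the upstream Langflow stream.
    yield {"text": "OpenRAG is "}
    yield {"text": "an open-source", "sources": [{"filename": "doc.pdf"}]}


async def collect():
    stream = transform_stream_to_sse(fake_langflow_stream(), "12345")
    return [event async for event in stream]


events = asyncio.run(collect())
```

Emitting sources once at the end, followed by a done event carrying the chat_id, keeps the content deltas small and gives clients a clear end-of-stream signal.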

Error Handling Strategy

Each endpoint wraps service calls in try/except blocks that catch the generic Exception class, log stack traces via logging_config.py, and return HTTP 500 JSON responses. This pattern keeps the API responding with a well-formed error even when underlying Langflow services fail.
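A framework-free sketch of this catch-log-respond pattern is shown below. The helper name call_with_500_fallback, the logger name, and the {"error": ...} body shape are all assumptions for illustration; the actual endpoints return FastAPI JSON responses and may format errors differently.

```python
import asyncio
import logging

logger = logging.getLogger("openrag.sketch")  # hypothetical logger name


async def call_with_500_fallback(service_call):
    """Run a service coroutine and map any failure to a (status, body) pair."""
    try:
        return 200, await service_call()
    except Exception as exc:
        # logger.exception records the stack trace alongside the message.
        logger.exception("chat service call failed")
        return 500, {"error": str(exc)}  # assumed error body shape


async def failing_service():
    raise RuntimeError("Langflow unavailable")


async def healthy_service():
    return {"response": "ok"}


status_bad, body_bad = asyncio.run(call_with_500_fallback(failing_service))
status_ok, body_ok = asyncio.run(call_with_500_fallback(healthy_service))
```

Catching the generic Exception class is a broad net; it trades precise error codes for the guarantee that callers always receive a parseable response.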

Practical Usage Examples

Non-Streaming Chat Request

Create a new conversation with RAG filters and receive a complete response:

curl -X POST https://api.example.com/v1/chat \
  -H "Authorization: Bearer <API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
        "message": "Explain Retrieval-Augmented Generation.",
        "stream": false,
        "filters": {"category": "ml"},
        "limit": 5,
        "score_threshold": 0.7
      }'

Response:

{
  "response": "Retrieval-Augmented Generation (RAG) combines ...",
  "chat_id": "c4f2e9d1-...",
  "sources": [
    {"filename":"paper.pdf","text":"...","score":0.92,"page":12}
  ]
}
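For callers working in Python rather than curl, the same request could be assembled with the standard library as sketched here. The build_chat_request helper is hypothetical, and the base URL and API key are the placeholders from the curl example; the payload fields are the ones shown above.

```python
import json
import urllib.request


def build_chat_request(base_url, api_key, message, stream=False,
                       filters=None, limit=None, score_threshold=None):
    """Build (but do not send) a POST /v1/chat request."""
    payload = {"message": message, "stream": stream}
    # Only include the optional RAG controls the caller actually set.
    if filters is not None:
        payload["filters"] = filters
    if limit is not None:
        payload["limit"] = limit
    if score_threshold is not None:
        payload["score_threshold"] = score_threshold
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_chat_request(
    "https://api.example.com",
    "<API_KEY>",
    "Explain Retrieval-Augmented Generation.",
    filters={"category": "ml"},
    limit=5,
    score_threshold=0.7,
)
# To send it against a real deployment:
#   with urllib.request.urlopen(req) as resp:
#       body = json.load(resp)
```

Separating request construction from sending makes the payload easy to inspect or log before any network call is made.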

Streaming Chat with Server-Sent Events

Enable streaming for real-time token delivery:

curl -N -X POST https://api.example.com/v1/chat \
  -H "Authorization: Bearer <API_KEY>" \
  -H "Accept: text/event-stream" \
  -H "Content-Type: application/json" \
  -d '{"message":"What is OpenRAG?","stream":true}'

The response delivers SSE events:


data: {"type":"content","delta":"OpenRAG is "}
data: {"type":"content","delta":"an open-source"}
...
data: {"type":"sources","sources":[{"filename":"doc.pdf","text":"..."}]}
data: {"type":"done","chat_id":"12345"}
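On the client side, a stream like the one above can be reassembled with a few lines of Python. This is a simplified sketch: parse_sse_lines is a hypothetical helper that handles only single-line data fields, which is all the sample events use, and the sample list stands in for lines read from the HTTP response.

```python
import json


def parse_sse_lines(lines):
    """Yield decoded JSON payloads from SSE 'data:' lines, skipping others."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # blank separators, comments, or other SSE fields
        yield json.loads(line[len("data:"):].strip())


# Stand-in for lines read from the response body.
sample = [
    'data: {"type":"content","delta":"OpenRAG is "}',
    "",
    'data: {"type":"content","delta":"an open-source"}',
    "",
    'data: {"type":"sources","sources":[{"filename":"doc.pdf","text":"..."}]}',
    "",
    'data: {"type":"done","chat_id":"12345"}',
]

answer, sources, chat_id = "", [], None
for event in parse_sse_lines(sample):
    if event["type"] == "content":
        answer += event["delta"]  # append each token delta in order
    elif event["type"] == "sources":
        sources = event["sources"]
    elif event["type"] == "done":
        chat_id = event["chat_id"]
```

Keeping the chat_id from the done event lets the client fetch or delete the conversation later through the other v1 endpoints.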

List User Conversations

Retrieve conversation metadata without message content:

curl -X GET https://api.example.com/v1/chat \
  -H "Authorization: Bearer <API_KEY>"

{
  "conversations": [
    {"chat_id":"12345","title":"RAG Overview","created_at":"2024-09-01T12:00:00Z","last_activity":"2024-09-01T12:05:00Z","message_count":3},
    {"chat_id":"67890","title":"Langflow Integration","created_at":"2024-09-02T08:30:00Z","last_activity":"2024-09-02T08:45:00Z","message_count":5}
  ]
}

Retrieve Complete Conversation History

Fetch a specific conversation with full message details and token usage:

curl -X GET https://api.example.com/v1/chat/12345 \
  -H "Authorization: Bearer <API_KEY>"

{
  "chat_id":"12345",
  "title":"RAG Overview",
  "created_at":"2024-09-01T12:00:00Z",
  "last_activity":"2024-09-01T12:05:00Z",
  "messages":[
    {"role":"user","content":"What is RAG?","timestamp":"2024-09-01T12:00:00Z"},
    {"role":"assistant","content":"RAG stands for ...","timestamp":"2024-09-01T12:01:10Z","usage":{"prompt_tokens":34,"completion_tokens":56}}
  ]
}
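Because each assistant message carries its own usage block, a client can total token consumption for a conversation. The sketch below works on the response shape shown above; total_usage is a hypothetical helper, and it assumes messages without a usage field (such as user messages) contribute zero.

```python
def total_usage(conversation):
    """Sum prompt and completion tokens across all messages."""
    totals = {"prompt_tokens": 0, "completion_tokens": 0}
    for message in conversation.get("messages", []):
        usage = message.get("usage") or {}  # user messages carry no usage
        for key in totals:
            totals[key] += usage.get(key, 0)
    return totals


# Trimmed copy of the example response above.
conversation = {
    "chat_id": "12345",
    "messages": [
        {"role": "user", "content": "What is RAG?"},
        {
            "role": "assistant",
            "content": "RAG stands for ...",
            "usage": {"prompt_tokens": 34, "completion_tokens": 56},
        },
    ],
}

totals = total_usage(conversation)
```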

Delete a Conversation

Remove a conversation permanently:

curl -X DELETE https://api.example.com/v1/chat/12345 \
  -H "Authorization: Bearer <API_KEY>"

{"success":true}

Summary

OpenRAG provides nine chat endpoints split between the public v1 API (src/api/v1/chat.py) and internal routes (src/api/chat.py).
Authentication differs by tier: API keys for v1 endpoints, internal session auth for UI routes.
Streaming support: v1 endpoints normalize Langflow streams to SSE format; internal endpoints stream raw chunks.
Conversation lifecycle: Endpoints support creation, listing, detailed retrieval, and deletion through chat_service methods like langflow_chat and delete_session.
RAG controls: The endpoints that create chat responses (POST /v1/chat, POST /chat, POST /langflow) accept filters, limit, and score_threshold parameters, which are stored in thread-local auth context for downstream retrieval.

Frequently Asked Questions

What is the difference between the v1 API and internal chat endpoints?

The v1 API (/v1/chat/*) in src/api/v1/chat.py is designed for external integrations and requires API-key authentication via get_api_key_user_async. It returns normalized SSE streams and structured JSON responses. Internal endpoints (/chat, /langflow) in src/api/chat.py use get_current_user for UI authentication, stream raw Langflow chunks directly, and provide additional history routes for interface components.

How does streaming work in OpenRAG chat operations?

When the stream parameter is true, the v1 chat_create_endpoint invokes _transform_stream_to_sse to convert Langflow chunks into standardized Server-Sent Events containing content deltas and source citations. Internal endpoints skip normalization and stream raw chunks directly to the UI, reducing latency but requiring client-side handling of the Langflow format.

Which endpoints support conversation history retrieval?

The v1 API provides GET /v1/chat for metadata-only listing and GET /v1/chat/{chat_id} for full message history with token usage. Internal routes offer GET /chat/history for raw format history and GET /langflow/history for Langflow-specific formatting, both calling their respective chat_service methods.

How are RAG search parameters handled across endpoints?

The endpoints that create chat responses (POST /v1/chat, POST /chat, POST /langflow) accept filters, limit, and score_threshold in the request body. These parameters are stored in a thread-local auth context via auth_context.set_search_filters before the chat service executes retrieval operations, ensuring document search respects the caller's constraints without modifying global state.
