# OpenRAG API Endpoints for Chat Operations: Complete Developer Guide

> Explore OpenRAG API endpoints for chat operations. Access the public v1 chat API and internal routes from the Langflow AI repository to integrate chat functionality seamlessly.

- Repository: [langflow-ai/openrag](https://github.com/langflow-ai/openrag)
- Tags: api-reference
- Published: 2026-03-13

---

**OpenRAG exposes nine HTTP endpoints for chat operations across two authentication tiers: a public v1 API (`/v1/chat/*`) protected by API keys and internal routes (`/chat`, `/langflow`) used by the UI, all implemented in [`src/api/v1/chat.py`](https://github.com/langflow-ai/openrag/blob/main/src/api/v1/chat.py) and [`src/api/chat.py`](https://github.com/langflow-ai/openrag/blob/main/src/api/chat.py).**

The `langflow-ai/openrag` repository implements a dual-layer chat system built on FastAPI. Its chat endpoints serve both programmatic integrations that authenticate with API keys and internal UI traffic, and they support streaming over Server-Sent Events (SSE), conversation persistence, and per-request retrieval-augmented generation (RAG) controls.

## Public v1 Chat API (API-Key Authenticated)

The version-1 API in [`src/api/v1/chat.py`](https://github.com/langflow-ai/openrag/blob/main/src/api/v1/chat.py) provides the primary external interface for chat operations. All four endpoints require API-key authentication via the `get_api_key_user_async` dependency. They return structured JSON responses, with one exception: `POST /v1/chat` returns a Server-Sent Events stream when `stream=true`.

### Create Chat Request (POST /v1/chat)

The `chat_create_endpoint` function handles new conversation initiation. It accepts a JSON payload containing the user message, optional streaming flags, and search filters, then returns the assistant's response along with source citations.

When `stream=true`, the endpoint invokes `_transform_stream_to_sse` to normalize Langflow chunks into Server-Sent Events. When `stream=false`, it returns a complete JSON payload containing the `response`, `chat_id`, and `sources` array.
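The normalization step can be sketched roughly as follows. This is an illustration, not the repository's `_transform_stream_to_sse`: the helper name `to_sse_events` and the upstream chunk shape (`{"text": ...}` or `{"sources": [...]}`) are assumptions, while the emitted event types mirror the SSE example shown later in this guide.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def to_sse_events(chunks: Iterable[Dict[str, Any]], chat_id: str) -> Iterator[str]:
    """Convert hypothetical upstream chunks into SSE-framed strings."""
    for chunk in chunks:
        if "text" in chunk:
            event = {"type": "content", "delta": chunk["text"]}
        elif "sources" in chunk:
            event = {"type": "sources", "sources": chunk["sources"]}
        else:
            continue  # skip chunk kinds this sketch does not model
        # An SSE event is a "data:" field terminated by a blank line.
        yield f"data: {json.dumps(event)}\n\n"
    # Signal completion and tell the client which conversation was used.
    done = {"type": "done", "chat_id": chat_id}
    yield f"data: {json.dumps(done)}\n\n"
```

Writing the transform as a generator means events can be forwarded as soon as each upstream chunk arrives, without buffering the full answer.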

This endpoint sets per-request authentication context using `auth_context.set_search_filters` and `auth_context.set_user_jwt` before calling `chat_service.langflow_chat`.

### List Conversations (GET /v1/chat)

The `chat_list_endpoint` retrieves metadata for all conversations belonging to the authenticated user. It calls `chat_service.get_langflow_history` and returns conversation titles, creation timestamps, and message counts without exposing full message content.
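A minimal sketch of the metadata-only reduction, assuming each history record is a dictionary with a `messages` list. The function name `to_conversation_summaries` and the input shape are hypothetical; the output fields follow the list response shown in the usage examples.

```python
from typing import Any, Dict, List


def to_conversation_summaries(history: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Reduce full conversation records to list-view metadata."""
    summaries = []
    for conv in history:
        summaries.append({
            "chat_id": conv.get("chat_id"),
            "title": conv.get("title"),
            "created_at": conv.get("created_at"),
            "last_activity": conv.get("last_activity"),
            # Only the count is reported; message bodies are left out.
            "message_count": len(conv.get("messages", [])),
        })
    return summaries
```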

### Retrieve Single Conversation (GET /v1/chat/{chat_id})

The `chat_get_endpoint` fetches complete message history for a specific `chat_id`, including token usage statistics for each assistant message. It searches the user's history and formats messages with roles, timestamps, and usage data.
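The lookup-and-format step might look like the sketch below. The record shape and the name `format_conversation` are assumptions made for illustration; only the output fields (`role`, `content`, `timestamp`, `usage`) come from the response example in this guide.

```python
from typing import Any, Dict, List, Optional


def format_conversation(history: List[Dict[str, Any]], chat_id: str) -> Optional[Dict[str, Any]]:
    """Find one conversation in the user's history and format its messages."""
    for conv in history:
        if conv.get("chat_id") != chat_id:
            continue
        messages = []
        for msg in conv.get("messages", []):
            item = {
                "role": msg.get("role"),
                "content": msg.get("content"),
                "timestamp": msg.get("timestamp"),
            }
            # Token usage is attached to assistant messages only.
            if msg.get("role") == "assistant" and "usage" in msg:
                item["usage"] = msg["usage"]
            messages.append(item)
        return {"chat_id": chat_id, "title": conv.get("title"), "messages": messages}
    return None  # no match; a handler could map this to a not-found response
```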

### Delete Conversation (DELETE /v1/chat/{chat_id})

The `chat_delete_endpoint` removes a specific conversation by delegating to `chat_service.delete_session`. It returns a success flag confirming deletion.

## Internal Chat Routes (UI and Service Integration)

The internal routes in [`src/api/chat.py`](https://github.com/langflow-ai/openrag/blob/main/src/api/chat.py) support the OpenRAG web interface and internal microservices. These endpoints use `get_current_user` for authentication and provide both Langflow-specific and generic chat capabilities.

### General Chat Endpoint (POST /chat)

The `chat_endpoint` function serves as the primary internal chat interface. It validates incoming payloads, applies search filters (retrieval limit, score threshold), and determines whether to return a JSON response or stream raw chunks directly based on the `stream` parameter.

Unlike the v1 endpoint, this route streams raw Langflow chunks without SSE transformation when streaming is enabled.

### Langflow-Specific Endpoint (POST /langflow)

The `langflow_endpoint` explicitly routes requests to `chat_service.langflow_chat`, mirroring the v1 chat creation logic but for internal consumers. This endpoint ensures UI components interact with the Langflow backend consistently.

### History and Session Management

Two endpoints provide raw history retrieval:

- **`chat_history_endpoint`** (GET /chat/history): Returns internal-format chat history via `chat_service.get_chat_history`
- **`langflow_history_endpoint`** (GET /langflow/history): Returns Langflow-formatted history via `chat_service.get_langflow_history`

The **`delete_session_endpoint`** (DELETE /session/{session_id}) handles session cleanup internally, calling the same `chat_service.delete_session` method used by the v1 API.

## Architecture and Implementation Details

### Dependency Injection Pattern

All endpoints utilize FastAPI's `Depends` system for clean dependency management. The `get_chat_service` factory injects the chat service instance, while `get_api_key_user_async` (v1) and `get_current_user` (internal) extract authentication credentials from request headers.

### Authentication Context Propagation

Before invoking chat logic, endpoints establish a thread-local **authentication context** through [`auth_context.py`](https://github.com/langflow-ai/openrag/blob/main/auth_context.py). This stores per-request search parameters (`filters`, `limit`, `score_threshold`) and the user JWT, ensuring downstream retrieval operations respect caller-specific constraints without passing parameters through every function call.
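The idea can be illustrated with a small sketch. The text above describes the context as thread-local; the sketch swaps in the standard-library `contextvars` module, which gives the same per-request isolation under asyncio. This is a substitution for illustration, not a claim about how `auth_context.py` is written. Only `set_search_filters` and `set_user_jwt` are named in this guide; their signatures and the `get_search_filters` getter are assumptions.

```python
import asyncio
from contextvars import ContextVar
from typing import Any, Dict, List, Optional

# One variable per piece of request-scoped state.
_search_filters: ContextVar[Optional[Dict[str, Any]]] = ContextVar("search_filters", default=None)
_user_jwt: ContextVar[Optional[str]] = ContextVar("user_jwt", default=None)


def set_search_filters(filters: Optional[Dict[str, Any]], limit: int, score_threshold: float) -> None:
    _search_filters.set({"filters": filters, "limit": limit, "score_threshold": score_threshold})


def set_user_jwt(token: str) -> None:
    _user_jwt.set(token)


def get_search_filters() -> Optional[Dict[str, Any]]:
    # Hypothetical getter that downstream retrieval code would call.
    return _search_filters.get()


async def handle_request(category: str) -> Optional[Dict[str, Any]]:
    set_search_filters({"category": category}, limit=5, score_threshold=0.7)
    await asyncio.sleep(0)  # let the other request run in between
    return get_search_filters()  # still sees its own values


async def main() -> List[Optional[Dict[str, Any]]]:
    # gather() runs each coroutine as a task with its own copy of the context.
    return await asyncio.gather(handle_request("ml"), handle_request("ops"))
```

Running `asyncio.run(main())` returns each request's own filters even though both set the same variable concurrently, which is the property the endpoints rely on.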

### Streaming Response Handling

The v1 API normalizes streaming responses through `_transform_stream_to_sse` in [`src/api/v1/chat.py`](https://github.com/langflow-ai/openrag/blob/main/src/api/v1/chat.py), converting Langflow chunks into standardized SSE events with content deltas and source citations. Internal endpoints stream raw chunks directly for lower latency in UI consumption.

### Error Handling Strategy

Each endpoint wraps service calls in `try`/`except` blocks that catch generic `Exception`, log stack traces via [`logging_config.py`](https://github.com/langflow-ai/openrag/blob/main/logging_config.py), and return HTTP 500 JSON responses. As a result, a failure in the underlying Langflow services reaches the client as a consistent JSON error rather than an unhandled exception.
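A framework-free sketch of that pattern follows. The helper name `call_with_500_fallback`, the logger name, and the `{"error": ...}` body shape are assumptions; the repository's handlers may structure the response differently.

```python
import logging
from typing import Any, Callable, Dict, Tuple

logger = logging.getLogger("chat_api_sketch")  # hypothetical logger name


def call_with_500_fallback(service_call: Callable[[], Dict[str, Any]]) -> Tuple[int, Dict[str, Any]]:
    """Run a service call and map any failure to a 500-style JSON payload."""
    try:
        return 200, service_call()
    except Exception as exc:
        # logger.exception records the message together with the stack trace.
        logger.exception("Chat service call failed")
        return 500, {"error": str(exc)}
```

In a FastAPI handler, the returned status and body would typically be passed to `JSONResponse(content=..., status_code=...)`.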

## Practical Usage Examples

### Non-Streaming Chat Request

Create a new conversation with RAG filters and receive a complete response:

```bash
curl -X POST https://api.example.com/v1/chat \
  -H "Authorization: Bearer <API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
        "message": "Explain Retrieval-Augmented Generation.",
        "stream": false,
        "filters": {"category": "ml"},
        "limit": 5,
        "score_threshold": 0.7
      }'
```

**Response:**

```json
{
  "response": "Retrieval-Augmented Generation (RAG) combines ...",
  "chat_id": "c4f2e9d1-...",
  "sources": [
    {"filename":"paper.pdf","text":"...","score":0.92,"page":12}
  ]
}
```
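The same request can be assembled from Python with only the standard library. This is a sketch: `build_chat_request` is a hypothetical helper, and the base URL and key are placeholders, as in the curl example.

```python
import json
import urllib.request
from typing import Any, Dict, Optional


def build_chat_request(
    base_url: str,
    api_key: str,
    message: str,
    *,
    stream: bool = False,
    filters: Optional[Dict[str, Any]] = None,
    limit: Optional[int] = None,
    score_threshold: Optional[float] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/chat request."""
    body: Dict[str, Any] = {"message": message, "stream": stream}
    # Include the optional RAG controls only when the caller sets them.
    if filters is not None:
        body["filters"] = filters
    if limit is not None:
        body["limit"] = limit
    if score_threshold is not None:
        body["score_threshold"] = score_threshold
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/chat",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` sends it; for a non-streaming call the response body can then be decoded with `json.load`.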

### Streaming Chat with Server-Sent Events

Enable streaming for real-time token delivery:

```bash
curl -N -X POST https://api.example.com/v1/chat \
  -H "Authorization: Bearer <API_KEY>" \
  -H "Accept: text/event-stream" \
  -H "Content-Type: application/json" \
  -d '{"message":"What is OpenRAG?","stream":true}'
```

The response delivers SSE events:

```
data: {"type":"content","delta":"OpenRAG is "}
data: {"type":"content","delta":"an open-source"}
...
data: {"type":"sources","sources":[{"filename":"doc.pdf","text":"..."}]}
data: {"type":"done","chat_id":"12345"}
```
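On the client side, a stream of this shape can be folded back into a full answer. The sketch below assumes each event arrives as a single `data:` line carrying a JSON object, as in the sample above; `collect_stream` is a hypothetical helper, and a production client would want a full SSE parser.

```python
import json
from typing import Any, Dict, Iterable, List, Optional, Tuple


def collect_stream(lines: Iterable[str]) -> Tuple[str, List[Dict[str, Any]], Optional[str]]:
    """Accumulate content deltas, sources, and the chat_id from SSE lines."""
    text_parts: List[str] = []
    sources: List[Dict[str, Any]] = []
    chat_id: Optional[str] = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # ignore blank separators and non-data fields
        try:
            event = json.loads(line[len("data:"):].strip())
        except json.JSONDecodeError:
            continue  # skip payloads that are not JSON
        if not isinstance(event, dict):
            continue
        kind = event.get("type")
        if kind == "content":
            text_parts.append(event.get("delta", ""))
        elif kind == "sources":
            sources.extend(event.get("sources", []))
        elif kind == "done":
            chat_id = event.get("chat_id")
            break  # the stream is finished
    return "".join(text_parts), sources, chat_id
```

Because the function accepts any iterable of strings, it can be fed decoded lines from an HTTP response or a fixed list in a unit test.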

### List User Conversations

Retrieve conversation metadata without message content:

```bash
curl -X GET https://api.example.com/v1/chat \
  -H "Authorization: Bearer <API_KEY>"
```

```json
{
  "conversations": [
    {"chat_id":"12345","title":"RAG Overview","created_at":"2024-09-01T12:00:00Z","last_activity":"2024-09-01T12:05:00Z","message_count":3},
    {"chat_id":"67890","title":"Langflow Integration","created_at":"2024-09-02T08:30:00Z","last_activity":"2024-09-02T08:45:00Z","message_count":5}
  ]
}
```

### Retrieve Complete Conversation History

Fetch a specific conversation with full message details and token usage:

```bash
curl -X GET https://api.example.com/v1/chat/12345 \
  -H "Authorization: Bearer <API_KEY>"
```

```json
{
  "chat_id":"12345",
  "title":"RAG Overview",
  "created_at":"2024-09-01T12:00:00Z",
  "last_activity":"2024-09-01T12:05:00Z",
  "messages":[
    {"role":"user","content":"What is RAG?","timestamp":"2024-09-01T12:00:00Z"},
    {"role":"assistant","content":"RAG stands for ...","timestamp":"2024-09-01T12:01:10Z","usage":{"prompt_tokens":34,"completion_tokens":56}}
  ]
}
```

### Delete a Conversation

Remove a conversation permanently:

```bash
curl -X DELETE https://api.example.com/v1/chat/12345 \
  -H "Authorization: Bearer <API_KEY>"
```

```json
{"success":true}
```

## Summary

- **OpenRAG provides nine chat endpoints** split between the public v1 API ([`src/api/v1/chat.py`](https://github.com/langflow-ai/openrag/blob/main/src/api/v1/chat.py)) and internal routes ([`src/api/chat.py`](https://github.com/langflow-ai/openrag/blob/main/src/api/chat.py)).
- **Authentication differs by tier**: API keys for v1 endpoints, internal session auth for UI routes.
- **Streaming support**: `POST /v1/chat` normalizes Langflow streams to SSE format; internal endpoints stream raw chunks.
- **Conversation lifecycle**: Endpoints support creation, listing, detailed retrieval, and deletion through `chat_service` methods like `langflow_chat` and `delete_session`.
- **RAG controls**: The endpoints that send a message (`POST /v1/chat`, `POST /chat`, `POST /langflow`) accept `filters`, `limit`, and `score_threshold` parameters, stored in thread-local auth context for downstream retrieval.

## Frequently Asked Questions

### What is the difference between the v1 API and internal chat endpoints?

The **v1 API** (`/v1/chat/*`) in [`src/api/v1/chat.py`](https://github.com/langflow-ai/openrag/blob/main/src/api/v1/chat.py) is designed for external integrations and requires API-key authentication via `get_api_key_user_async`. It returns normalized SSE streams and structured JSON responses. **Internal endpoints** (`/chat`, `/langflow`) in [`src/api/chat.py`](https://github.com/langflow-ai/openrag/blob/main/src/api/chat.py) use `get_current_user` for UI authentication, stream raw Langflow chunks directly, and provide additional history routes for interface components.

### How does streaming work in OpenRAG chat operations?

When the `stream` parameter is `true`, the v1 `chat_create_endpoint` invokes `_transform_stream_to_sse` to convert Langflow chunks into standardized Server-Sent Events containing content deltas and source citations. Internal endpoints skip normalization and stream raw chunks directly to the UI, reducing latency but requiring client-side handling of the Langflow format.

### Which endpoints support conversation history retrieval?

The v1 API provides `GET /v1/chat` for metadata-only listing and `GET /v1/chat/{chat_id}` for full message history with token usage. Internal routes offer `GET /chat/history` for raw format history and `GET /langflow/history` for Langflow-specific formatting, both calling their respective `chat_service` methods.

### How are RAG search parameters handled across endpoints?

The endpoints that send a message (`POST /v1/chat`, `POST /chat`, `POST /langflow`) accept `filters`, `limit`, and `score_threshold` in the request body. These parameters are stored in a thread-local **auth context** via `auth_context.set_search_filters` before the chat service executes retrieval operations, so document search respects the caller's constraints without modifying global state.