How RAGFlow Handles Authentication and Authorization for API Endpoints

RAGFlow secures its REST API with a two-layer mechanism: _load_user in api/apps/__init__.py validates UUID-style tokens from the Authorization header, and the @login_required decorator ensures that only authenticated users reach protected routes.

RAGFlow is an open-source retrieval-augmented generation (RAG) engine that exposes a comprehensive REST API for document processing and conversational AI. Understanding how RAGFlow authentication and authorization work is essential for developers integrating with the platform or extending its capabilities. The system implements a token-based security model that distinguishes between temporary session tokens for browser clients and permanent API tokens for programmatic access.

The Two-Layer Security Model

RAGFlow implements a clear separation between verifying identity and enforcing access control:

| Layer | Component | Location | Purpose |
| --- | --- | --- | --- |
| Authentication | _load_user | api/apps/__init__.py (lines 94-124) | Validates tokens from the Authorization header against the User or APIToken tables |
| Authorization | login_required | api/apps/__init__.py (lines 144-180) | Decorator that raises QuartAuthUnauthorized if g.user is missing |

This architecture ensures that authentication logic is centralized while individual routes can declaratively require authentication using the decorator.

Authentication Implementation

User Login and Session Token Generation

When a user authenticates via the /user/login endpoint defined in api/apps/user_app.py, the system generates a UUID-style access token and stores it in the User table. The relevant logic appears in lines 124-133:


# api/apps/user_app.py – login()

response_data = user.to_json()
user.access_token = get_uuid()
login_user(user)               # saves session data

...
return await construct_response(..., auth=user.get_id())

The construct_response function in common/connection_utils.py (lines 4-15) sets the Authorization header on the HTTP response that carries the JSON payload, allowing clients to capture the token for subsequent requests.
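From the client side, capturing that token might look like the following minimal sketch. It assumes only what is described above: the login response carries the token in an Authorization response header, and the session token is sent back as the raw header value. The helper names token_from_login_headers and auth_headers are hypothetical and not part of RAGFlow, and the header mapping is a hand-written stand-in for a real HTTP response.

```python
from typing import Mapping, Optional


def token_from_login_headers(headers: Mapping[str, str]) -> Optional[str]:
    """Return the token carried in the Authorization response header, if any."""
    for name, value in headers.items():
        # HTTP header names are case-insensitive, so compare in lower case.
        if name.lower() == "authorization":
            value = value.strip()
            return value or None
    return None


def auth_headers(token: str) -> dict:
    """Build request headers that send the captured token back to the server."""
    return {"Authorization": token}


# Hypothetical header mapping standing in for the response to /user/login.
login_response_headers = {
    "Content-Type": "application/json",
    "authorization": "3f2a9c",
}
token = token_from_login_headers(login_response_headers)
next_request_headers = auth_headers(token) if token else {}
```

With a real HTTP client, the same two helpers would wrap the response headers of the login call and the request headers of every later call.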

API Token Authentication for Programmatic Access

For service-to-service communication, RAGFlow supports permanent API tokens stored in the api_token table. The APIToken model is defined in api/db/db_models.py (lines 80-90) with a composite primary key of tenant_id and token.

When _load_user processes a request, it first attempts to match the header value against the User table. If that fails and the header contains a Bearer prefix, it falls back to the APIToken table (lines 124-132 in api/apps/__init__.py):


# token-lookup fallback

objs = APIToken.query(token=authorization.split()[1])
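Note that an expression such as authorization.split()[1] raises IndexError when the header has no second part (for example, a bare "Bearer"). A defensive variant could look like the sketch below; parse_bearer is a hypothetical helper for illustration, not a function from the RAGFlow codebase.

```python
from typing import Optional


def parse_bearer(authorization: Optional[str]) -> Optional[str]:
    """Return the token from a 'Bearer <token>' header, or None if malformed."""
    if not authorization:
        return None
    parts = authorization.split()
    # Expect exactly two parts: the scheme name and the token itself.
    if len(parts) != 2 or parts[0].lower() != "bearer":
        return None
    return parts[1]
```

Returning None instead of raising lets the caller treat a malformed header the same way as a missing one.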

The _load_user Function

The core authentication logic resides in _load_user within api/apps/__init__.py (lines 94-124). This function:

  1. Extracts the Authorization header from the request
  2. Queries the User table by access_token
  3. Falls back to APIToken.query() for bearer tokens
  4. Populates g.user with the authenticated entity
  5. Logs warnings and returns None on validation failures, triggering a 401 response downstream
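The steps above can be sketched without a web framework or database. In this illustration the dictionaries are in-memory stand-ins for the User and APIToken tables, the data is made up, and resolving an API token to a user through its tenant_id is an assumption made for the example rather than something this article confirms. The function returns the user instead of writing to g.user, so the caller decides where to store it.

```python
import logging
from typing import Optional

# Hypothetical in-memory stand-ins for the User and APIToken tables.
USERS_BY_ACCESS_TOKEN = {"session-token-1": {"id": "u1", "tenant_id": "t1"}}
TENANT_BY_API_TOKEN = {"api-token-9": "t1"}
USERS_BY_TENANT = {"t1": {"id": "u1", "tenant_id": "t1"}}


def load_user(authorization: Optional[str]) -> Optional[dict]:
    """Resolve an Authorization header value to a user, or None on failure."""
    # Step 1: the header must be present.
    if not authorization:
        logging.warning("missing Authorization header")
        return None

    # Step 2: try the value as a session token.
    user = USERS_BY_ACCESS_TOKEN.get(authorization)
    if user is not None:
        return user

    # Step 3: fall back to the API token table for bearer tokens.
    parts = authorization.split()
    if len(parts) == 2 and parts[0].lower() == "bearer":
        tenant_id = TENANT_BY_API_TOKEN.get(parts[1])
        if tenant_id is not None:
            # Assumption: an API token maps to a user via its tenant.
            return USERS_BY_TENANT.get(tenant_id)

    # Step 5: nothing matched, so the caller should answer with 401.
    logging.warning("authorization token did not match any user")
    return None
```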

Authorization and Route Protection

The @login_required Decorator

Once authentication populates g.user, the @login_required decorator (lines 144-180 in api/apps/__init__.py) enforces authorization. This decorator:

  • Checks if current_user (which invokes _load_user) returns a valid user object
  • Raises QuartAuthUnauthorized if the check fails
  • Allows the route handler to execute only when g.user is present
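The behaviour in the list above can be illustrated with a framework-free sketch. Everything here is a stand-in: Unauthorized plays the role of QuartAuthUnauthorized, g is a plain namespace rather than the real request context, and call is a hypothetical helper that maps the exception to a 401-style result. It is not the decorator from api/apps/__init__.py.

```python
import asyncio
import functools
from types import SimpleNamespace


class Unauthorized(Exception):
    """Hypothetical stand-in for QuartAuthUnauthorized."""


# Stand-in for the per-request context object `g`.
g = SimpleNamespace(user=None)


def login_required(handler):
    """Run the wrapped coroutine only when g.user is set."""

    @functools.wraps(handler)
    async def wrapper(*args, **kwargs):
        if getattr(g, "user", None) is None:
            raise Unauthorized("authentication required")
        return await handler(*args, **kwargs)

    return wrapper


@login_required
async def whoami():
    """Example protected handler that returns the current user's id."""
    return g.user["id"]


def call(handler):
    """Run a handler and map Unauthorized to a 401-style (status, body) pair."""
    try:
        return 200, asyncio.run(handler())
    except Unauthorized:
        return 401, None
```

Using functools.wraps keeps the handler's name intact, which matters for frameworks that register routes by function name.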

Protected Endpoint Examples

Protected routes use the decorator declaratively. For example, the token creation endpoint in api/apps/system_app.py (lines 21-23) requires authentication:

@manager.route("/new_token", methods=["POST"])
@login_required
def new_token():
    ...  # handler body omitted

Similarly, tenant queries, conversation APIs, and document management endpoints all employ @login_required to ensure only authenticated users access these resources.

API Token Lifecycle Management

Creating Permanent API Tokens

Programmatic clients generate long-lived tokens via the /system/new_token endpoint. The service layer in api/db/services/api_service.py provides APITokenService with methods including used() (updates update_time on each use) and delete_by_tenant_id() (cleans up tokens). These appear in lines 30-42.

When a token is created, the system stores it in the api_token table with the following structure from api/db/db_models.py (lines 80-90):

  • tenant_id: Identifies the organization/workspace
  • token: The actual token string (part of composite primary key)
  • beta: Additional metadata field
  • create_time/update_time: Timestamps for lifecycle management
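To make the composite key concrete, the sketch below models the listed fields as a plain dataclass and uses a dictionary keyed by (tenant_id, token) as a stand-in for the table. The class name APITokenRecord, the field types, and the insert helper are assumptions for illustration; the actual column types in api/db/db_models.py may differ.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, Optional, Tuple


def _now() -> datetime:
    """Current UTC time, used as the default for both timestamps."""
    return datetime.now(timezone.utc)


@dataclass
class APITokenRecord:
    """Hypothetical in-memory mirror of one api_token row."""

    tenant_id: str
    token: str
    beta: Optional[str] = None
    create_time: datetime = field(default_factory=_now)
    update_time: datetime = field(default_factory=_now)

    @property
    def key(self) -> Tuple[str, str]:
        # The composite primary key: one row per (tenant_id, token) pair.
        return (self.tenant_id, self.token)


# Dictionary standing in for the api_token table.
table: Dict[Tuple[str, str], APITokenRecord] = {}


def insert(record: APITokenRecord) -> None:
    """Insert a record, rejecting duplicates of the composite key."""
    if record.key in table:
        raise ValueError(f"duplicate key: {record.key}")
    table[record.key] = record
```

Under this model the same token string may appear under two different tenants, but not twice under the same tenant.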

Token Storage and Service Layer

The APITokenService class in api/db/services/api_service.py (lines 30-42) handles database operations:

from datetime import datetime  # needed for datetime.now() below

class APITokenService(CommonService):
    @classmethod
    def used(cls, token):
        # Updates the update_time timestamp
        return cls.update_by_id(token.id, {"update_time": datetime.now()})

    @classmethod
    def delete_by_tenant_id(cls, tenant_id):
        # Cleanup method for tenant deletion
        return cls.delete({"tenant_id": tenant_id})

Response Handling and Token Injection

All endpoints utilize construct_response() (async) or sync_construct_response() to build JSON payloads. Located in common/connection_utils.py (lines 4-15), this utility automatically injects the Authorization header when issuing new tokens:


# common/connection_utils.py – construct_response()

async def construct_response(...):
    ...
    if auth:
        response.headers["Authorization"] = auth

This ensures clients receive the token in the response headers as soon as it is issued and can send it back in the Authorization header of subsequent requests.

Summary

  • RAGFlow authentication and authorization rely on a two-layer security model that separates identity verification from access control.
  • The _load_user function in api/apps/__init__.py (lines 94-124) validates UUID-style tokens from the Authorization header against either the User table (session tokens) or the APIToken table (programmatic tokens).
  • The @login_required decorator in api/apps/__init__.py (lines 144-180) enforces authorization by raising QuartAuthUnauthorized when g.user is missing.
  • Users obtain short-lived session tokens via /user/login in api/apps/user_app.py, while services obtain permanent API tokens via /system/new_token in api/apps/system_app.py.
  • The APIToken model in api/db/db_models.py and APITokenService in api/db/services/api_service.py manage token lifecycle including usage tracking and cleanup.
  • The construct_response utility in common/connection_utils.py automatically injects Authorization headers into responses when new tokens are issued.

Frequently Asked Questions

How does RAGFlow validate API tokens on incoming requests?

RAGFlow validates tokens through the _load_user function in api/apps/__init__.py (lines 94-124). This function extracts the Authorization header and first attempts to match it against the access_token field in the User table. If no match is found and the header contains a Bearer prefix, it falls back to querying the APIToken table using APIToken.query(token=authorization.split()[1]) as implemented in lines 124-132.

What is the difference between session tokens and API tokens in RAGFlow?

Session tokens are short-lived UUID-style tokens generated during user login via /user/login in api/apps/user_app.py (lines 124-133) and stored in the User table's access_token field. API tokens are long-lived tokens created through /system/new_token for programmatic access, stored in the api_token table with a composite key of tenant_id and token as defined in api/db/db_models.py (lines 80-90).

How do I protect a new API endpoint in RAGFlow?

Protect endpoints by importing and applying the @login_required decorator from api/apps/__init__.py (lines 144-180). This decorator checks that current_user returns a valid user object populated by _load_user. If g.user is missing, it raises QuartAuthUnauthorized, which translates to a 401 JSON error. Example usage appears in api/apps/system_app.py (lines 21-23) where the new_token endpoint is decorated.

Where does RAGFlow inject the Authorization header in responses?

RAGFlow injects the Authorization header through the construct_response() utility in common/connection_utils.py (lines 4-15). When a new token is issued during login or API token creation, the function receives the auth parameter and sets response.headers["Authorization"] = auth before returning the JSON payload. Clients therefore receive the token in the response headers as soon as it is issued and can send it back in the Authorization header of subsequent requests.
