How RAGFlow Handles Authentication and Authorization for API Endpoints
RAGFlow secures its REST API using a two-layer mechanism where _load_user in api/apps/__init__.py validates UUID-style tokens from the Authorization header, and the @login_required decorator enforces that only authenticated users access protected routes.
RAGFlow is an open-source retrieval-augmented generation (RAG) engine that exposes a comprehensive REST API for document processing and conversational AI. Understanding how RAGFlow authentication and authorization work is essential for developers integrating with the platform or extending its capabilities. The system implements a token-based security model that distinguishes between temporary session tokens for browser clients and permanent API tokens for programmatic access.
The Two-Layer Security Model
RAGFlow implements a clear separation between verifying identity and enforcing access control:
| Layer | Component | Location | Purpose |
|---|---|---|---|
| Authentication | _load_user | api/apps/__init__.py (lines 94-124) | Validates tokens from the Authorization header against the User or APIToken tables |
| Authorization | login_required | api/apps/__init__.py (lines 144-180) | Decorator that raises QuartAuthUnauthorized if g.user is missing |
This architecture ensures that authentication logic is centralized while individual routes can declaratively require authentication using the decorator.
Authentication Implementation
User Login and Session Token Generation
When a user authenticates via the /user/login endpoint defined in api/apps/user_app.py, the system generates a UUID-style access token and stores it in the User table. The relevant logic appears in lines 124-133:
# api/apps/user_app.py – login()
response_data = user.to_json()
user.access_token = get_uuid()
login_user(user) # saves session data
...
return await construct_response(..., auth=user.get_id())
The construct_response function in common/connection_utils.py (lines 4-15) automatically injects the Authorization header into the JSON response, allowing clients to capture the token for subsequent requests.
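The login flow described above can be sketched in plain Python. This is a minimal illustration, not RAGFlow's code: the local get_uuid helper, the issue_session_token function, and the dict-based user are all assumptions made for the example, and the real user.get_id() may encode the token differently before it is placed in the header.

```python
import uuid


def get_uuid() -> str:
    # Assumption: a 32-character hex string; RAGFlow's own get_uuid may differ.
    return uuid.uuid4().hex


def issue_session_token(user: dict) -> dict:
    """Assign a fresh access token to the user and return response headers."""
    user["access_token"] = get_uuid()
    # Mirror the article's description: the token travels back in a header.
    return {"Authorization": user["access_token"]}


# Hypothetical user record standing in for a row of the User table.
user = {"id": "u1", "email": "alice@example.com"}
headers = issue_session_token(user)
```

Each call replaces the stored token, so in this sketch a new login invalidates whatever token the user held before.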
API Token Authentication for Programmatic Access
For service-to-service communication, RAGFlow supports permanent API tokens stored in the api_token table. The APIToken model is defined in api/db/db_models.py (lines 80-90) with a composite primary key of tenant_id and token.
When _load_user processes a request, it first attempts to match the header value against the User table. If that fails and the header contains a Bearer prefix, it falls back to the APIToken table (lines 124-132 in api/apps/__init__.py):
# token-lookup fallback
objs = APIToken.query(token=authorization.split()[1])
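Note that authorization.split()[1] raises IndexError when the header has no second part (for example, a bare "Bearer"). A defensive parser could look like the sketch below; extract_bearer_token is a hypothetical helper written for this article and is not part of RAGFlow.

```python
from typing import Optional


def extract_bearer_token(authorization: Optional[str]) -> Optional[str]:
    """Return the token from a 'Bearer <token>' header, or None if malformed."""
    if not authorization:
        return None
    parts = authorization.split()
    # Expect exactly two parts: the scheme and the token.
    if len(parts) != 2 or parts[0].lower() != "bearer":
        return None
    return parts[1]
```

Returning None instead of raising lets the caller treat a malformed header the same way as a missing one.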
The _load_user Function
The core authentication logic resides in _load_user within api/apps/__init__.py (lines 94-124). This function:
- Extracts the Authorization header from the request
- Queries the User table by access_token
- Falls back to APIToken.query() for bearer tokens
- Populates g.user with the authenticated entity
- Logs warnings and returns None on validation failures, triggering a 401 response downstream
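The lookup order in the steps above can be sketched with in-memory dictionaries standing in for the database tables. Everything here is hypothetical: the dictionary names, the load_user function, and in particular the step that maps an API token to a tenant and then to a user are assumptions made for illustration, not a copy of _load_user.

```python
import logging
from typing import Optional

# Hypothetical in-memory stand-ins for the User and APIToken tables.
USERS_BY_ACCESS_TOKEN = {"session-token-1": {"id": "u1", "tenant_id": "t1"}}
TENANT_BY_API_TOKEN = {"api-token-1": "t1"}
USERS_BY_TENANT = {"t1": {"id": "u1", "tenant_id": "t1"}}


def load_user(headers: dict) -> Optional[dict]:
    """Resolve the caller from the Authorization header, or return None."""
    authorization = headers.get("Authorization")
    if not authorization:
        return None

    # 1. Try the header value as a session token.
    user = USERS_BY_ACCESS_TOKEN.get(authorization)
    if user is not None:
        return user

    # 2. Fall back to an API token sent as "Bearer <token>".
    parts = authorization.split()
    if len(parts) == 2 and parts[0] == "Bearer":
        tenant_id = TENANT_BY_API_TOKEN.get(parts[1])
        if tenant_id is not None:
            return USERS_BY_TENANT.get(tenant_id)

    # 3. Nothing matched: log and let the caller turn None into a 401.
    logging.warning("Authentication failed: unrecognized token")
    return None
```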
Authorization and Route Protection
The @login_required Decorator
Once authentication populates g.user, the @login_required decorator (lines 144-180 in api/apps/__init__.py) enforces authorization. This decorator:
- Checks if current_user (which invokes _load_user) returns a valid user object
- Raises QuartAuthUnauthorized if the check fails
- Allows the route handler to execute only when g.user is present
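The decorator pattern described above can be approximated without Quart. In this sketch, current_user_var is a hypothetical per-request slot standing in for g.user, Unauthorized stands in for QuartAuthUnauthorized, and the handler is written as async for the example; none of these names come from RAGFlow.

```python
import asyncio
import functools
from contextvars import ContextVar
from typing import Optional

# Hypothetical per-request slot standing in for g.user.
current_user_var: ContextVar[Optional[dict]] = ContextVar("current_user", default=None)


class Unauthorized(Exception):
    """Stand-in for QuartAuthUnauthorized in this sketch."""


def login_required(handler):
    """Run the wrapped async handler only when a user is present."""
    @functools.wraps(handler)
    async def wrapper(*args, **kwargs):
        if current_user_var.get() is None:
            raise Unauthorized("authentication required")
        return await handler(*args, **kwargs)
    return wrapper


@login_required
async def new_token():
    return {"code": 0}


async def demo():
    # First call: no user has been set, so the decorator rejects it.
    try:
        await new_token()
    except Unauthorized:
        rejected = True
    else:
        rejected = False
    # Second call: simulate successful authentication, then retry.
    current_user_var.set({"id": "u1"})
    allowed = await new_token()
    return rejected, allowed


rejected, allowed = asyncio.run(demo())
```

Keeping the check in a decorator means the handler body never has to test for a missing user itself.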
Protected Endpoint Examples
Protected routes use the decorator declaratively. For example, the token creation endpoint in api/apps/system_app.py (lines 21-23) requires authentication:
@manager.route("/new_token", methods=["POST"])
@login_required
def new_token():
    …
Similarly, tenant queries, conversation APIs, and document management endpoints all employ @login_required to ensure only authenticated users access these resources.
API Token Lifecycle Management
Creating Permanent API Tokens
Programmatic clients generate long-lived tokens via the /system/new_token endpoint. The service layer in api/db/services/api_service.py provides APITokenService with methods including used() (updates update_time on each use) and delete_by_tenant_id() (cleans up tokens). These appear in lines 30-42.
When a token is created, the system stores it in the api_token table with the following structure from api/db/db_models.py (lines 80-90):
- tenant_id: Identifies the organization/workspace
- token: The actual token string (part of composite primary key)
- beta: Additional metadata field
- create_time / update_time: Timestamps for lifecycle management
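To make the composite key concrete, the sketch below creates a comparable table in an in-memory SQLite database. The column types are assumptions chosen for the example and RAGFlow's actual schema may differ; the point is only that the pair (tenant_id, token) must be unique, while one tenant may hold several tokens.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE api_token (
        tenant_id   TEXT NOT NULL,
        token       TEXT NOT NULL,
        beta        TEXT,
        create_time INTEGER,
        update_time INTEGER,
        PRIMARY KEY (tenant_id, token)
    )
    """
)

# One tenant can own several distinct tokens.
conn.execute("INSERT INTO api_token VALUES ('t1', 'tok-a', NULL, 1, 1)")
conn.execute("INSERT INTO api_token VALUES ('t1', 'tok-b', NULL, 2, 2)")

# Repeating the same (tenant_id, token) pair violates the primary key.
try:
    conn.execute("INSERT INTO api_token VALUES ('t1', 'tok-a', NULL, 3, 3)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM api_token").fetchone()[0]
```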
Token Storage and Service Layer
The APITokenService class in api/db/services/api_service.py (lines 30-42) handles database operations:
class APITokenService(CommonService):
    @classmethod
    def used(cls, token):
        # Updates the update_time timestamp
        return cls.update_by_id(token.id, {"update_time": datetime.now()})

    @classmethod
    def delete_by_tenant_id(cls, tenant_id):
        # Cleanup method for tenant deletion
        return cls.delete({"tenant_id": tenant_id})
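The two lifecycle operations can be exercised without a database. The sketch below uses a hypothetical InMemoryTokenStore class with a dictionary keyed by (tenant_id, token); the create method and the optional now parameter are additions for the example and do not correspond to anything in APITokenService.

```python
import time


class InMemoryTokenStore:
    """Hypothetical stand-in for the api_token table and its service layer."""

    def __init__(self):
        self.rows = {}  # (tenant_id, token) -> row dict

    def create(self, tenant_id, token, now=None):
        ts = time.time() if now is None else now
        self.rows[(tenant_id, token)] = {
            "tenant_id": tenant_id,
            "token": token,
            "create_time": ts,
            "update_time": ts,
        }

    def used(self, token, now=None):
        """Refresh update_time for every row holding this token; return the count."""
        ts = time.time() if now is None else now
        updated = 0
        for (_, stored_token), row in self.rows.items():
            if stored_token == token:
                row["update_time"] = ts
                updated += 1
        return updated

    def delete_by_tenant_id(self, tenant_id):
        """Remove all tokens owned by a tenant; return how many were removed."""
        keys = [key for key in self.rows if key[0] == tenant_id]
        for key in keys:
            del self.rows[key]
        return len(keys)


store = InMemoryTokenStore()
store.create("t1", "tok-a", now=100.0)
store.create("t2", "tok-b", now=100.0)
touched = store.used("tok-a", now=200.0)
last_used = store.rows[("t1", "tok-a")]["update_time"]
removed = store.delete_by_tenant_id("t1")
```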
Response Handling and Token Injection
All endpoints utilize construct_response() (async) or sync_construct_response() to build JSON payloads. Located in common/connection_utils.py (lines 4-15), this utility automatically injects the Authorization header when issuing new tokens:
# common/connection_utils.py – construct_response()
async def construct_response(...):
    ...
    if auth:
        response.headers["Authorization"] = auth
This lets clients read the new token directly from the response headers and send it back in the Authorization header of subsequent requests.
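On the client side, the token can be picked up from the response headers and reused. The helper below is a hypothetical sketch that works on any mapping of header names to values, so it does not assume a particular HTTP library.

```python
from typing import Mapping, Optional


def auth_headers_from_login(response_headers: Mapping[str, str]) -> Optional[dict]:
    """Build headers for follow-up requests from a login response's headers."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    for name, value in response_headers.items():
        if name.lower() == "authorization" and value:
            return {"Authorization": value}
    return None
```

A client would call this once after login and pass the returned dictionary as the headers of every later request; a None result means the login response carried no token.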
Summary
- RAGFlow authentication and authorization rely on a two-layer security model separating identity verification from access control.
- The _load_user function in api/apps/__init__.py (lines 94-124) validates UUID-style tokens from the Authorization header against either the User table (session tokens) or the APIToken table (programmatic tokens).
- The @login_required decorator in api/apps/__init__.py (lines 144-180) enforces authorization by raising QuartAuthUnauthorized when g.user is missing.
- Users obtain short-lived session tokens via /user/login in api/apps/user_app.py, while services obtain permanent API tokens via /system/new_token in api/apps/system_app.py.
- The APIToken model in api/db/db_models.py and APITokenService in api/db/services/api_service.py manage token lifecycle, including usage tracking and cleanup.
- The construct_response utility in common/connection_utils.py automatically injects Authorization headers into responses when new tokens are issued.
Frequently Asked Questions
How does RAGFlow validate API tokens on incoming requests?
RAGFlow validates tokens through the _load_user function in api/apps/__init__.py (lines 94-124). This function extracts the Authorization header and first attempts to match it against the access_token field in the User table. If no match is found and the header contains a Bearer prefix, it falls back to querying the APIToken table using APIToken.query(token=authorization.split()[1]) as implemented in lines 124-132.
What is the difference between session tokens and API tokens in RAGFlow?
Session tokens are short-lived UUID-style tokens generated during user login via /user/login in api/apps/user_app.py (lines 124-133) and stored in the User table's access_token field. API tokens are long-lived tokens created through /system/new_token for programmatic access, stored in the api_token table with a composite key of tenant_id and token as defined in api/db/db_models.py (lines 80-90).
How do I protect a new API endpoint in RAGFlow?
Protect endpoints by importing and applying the @login_required decorator from api/apps/__init__.py (lines 144-180). This decorator checks that current_user returns a valid user object populated by _load_user. If g.user is missing, it raises QuartAuthUnauthorized, which translates to a 401 JSON error. Example usage appears in api/apps/system_app.py (lines 21-23) where the new_token endpoint is decorated.
Where does RAGFlow inject the Authorization header in responses?
RAGFlow injects the Authorization header through the construct_response() utility in common/connection_utils.py (lines 4-15). When a new token is issued during login or API token creation, the function receives the auth parameter and sets response.headers["Authorization"] = auth before returning the JSON payload. Clients can then read the token from the response headers and send it back in the Authorization header of subsequent requests.