How to Configure OAuth for OpenRAG: Complete Setup for Google Drive and Microsoft Graph

OpenRAG configures OAuth through declarative environment variables managed by the EnvManager, which validates credentials and injects them into connector classes for Google Drive and Microsoft Graph authentication.

Configuring OAuth for the langflow-ai/openrag repository enables secure authentication with external document sources like Google Drive, OneDrive, and SharePoint. The system uses a centralized EnvManager to load and validate OAuth credentials from environment variables, ensuring that client secrets and tokens flow securely to the appropriate connector classes without hardcoded values.

OAuth Configuration Architecture in OpenRAG

The OpenRAG codebase implements a layered validation system that separates credential storage from connector implementation. At startup, the EnvManager class (located in src/tui/managers/env_manager.py) reads a .env file from the default path ~/.openrag/.env and populates an EnvConfig dataclass.
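
To make the loading step concrete, here is a minimal sketch of reading KEY=VALUE lines into a dataclass. It is not the EnvManager source: `EnvConfigSketch`, `ENV_ATTR_MAP`, `parse_env_text`, and `load_env_file` are hypothetical names, and only the two Google variables are mapped.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class EnvConfigSketch:
    """Hypothetical stand-in for EnvConfig; field names are assumptions."""
    google_oauth_client_id: str = ""
    google_oauth_client_secret: str = ""


# Hypothetical mapping from .env keys to dataclass attributes.
ENV_ATTR_MAP = {
    "GOOGLE_OAUTH_CLIENT_ID": "google_oauth_client_id",
    "GOOGLE_OAUTH_CLIENT_SECRET": "google_oauth_client_secret",
}


def parse_env_text(text: str) -> EnvConfigSketch:
    """Populate the config from KEY=VALUE lines, skipping blanks and comments."""
    config = EnvConfigSketch()
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        attr = ENV_ATTR_MAP.get(key.strip())
        if attr is not None:
            # Drop surrounding whitespace and optional quotes around the value.
            setattr(config, attr, value.strip().strip("'\""))
    return config


def load_env_file(path: Path = Path.home() / ".openrag" / ".env") -> EnvConfigSketch:
    """Read the file if it exists; otherwise return an empty config."""
    return parse_env_text(path.read_text()) if path.exists() else EnvConfigSketch()
```

Unknown keys are ignored in this sketch, so unrelated settings in the same file do not interfere with the OAuth fields.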

The environment variable mapping occurs in EnvManager._env_attr_map() (lines 81‑86), which defines the relationship between external OAuth providers and internal configuration attributes. For Google authentication, the system validates the client ID using validate_google_oauth_client_id() (lines 87‑92), ensuring the value ends with .apps.googleusercontent.com before accepting it.
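
The suffix rule is simple enough to illustrate. The following is a sketch of an equivalent check, not the body of validate_google_oauth_client_id(); the function name `looks_like_google_client_id` and the extra "non-empty prefix" condition are assumptions.

```python
GOOGLE_CLIENT_ID_SUFFIX = ".apps.googleusercontent.com"


def looks_like_google_client_id(value: str) -> bool:
    """Return True when the value has a non-empty prefix and the Google suffix."""
    candidate = value.strip()
    has_prefix = len(candidate) > len(GOOGLE_CLIENT_ID_SUFFIX)
    return has_prefix and candidate.endswith(GOOGLE_CLIENT_ID_SUFFIX)
```

A check like this catches the common mistake of pasting the client secret or the project ID into the client ID field.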

Connector classes consume these values through class-level constants. In src/connectors/google_drive/connector.py (lines 69‑72), the GoogleDriveConnector defines CLIENT_ID_ENV_VAR and CLIENT_SECRET_ENV_VAR, which the AuthService queries during the OAuth initialization flow.
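
The indirection through class constants can be sketched as follows. Only the two constant names come from the connector described above; `ConnectorSketch` and `read_credentials` are hypothetical, and the real AuthService may resolve the values differently.

```python
import os


class ConnectorSketch:
    """Hypothetical connector that only declares which variables it needs."""
    CLIENT_ID_ENV_VAR = "GOOGLE_OAUTH_CLIENT_ID"
    CLIENT_SECRET_ENV_VAR = "GOOGLE_OAUTH_CLIENT_SECRET"


def read_credentials(connector_cls, environ=None):
    """Resolve the (client_id, client_secret) pair named by the connector class."""
    environ = os.environ if environ is None else environ
    client_id = environ.get(connector_cls.CLIENT_ID_ENV_VAR)
    client_secret = environ.get(connector_cls.CLIENT_SECRET_ENV_VAR)
    return client_id, client_secret
```

Because the connector names the variables rather than reading them itself, a service can handle any provider with the same lookup code.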

Required Environment Variables for OpenRAG OAuth

OpenRAG requires a distinct credential pair for each cloud provider. The EnvManager validates these immediately upon loading, raising a RuntimeError if any required value is missing or malformed.

  • Google Drive and UI Authentication: GOOGLE_OAUTH_CLIENT_ID and GOOGLE_OAUTH_CLIENT_SECRET
  • Microsoft Graph (OneDrive/SharePoint): MICROSOFT_GRAPH_OAUTH_CLIENT_ID and MICROSOFT_GRAPH_OAUTH_CLIENT_SECRET
  • Webhook Configuration (Optional): WEBHOOK_BASE_URL, required only when exposing a public callback endpoint for OAuth redirects

The Google client ID undergoes strict format validation in src/tui/utils/validation.py, while Microsoft credentials use generic non-empty validation. Missing values trigger early failure in AuthService.init_oauth() (lines 31‑38) with descriptive error messages.
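
An early-failure check with a descriptive message might look like the sketch below. The variable names are the ones listed above; the provider keys, `REQUIRED_VARS`, `missing_vars`, and `require_credentials` are hypothetical, and the exception type and message wording are assumptions rather than what AuthService.init_oauth() emits.

```python
REQUIRED_VARS = {
    "google_drive": ("GOOGLE_OAUTH_CLIENT_ID", "GOOGLE_OAUTH_CLIENT_SECRET"),
    "microsoft_graph": (
        "MICROSOFT_GRAPH_OAUTH_CLIENT_ID",
        "MICROSOFT_GRAPH_OAUTH_CLIENT_SECRET",
    ),
}


def missing_vars(provider: str, environ: dict) -> list:
    """List required variable names that are unset or blank for a provider."""
    return [name for name in REQUIRED_VARS[provider] if not environ.get(name, "").strip()]


def require_credentials(provider: str, environ: dict) -> None:
    """Raise a descriptive error before any OAuth request is attempted."""
    missing = missing_vars(provider, environ)
    if missing:
        raise ValueError(f"Missing OAuth settings for {provider}: {', '.join(missing)}")
```

Naming the exact missing variables in the message saves a round trip through the provider's consent screen just to discover a typo in the .env file.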

Step-by-Step Google Drive OAuth Setup

1. Create Google Cloud Credentials

Navigate to the Google Cloud Console → APIs & Services → Credentials → Create OAuth client ID (Web application type). Configure the Authorized redirect URIs to match your OpenRAG instance, typically http://localhost:3000/api/oauth/google/callback for local development or https://your-domain.com/api/oauth/google/callback for production.

Copy the generated Client ID and Client secret.
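
Google compares the redirect URI in each request against the registered value exactly, so a stray trailing slash is enough to cause a mismatch error. A small helper can keep the two in sync; `build_redirect_uri` is a hypothetical name, and the path layout is taken from the example URIs in step 1.

```python
def build_redirect_uri(base_url: str, provider: str) -> str:
    """Join a base URL and the callback path without doubling slashes."""
    return f"{base_url.rstrip('/')}/api/oauth/{provider}/callback"
```

Register the string this helper produces for your deployment's base URL, and pass the same string when starting the flow.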

2. Configure the Environment File

Create or edit ~/.openrag/.env with the following values:


# OpenRAG Google OAuth configuration
GOOGLE_OAUTH_CLIENT_ID=1234567890-abcdefghijklmnopqrstuvwxyz.apps.googleusercontent.com
GOOGLE_OAUTH_CLIENT_SECRET=YOUR_GOOGLE_CLIENT_SECRET

Restart the OpenRAG server to trigger EnvManager reloading. The system will warn you if the client ID fails the .apps.googleusercontent.com suffix check.

3. Initiate the Connection Flow

When users click Add Google Drive in the TUI, AuthService.init_oauth() extracts the environment variable names from GoogleDriveConnector.CLIENT_ID_ENV_VAR and reads the actual values using os.getenv(). The service builds an OAuth configuration containing the client ID, scopes, and redirect URI, then generates an authorization URL via GoogleDriveOAuth.create_authorization_url().

After the user grants consent, Google redirects to your callback endpoint with an authorization code. AuthService.handle_oauth_callback() validates the code against reuse and delegates token exchange to the connector's OAuth wrapper. Tokens persist automatically to data/google_drive_<uuid>.json for subsequent API calls.
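
Persisting tokens to a JSON file can be sketched as below. This is an illustration under assumptions, not the connector's storage code: `save_tokens` and `load_tokens` are hypothetical, the token dictionary layout is invented for the example, and the owner-only file mode is a precaution the article does not mention.

```python
import json
import os
from pathlib import Path
from typing import Optional


def save_tokens(path: Path, tokens: dict) -> None:
    """Write tokens as JSON, creating the file with owner-only permissions."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(tokens, handle)


def load_tokens(path: Path) -> Optional[dict]:
    """Return the stored tokens, or None when no file has been written yet."""
    if not path.exists():
        return None
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)
```

Token files contain refresh tokens, so they deserve the same care as the client secret: keep the data/ directory out of version control and backups that others can read.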

Configuring Microsoft Graph OAuth for OneDrive and SharePoint

Microsoft Graph authentication follows the same pattern but uses an Azure AD (now Microsoft Entra ID) app registration instead of a Google Cloud OAuth client.

Register a new application in Azure AD → App registrations → New registration. Add a Redirect URI matching your OpenRAG deployment (e.g., http://localhost:3000/api/oauth/microsoft/callback). Grant delegated permissions for Files.Read and Files.Read.All under API permissions, then create a client secret under Certificates & secrets.

Add these credentials to your .env file:


# Microsoft Graph OAuth configuration
MICROSOFT_GRAPH_OAUTH_CLIENT_ID=YOUR_AZURE_APP_ID
MICROSOFT_GRAPH_OAUTH_CLIENT_SECRET=YOUR_AZURE_CLIENT_SECRET

The OneDriveOAuth and SharePointOAuth classes (located in src/connectors/onedrive/oauth.py and src/connectors/sharepoint/oauth.py) read these variables when AuthService initializes a connection for those connector types.
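
For orientation, an authorization URL for the Microsoft identity platform can be assembled as in this sketch. It does not reproduce OneDriveOAuth or SharePointOAuth: `build_microsoft_auth_url` is a hypothetical helper, the `common` tenant segment is an assumption (single-tenant apps use their tenant ID instead), and `offline_access` is added here because Microsoft only issues refresh tokens when that scope is requested.

```python
from urllib.parse import urlencode

# v2.0 authorize endpoint; "common" accepts both work and personal accounts.
MS_AUTHORIZE_URL = "https://login.microsoftonline.com/common/oauth2/v2.0/authorize"


def build_microsoft_auth_url(
    client_id: str,
    redirect_uri: str,
    state: str,
    scopes=("Files.Read", "Files.Read.All", "offline_access"),
) -> str:
    """Return a URL-encoded authorization request for the given app registration."""
    query = urlencode(
        {
            "client_id": client_id,
            "response_type": "code",
            "redirect_uri": redirect_uri,
            "response_mode": "query",
            "scope": " ".join(scopes),
            "state": state,
        }
    )
    return f"{MS_AUTHORIZE_URL}?{query}"
```

The `client_id` argument corresponds to MICROSOFT_GRAPH_OAUTH_CLIENT_ID; the client secret is not part of this URL and is only sent later, during the token exchange.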

How the OAuth Token Flow Works in OpenRAG

Understanding the internal flow helps debug connection failures. The system implements a seven-stage pipeline:

  1. Environment Loading: EnvManager builds EnvConfig from ~/.openrag/.env and validates formats.
  2. Flow Initialization: AuthService.init_oauth() receives a connector type (e.g., "google_drive"), retrieves environment variable names from the connector class, and fetches values via os.getenv() (lines 31‑34).
  3. Config Construction: The service assembles an OAuth config dict containing endpoints, scopes, and the redirect_uri (optionally prefixed with WEBHOOK_BASE_URL).
  4. Authorization URL Generation: The concrete OAuth wrapper (e.g., GoogleDriveOAuth) generates the URL for the user to visit.
  5. User Consent: The external provider authenticates the user and redirects to the callback with code and state parameters.
  6. Callback Handling: AuthService.handle_oauth_callback() receives the code, validates state to prevent replay attacks, and calls handle_authorization_callback() on the OAuth wrapper to exchange the code for tokens.
  7. Persistence: The wrapper stores access and refresh tokens in JSON format under the data/ directory, enabling automatic token refresh during subsequent sync operations.
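
The state check in stage 6 can be sketched as a single-use token store. This is a generic illustration of the technique, not AuthService internals: `PendingStates`, `issue`, and `consume` are hypothetical names, and an in-memory dictionary stands in for whatever storage the service uses.

```python
import hmac
import secrets


class PendingStates:
    """Issue one random state per connection and accept it at most once."""

    def __init__(self):
        self._states = {}

    def issue(self, connection_id: str) -> str:
        """Create and remember an unguessable state value for this connection."""
        state = secrets.token_urlsafe(32)
        self._states[connection_id] = state
        return state

    def consume(self, connection_id: str, returned_state) -> bool:
        """Validate the callback's state; the stored value is removed either way."""
        expected = self._states.pop(connection_id, None)
        if expected is None or returned_state is None:
            return False
        # Constant-time comparison on bytes avoids leaking how many characters matched.
        return hmac.compare_digest(expected.encode(), returned_state.encode())
```

Removing the stored value on the first attempt means a captured callback URL cannot be submitted a second time, even if the first attempt failed.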

Programmatic OAuth Implementation

For custom integrations or headless deployments, you can trigger OAuth flows programmatically using the AuthService directly.

Initialize Google Drive OAuth via Python

import asyncio
from urllib.parse import urlencode

from src.api.session_manager import SessionManager
from src.services.auth_service import AuthService


async def start_google_drive_oauth():
    session_mgr = SessionManager()
    auth_service = AuthService(session_mgr)

    # Must match the URI registered in Google Cloud Console
    redirect_uri = "http://localhost:3000/api/oauth/google/callback"

    result = await auth_service.init_oauth(
        connector_type="google_drive",
        purpose="data_source",
        connection_name="Production Drive",
        redirect_uri=redirect_uri,
        user_id="admin",
    )

    oauth_config = result["oauth_config"]
    # URL-encode the query so the redirect URI and space-separated scopes survive intact
    query = urlencode({
        "client_id": oauth_config["client_id"],
        "redirect_uri": oauth_config["redirect_uri"],
        "response_type": "code",
        "scope": " ".join(oauth_config["scopes"]),
        "access_type": "offline",
        "prompt": "consent",
    })
    auth_url = f"{oauth_config['authorization_endpoint']}?{query}"
    print(f"Visit this URL to authorize: {auth_url}")
    # Store result["connection_id"] to correlate with the callback


if __name__ == "__main__":
    asyncio.run(start_google_drive_oauth())

Handle OAuth Callback with FastAPI

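from fastapi import FastAPI, HTTPException, Request
from src.services.auth_service import AuthService

app = FastAPI()
# request.session (used below) requires Starlette's SessionMiddleware on the app
auth_service = AuthService(...)


@app.get("/api/oauth/google/callback")
async def google_callback(request: Request):
    code = request.query_params.get("code")
    state = request.query_params.get("state")
    connection_id = request.session.get("pending_connection_id")

    # Reject callbacks that lack a code or cannot be tied to a pending connection
    if not code or not connection_id:
        raise HTTPException(
            status_code=400,
            detail="Missing authorization code or pending connection",
        )

    await auth_service.handle_oauth_callback(
        connection_id=connection_id,
        authorization_code=code,
        state=state,
    )
    return {"status": "authenticated", "connection_id": connection_id}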

Authenticated Connector Usage

Once tokens are persisted, the connector loads them automatically:

import asyncio

from src.connectors.google_drive.connector import GoogleDriveConnector


async def main():
    cfg = {
        "token_file": "data/google_drive_data_source_abcd.json",
        "recursive": True,
    }
    gdrive = GoogleDriveConnector(cfg)

    # Load the persisted tokens, then confirm the connection is usable
    await gdrive.oauth.load_credentials()
    if not await gdrive.oauth.is_authenticated():
        raise RuntimeError("OAuth flow incomplete")

    files = await gdrive.list_files()
    print(files)


asyncio.run(main())

Summary

  • Declarative Configuration: OpenRAG uses ~/.openrag/.env and EnvManager to load OAuth credentials without code changes.
  • Validation Layer: Google client IDs must end with .apps.googleusercontent.com; Microsoft client IDs only need to be non-empty.
  • AuthService Orchestration: The init_oauth() and handle_oauth_callback() methods in src/services/auth_service.py manage the full handshake flow.
  • Token Persistence: Connectors store refresh tokens in data/ JSON files, enabling long-lived connections without repeated user consent.
  • Multi-Provider Support: The same pattern supports Google Drive, OneDrive, and SharePoint through provider-specific environment variable prefixes.

Frequently Asked Questions

Where does OpenRAG store OAuth tokens after authentication?

OpenRAG persists tokens in JSON files under the data/ directory (e.g., data/google_drive_<uuid>.json). The connector-specific OAuth wrappers (such as GoogleDriveOAuth) handle automatic token refresh using these files, ensuring continuous synchronization without requiring users to re-authenticate when access tokens expire.
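
Deciding when a stored access token needs refreshing usually comes down to comparing an expiry timestamp against the clock, with a small safety margin. The sketch below shows that comparison only; `needs_refresh`, the `expires_at` field (seconds since the epoch), and the 60-second margin are assumptions, not the layout of OpenRAG's token files.

```python
import time


def needs_refresh(tokens: dict, now=None, skew_seconds: float = 60) -> bool:
    """Return True when the access token is expired or about to expire."""
    now = time.time() if now is None else now
    expires_at = tokens.get("expires_at")
    if expires_at is None:
        # Without a recorded expiry, refreshing is the safe choice.
        return True
    return now >= float(expires_at) - skew_seconds
```

The margin avoids sending a token that expires while the API request is still in flight.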

Why does my Google OAuth configuration fail validation?

The validate_google_oauth_client_id() function in src/tui/utils/validation.py (lines 87‑92) checks that your GOOGLE_OAUTH_CLIENT_ID ends with .apps.googleusercontent.com. If this suffix is missing, the EnvManager flags the configuration as invalid during startup, preventing connection attempts that would fail at Google's authorization endpoint.

Can I use the same Google OAuth credentials for UI login and Google Drive access?

Yes. OpenRAG uses the same GOOGLE_OAUTH_CLIENT_ID and GOOGLE_OAUTH_CLIENT_SECRET environment variables for both protecting the web UI with Google Sign-In and authenticating Google Drive data sources. The AuthService distinguishes the purpose based on the purpose parameter passed to init_oauth(), but both flows consume identical credential pairs from the environment.
