How to Manage Connectors Using the OpenRAG API: Complete Developer Guide

You manage connectors in OpenRAG by interacting with REST endpoints that handle connection persistence, OAuth authentication, file synchronization, and webhook processing for external storage providers like Google Drive and SharePoint.

The OpenRAG API provides a complete lifecycle management system for external storage connectors. According to the langflow-ai/openrag source code, the platform abstracts cloud storage services into configurable connections (persisted credentials and settings) and in-memory connector instances that execute remote operations.

Understanding the OpenRAG Connector Architecture

OpenRAG implements a layered architecture that separates storage configuration from execution logic.

Core Components

BaseConnector (src/connectors/base.py) defines the abstract contract that every storage provider must implement. This class requires concrete implementations of authenticate(), list_files(), get_file_content(), and setup_subscription():

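from abc import ABC, abstractmethod

class BaseConnector(ABC):
    @abstractmethod
    async def authenticate(self) -> bool: ...
    # list_files(), get_file_content(), and setup_subscription() are abstract as well.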
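
To make the contract concrete, here is a minimal sketch of a provider that satisfies it. The stand-in base class, every signature other than authenticate(), and the InMemoryConnector name are assumptions for illustration; the real definitions in src/connectors/base.py may differ.

```python
import asyncio
from abc import ABC, abstractmethod


class BaseConnector(ABC):
    """Stand-in for the real base class; these signatures are assumed."""

    @abstractmethod
    async def authenticate(self) -> bool: ...

    @abstractmethod
    async def list_files(self) -> list[dict]: ...

    @abstractmethod
    async def get_file_content(self, file_id: str) -> bytes: ...

    @abstractmethod
    async def setup_subscription(self) -> str: ...


class InMemoryConnector(BaseConnector):
    """Hypothetical provider backed by a dict instead of a remote API."""

    def __init__(self, files: dict[str, bytes]):
        self._files = files
        self._authenticated = False

    async def authenticate(self) -> bool:
        # A real connector would validate or refresh OAuth credentials here.
        self._authenticated = True
        return self._authenticated

    async def list_files(self) -> list[dict]:
        # Return one metadata entry per file, in a stable order.
        return [{"id": file_id} for file_id in sorted(self._files)]

    async def get_file_content(self, file_id: str) -> bytes:
        return self._files[file_id]

    async def setup_subscription(self) -> str:
        # A real connector would register a webhook and return its ID.
        return "subscription-1"
```

Because all four methods are abstract, Python refuses to instantiate a subclass that omits any of them, which is what makes the base class a usable contract.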

ConnectionManager (src/connectors/connection_manager.py) persists configuration objects to data/connections.json and enforces the constraint of one active connection per provider per user. It lazily instantiates concrete connector classes via _create_connector and caches authenticated instances in active_connectors.
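
The persistence behavior described above can be sketched as a small JSON-backed store. ConnectionStore, its record fields, and the choice to deactivate an older record (rather than reject or overwrite it) are assumptions; the guide does not say how the real ConnectionManager enforces the one-active-connection rule.

```python
import json
import tempfile
import uuid
from pathlib import Path


class ConnectionStore:
    """Hypothetical, simplified stand-in for ConnectionManager persistence."""

    def __init__(self, path: Path):
        self.path = path
        self.connections: dict[str, dict] = {}
        if path.exists():
            self.connections = json.loads(path.read_text())

    def _save(self) -> None:
        self.path.parent.mkdir(parents=True, exist_ok=True)
        self.path.write_text(json.dumps(self.connections, indent=2))

    def create_connection(self, connector_type: str, user_id: str, config: dict) -> str:
        # Deactivate existing records for the same provider and user,
        # so at most one connection stays active.
        for record in self.connections.values():
            if record["connector_type"] == connector_type and record["user_id"] == user_id:
                record["is_active"] = False
        connection_id = str(uuid.uuid4())
        self.connections[connection_id] = {
            "connector_type": connector_type,
            "user_id": user_id,
            "config": config,
            "is_active": True,
        }
        self._save()
        return connection_id

    def active_connections(self, connector_type: str, user_id: str) -> list[str]:
        return [
            connection_id
            for connection_id, record in self.connections.items()
            if record["connector_type"] == connector_type
            and record["user_id"] == user_id
            and record["is_active"]
        ]


def example() -> tuple[int, int]:
    """Create two connections for one user, then re-read the file from disk."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "data" / "connections.json"
        store = ConnectionStore(path)
        store.create_connection("google_drive", "user-1", {"name": "Old"})
        store.create_connection("google_drive", "user-1", {"name": "New"})
        reloaded = ConnectionStore(path)
        active = reloaded.active_connections("google_drive", "user-1")
        return len(reloaded.connections), len(active)
```

Calling example() returns the number of stored records and the number of active ones, which shows that both records survive on disk while only the newer one remains active.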

ConnectorService (src/connectors/service.py) orchestrates synchronization workflows. It receives connection IDs, retrieves live connector instances, enumerates files, and delegates indexing work to TaskService and DocumentService while updating metadata like ACLs and source URLs.

Listing Available Connector Types

Before creating a connection, discover which storage providers the system supports. The list_connectors endpoint in src/api/connectors.py returns metadata including names, descriptions, and icons:

curl -X GET "<BASE_URL>/connectors" \
  -H "Authorization: Bearer <JWT>"

This endpoint invokes ConnectionManager.get_available_connector_types() (lines 93–101) to return the registry of supported connectors such as Google Drive, OneDrive, and SharePoint.
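
The same call can be made from Python with only the standard library. This is a sketch: the helper names are hypothetical, and the response is returned as decoded JSON without assuming its shape.

```python
import json
import urllib.request
from typing import Any


def build_list_connectors_request(base_url: str, jwt: str) -> urllib.request.Request:
    """Build (but do not send) the GET /connectors request."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/connectors",
        headers={"Authorization": f"Bearer {jwt}"},
        method="GET",
    )


def list_connectors(base_url: str, jwt: str) -> Any:
    """Send the request and decode the JSON body."""
    request = build_list_connectors_request(base_url, jwt)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

Separating request construction from sending keeps the URL and header logic easy to check without a running server.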

Creating and Authenticating Connections

Creating a connection persists OAuth tokens and webhook URLs. ConnectionManager.create_connection and update_connection (lines 92–108 in src/connectors/connection_manager.py) handle that persistence, but you typically reach them through provider-specific endpoints:

curl -X POST "<BASE_URL>/connections/google_drive" \
  -H "Authorization: Bearer <JWT>" \
  -H "Content-Type: application/json" \
  -d '{
        "name": "My GDrive",
        "config": {
          "token_file": "/tmp/gdrive_token.json",
          "webhook_url": "<BASE_URL>/connectors/google_drive/webhook"
        }
      }'

When a request requires a live connector, ConnectionManager.get_connector (lines 97–130) builds the concrete class (e.g., GoogleDriveConnector) and calls its authenticate() method, caching the result for subsequent operations.
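
The lazy build-then-cache behavior can be sketched as follows. FakeConnector and ConnectorCache are hypothetical names, and caching only after a successful authenticate() is an assumption about the real get_connector.

```python
import asyncio


class FakeConnector:
    """Hypothetical connector that counts how often authenticate() runs."""

    auth_calls = 0

    def __init__(self, config: dict):
        self.config = config

    async def authenticate(self) -> bool:
        FakeConnector.auth_calls += 1
        return True


class ConnectorCache:
    """Simplified stand-in for the lazy instantiation described above."""

    def __init__(self, configs: dict[str, dict]):
        self.configs = configs  # connection_id -> persisted config
        self.active_connectors: dict[str, FakeConnector] = {}

    async def get_connector(self, connection_id: str):
        # Reuse a cached, already authenticated instance when available.
        if connection_id in self.active_connectors:
            return self.active_connectors[connection_id]
        config = self.configs.get(connection_id)
        if config is None:
            return None
        connector = FakeConnector(config)
        # Cache the instance only after authentication succeeds.
        if await connector.authenticate():
            self.active_connectors[connection_id] = connector
            return connector
        return None
```

With this shape, the first request for a connection pays the authentication cost and later requests reuse the same object.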

Checking Connector Status

Verify connection health by forcing a real-time authentication check against the remote service. The connector_status function in src/api/connectors.py (lines 68–88) iterates over a user’s connections and returns per-connection status including authentication state and base URLs:

curl -X GET "<BASE_URL>/connectors/google_drive/status" \
  -H "Authorization: Bearer <JWT>"

Synchronizing Files

Trigger file ingestion via the connector_sync endpoint in src/api/connectors.py (lines 108–162). The system selects the first working connection and supports two synchronization modes:

  • Full sync: Processes all known files via ConnectorService.sync_connector_files
  • Specific files: Processes selected items via ConnectorService.sync_specific_files

Run a full sync of all known files:

curl -X POST "<BASE_URL>/connectors/google_drive/sync" \
  -H "Authorization: Bearer <JWT>" \
  -H "Content-Type: application/json" \
  -d '{}'

Sync specific files from a picker UI:

curl -X POST "<BASE_URL>/connectors/google_drive/sync" \
  -H "Authorization: Bearer <JWT>" \
  -H "Content-Type: application/json" \
  -d '{
        "selected_files": [{"id":"file123","name":"Report.pdf"}]
      }'
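
The choice between the two modes can be sketched as a small function over the request payload. choose_sync_mode is a hypothetical helper, and treating an empty selected_files list as a full sync is an assumption.

```python
def choose_sync_mode(payload: dict) -> tuple[str, list[str]]:
    """Map a sync request body to a ConnectorService method name and file IDs."""
    selected = payload.get("selected_files") or []
    if selected:
        # Targeted sync: pass along only the IDs of the chosen files.
        return "sync_specific_files", [item["id"] for item in selected]
    # No selection: fall back to a full sync.
    return "sync_connector_files", []
```

For the two curl payloads above, this yields a full sync for the empty body and a targeted sync of file123 for the picker selection.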

Handling Webhook Notifications

For incremental updates, OpenRAG receives push notifications from cloud providers. The connector_webhook endpoint in src/api/connectors.py (lines 52–84) validates incoming requests, maps webhook IDs to stored connections, and triggers sync_specific_files for affected file IDs:

curl -X POST "<BASE_URL>/connectors/google_drive/webhook" \
  -H "Content-Type: application/json" \
  -d '{"change":"updated","resourceId":"file123"}'

This endpoint is normally invoked by the remote provider rather than client applications.
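
The routing step (webhook ID to stored connection to affected file IDs) can be sketched as below. route_webhook, the webhook_id field on the connection record, and reading the file ID from resourceId are assumptions; how each provider actually delivers its identifiers varies.

```python
from typing import Optional


def route_webhook(
    webhook_id: str, payload: dict, connections: list[dict]
) -> Optional[tuple[str, list[str]]]:
    """Find the connection that owns a webhook and the file IDs to re-sync."""
    for connection in connections:
        if connection.get("webhook_id") == webhook_id:
            resource_id = payload.get("resourceId")
            file_ids = [resource_id] if resource_id else []
            return connection["connection_id"], file_ids
    # Unknown webhook: nothing to sync.
    return None
```

A caller would pass the returned file IDs to sync_specific_files for the matched connection and ignore notifications that match no stored webhook.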

Retrieving Access Tokens and Disconnecting

Token Retrieval: Frontend picker dialogs require short-lived OAuth tokens. The connector_token function (lines 57–84 in src/api/connectors.py) returns provider-specific tokens and can request SharePoint-scoped tokens when required:

curl -X GET "<BASE_URL>/connectors/google_drive/token?connection_id=<CONN_ID>" \
  -H "Authorization: Bearer <JWT>"

Disconnection: Revoke access and clean up resources using connector_disconnect (lines 28–46), which deletes the persisted record, removes webhook subscriptions, and clears cached instances:

curl -X DELETE "<BASE_URL>/connectors/google_drive/disconnect" \
  -H "Authorization: Bearer <JWT>"

Summary

  • Discover supported storage providers via GET /connectors, which calls ConnectionManager.get_available_connector_types().
  • Persist credentials through ConnectionManager.create_connection, storing configurations in data/connections.json.
  • Authenticate lazily via ConnectionManager.get_connector, which instantiates concrete classes like GoogleDriveConnector and caches active sessions.
  • Monitor health through GET /connectors/{type}/status, which invokes real authenticate() checks against remote APIs.
  • Synchronize content via POST /connectors/{type}/sync, supporting both full catalog scans and specific file selections through ConnectorService.
  • Receive incremental updates via POST /connectors/{type}/webhook, which routes cloud provider notifications to targeted file syncs.
  • Revoke access via DELETE /connectors/{type}/disconnect, ensuring complete cleanup of tokens and subscriptions.

Frequently Asked Questions

How does OpenRAG store connector credentials?

As implemented in langflow-ai/openrag, the ConnectionManager persists connector configurations as JSON records in data/connections.json. These records contain OAuth tokens, webhook URLs, and provider-specific settings, while the actual connector instances remain ephemeral objects cached in memory during active sessions.

What happens when I trigger a sync operation?

When you call POST /connectors/{type}/sync, the connector_sync function in src/api/connectors.py selects the first working connection for that provider and delegates to ConnectorService. Depending on your payload, it either calls sync_connector_files for a full crawl or sync_specific_files for targeted indexing, ultimately queueing tasks via TaskService and updating document metadata.

How do I add support for a new cloud storage provider?

Extend the BaseConnector abstract class defined in src/connectors/base.py, implementing required methods like authenticate(), list_files(), and setup_subscription(). Register your concrete implementation in the connection manager's factory method so get_connector can instantiate it when users create connections of your new type.
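
One way to picture the registration step is a type-to-class registry. This is a swapped-in illustration, not the project's factory: CONNECTOR_REGISTRY, register_connector, create_connector, and ExampleStorageConnector are hypothetical, and the real _create_connector may select classes differently.

```python
CONNECTOR_REGISTRY: dict[str, type] = {}


def register_connector(connector_type: str):
    """Class decorator that maps a connector type string to its class."""
    def decorator(cls):
        CONNECTOR_REGISTRY[connector_type] = cls
        return cls
    return decorator


@register_connector("example_storage")
class ExampleStorageConnector:
    """Hypothetical new provider; a real one would extend BaseConnector."""

    def __init__(self, config: dict):
        self.config = config


def create_connector(connector_type: str, config: dict):
    """Instantiate the class registered for a connector type."""
    try:
        connector_cls = CONNECTOR_REGISTRY[connector_type]
    except KeyError:
        raise ValueError(f"Unknown connector type: {connector_type}") from None
    return connector_cls(config)
```

Whatever mechanism the factory uses, the key point is the same: the type string stored with a connection must resolve to your new class.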

Can I have multiple connections to the same provider?

Not as simultaneously active connections. The ConnectionManager in src/connectors/connection_manager.py enforces one active connection per provider per user. You can create multiple connection records, but the system typically selects the first working connection when executing sync operations, and the architecture assumes a single active connection per provider to avoid webhook routing conflicts.
