
How to Use the RAGFlow Python SDK to Build Custom RAG Applications

February 23, 2026 infiniflow/ragflow

The RAGFlow Python SDK provides an object-oriented wrapper around the RAGFlow REST API, enabling developers to programmatically manage datasets, parse documents, create chat sessions, and execute retrieval-augmented generation without writing HTTP boilerplate.

The SDK lives in the infiniflow/ragflow repository and is a thin interface for building custom RAG applications. It maps REST endpoints to domain objects such as DataSet, Chat, and Agent, so you can build retrieval pipelines in plain Python. The architecture has three layers: the Client (the RAGFlow class), Domain Objects (datasets, chats, agents), and Session/Message handlers for inference.

Architecture of the RAGFlow Python SDK

The SDK follows a layered design that maps directly to REST endpoints while hiding implementation details. At the foundation lies the Client layer in sdk/python/ragflow_sdk/ragflow.py, where the RAGFlow class manages API keys, base URLs, and low-level HTTP helpers (post, get, put, delete).

Domain objects such as DataSet, Chat, Agent, and Memory reside in sdk/python/ragflow_sdk/modules/ and inherit from a Base class defined in base.py. This base automatically maps JSON responses to Python attributes and provides convenience methods like to_json. For interactive inference, the Session and Message classes in session.py handle both synchronous and streaming execution.
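The JSON-to-attribute mapping described above can be illustrated with a small stand-in. This is a simplified sketch, not the code in base.py: the constructor signature, the nested-dict handling, and the parser_config field in the demo are assumptions made for illustration.

```python
import json


class Base:
    """Illustrative stand-in for a JSON-backed base class (not the SDK's code)."""

    def __init__(self, client, payload: dict):
        self.client = client
        for key, value in payload.items():
            # Nested dicts become nested Base objects so attribute access chains
            setattr(self, key, Base(client, value) if isinstance(value, dict) else value)

    def to_json(self) -> dict:
        out = {}
        for key, value in vars(self).items():
            if key == "client":
                continue  # the client handle is not part of the payload
            out[key] = value.to_json() if isinstance(value, Base) else value
        return out


class DataSet(Base):
    """Hypothetical domain object; real field names come from the server."""


ds = DataSet(None, {"id": "abc123", "name": "legal_contracts",
                    "parser_config": {"chunk_token_num": 128}})
print(ds.name, ds.parser_config.chunk_token_num)
print(json.dumps(ds.to_json()))
```

Because every attribute comes from the response payload, a pattern like this lets new server-side fields show up on the objects without a matching SDK change.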

Initializing the SDK Client

To begin building custom RAG applications, instantiate the RAGFlow client with your API credentials and server endpoint. The constructor is defined in sdk/python/ragflow_sdk/ragflow.py.

from ragflow_sdk import RAGFlow

# Initialize the client
rag = RAGFlow(
    api_key="YOUR_RAGFLOW_API_KEY",
    base_url="http://127.0.0.1",  # Your RAGFlow server address (add the port if it is not 80)
)

This client instance serves as the entry point for all subsequent operations, including dataset management, chat creation, and agent deployment.
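To make the client's role concrete, here is a rough sketch of how a thin REST client can turn an API key and base URL into an authenticated request. MiniClient, the /api/v1 URL layout, the Bearer scheme, and the /datasets path are assumptions for illustration, not the SDK's confirmed internals. The request is built but never sent.

```python
import json
import urllib.request
from typing import Optional


class MiniClient:
    """Hypothetical stand-in showing how a thin REST client composes requests."""

    def __init__(self, api_key: str, base_url: str, version: str = "v1"):
        self.api_key = api_key
        # Assumed URL layout: <base_url>/api/<version>
        self.api_url = f"{base_url.rstrip('/')}/api/{version}"

    def build_request(self, method: str, path: str,
                      body: Optional[dict] = None) -> urllib.request.Request:
        data = json.dumps(body).encode("utf-8") if body is not None else None
        # Assumed auth scheme: the key travels as a Bearer token
        headers = {"Authorization": f"Bearer {self.api_key}"}
        if data is not None:
            headers["Content-Type"] = "application/json"
        return urllib.request.Request(self.api_url + path, data=data,
                                      headers=headers, method=method)


client = MiniClient("YOUR_RAGFLOW_API_KEY", "http://127.0.0.1")
req = client.build_request("POST", "/datasets", {"name": "legal_contracts"})
print(req.get_method(), req.full_url)
print(req.get_header("Authorization"))
```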

Creating and Managing Datasets

Datasets represent collections of documents that power your retrieval system. The DataSet class in sdk/python/ragflow_sdk/modules/dataset.py provides methods for creation, document upload, and asynchronous parsing.


# Create a dataset with a specific embedding model
dataset = rag.create_dataset(
    name="legal_contracts",
    description="Contract documents for Q&A",
    embedding_model="text-embedding-ada-002",
)

# Read each file as bytes so no file handle is left open
files = []
for name in ["contract1.pdf", "contract2.pdf"]:
    with open(name, "rb") as f:
        files.append({"display_name": name, "blob": f.read()})
documents = dataset.upload_documents(files)

# Trigger parsing (chunking and embedding) of the uploaded documents
doc_ids = [doc.id for doc in documents]
dataset.parse_documents(doc_ids)

The upload_documents method returns a list of Document objects, while parse_documents initiates the chunking and embedding process asynchronously.
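Because parsing runs in the background, downstream steps usually need to wait until the documents are ready. The sketch below is a generic polling loop, not an SDK function: fetch_status is a caller-supplied callable, and the status names DONE, FAIL, and CANCEL are assumptions. If parse_documents already blocks until completion in your SDK version, a loop like this is unnecessary.

```python
import time


def wait_for_parsing(fetch_status, doc_ids, interval=2.0, timeout=600.0,
                     sleep=time.sleep):
    """Poll until every document reaches a terminal status or the timeout passes.

    fetch_status is a hypothetical callable: doc_id -> status string.
    """
    terminal = {"DONE", "FAIL", "CANCEL"}  # assumed status names
    deadline = time.monotonic() + timeout
    pending = set(doc_ids)
    results = {}
    while pending:
        for doc_id in list(pending):
            status = fetch_status(doc_id)
            if status in terminal:
                results[doc_id] = status
                pending.discard(doc_id)
        if not pending:
            break
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still parsing: {sorted(pending)}")
        sleep(interval)
    return results


# Demo with a fake status source instead of a live server
_fake = {"doc-1": iter(["RUNNING", "DONE"]), "doc-2": iter(["DONE"])}
statuses = wait_for_parsing(lambda d: next(_fake[d]), ["doc-1", "doc-2"],
                            sleep=lambda seconds: None)
print(statuses)
```

Injecting the sleep function keeps the loop easy to test and lets a caller swap in a backoff strategy.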

Building Chat-Based RAG Assistants

For conversational interfaces, the SDK provides the Chat class in sdk/python/ragflow_sdk/modules/chat.py. You associate datasets with a chat to ground responses in your uploaded documents.


# Create a chat assistant linked to your dataset
chat = rag.create_chat(
    name="Legal Assistant",
    dataset_ids=[dataset.id]
)

# Initialize a conversation session
session = chat.create_session(name="User Query Session")

# Execute a non-streaming query
for message in session.ask("What is the termination clause in contract1?"):
    print(message.content)

The Session.ask method in sdk/python/ragflow_sdk/modules/session.py yields Message objects containing the generated response and relevant citations.
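A common next step is to show the citations next to the answer. The helper below is a sketch under assumptions: it expects the message to expose a reference list whose entries carry document_name and content, either as dicts or as objects. Check these names against the Message objects your server actually returns.

```python
from types import SimpleNamespace


def _field(chunk, key, default=None):
    """Read a field from a chunk that may be a dict or an object."""
    if isinstance(chunk, dict):
        return chunk.get(key, default)
    return getattr(chunk, key, default)


def format_citations(message, width=80):
    """Render a message's cited chunks as numbered source lines."""
    refs = getattr(message, "reference", None) or []
    lines = []
    for i, chunk in enumerate(refs, start=1):
        name = _field(chunk, "document_name", "unknown document")  # assumed key
        text = (_field(chunk, "content", "") or "").strip().replace("\n", " ")
        lines.append(f"[{i}] {name}: {text[:width]}")
    return lines


# Fake message standing in for what Session.ask would yield
fake = SimpleNamespace(
    content="Either party may terminate with 30 days notice.",
    reference=[{"document_name": "contract1.pdf",
                "content": "Termination.\nEither party may terminate..."}],
)
citations = format_citations(fake)
print("\n".join(citations))
```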

Implementing Streaming Responses

For real-time user interfaces, enable streaming to receive tokens as they are generated. Set stream=True when calling Session.ask:


# Stream tokens for interactive UIs
for msg in session.ask("Summarize the liability section.", stream=True):
    print(msg.content, end="", flush=True)

The SDK handles Server-Sent Events (SSE) internally, yielding Message instances as each chunk arrives from the server. This implementation resides in the Session class within session.py.
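One detail to verify against your server version is whether each streamed Message carries only the new tokens or the whole answer accumulated so far. In the second case, printing content directly repeats earlier text. The helper below is a heuristic sketch (iter_deltas is not an SDK function) that handles both cases by emitting only the unseen suffix.

```python
from types import SimpleNamespace


def iter_deltas(messages):
    """Yield only the new text from a stream whose content may be cumulative."""
    seen = ""
    for msg in messages:
        text = msg.content or ""
        if text.startswith(seen):
            # Cumulative stream: emit only the part not yet yielded
            delta = text[len(seen):]
            seen = text
        else:
            # Incremental stream: the chunk is already a delta
            delta = text
            seen += text
        if delta:
            yield delta


# Fake streams standing in for session.ask(..., stream=True)
cumulative = [SimpleNamespace(content=c)
              for c in ["The", "The liability", "The liability cap is fixed."]]
incremental = [SimpleNamespace(content=c)
               for c in ["The", " liability", " cap is fixed."]]

from_cumulative = list(iter_deltas(cumulative))
from_incremental = list(iter_deltas(incremental))
print("".join(from_cumulative))
```

With a live session, the loop would read: for piece in iter_deltas(session.ask(question, stream=True)): print(piece, end="", flush=True).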

Deploying Custom Agents with DSL

Beyond simple chats, the RAGFlow Python SDK supports graph-based Agent workflows defined via Domain Specific Language (DSL). The Agent class in sdk/python/ragflow_sdk/modules/agent.py enables complex multi-step reasoning.


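# Define a custom workflow using DSL (a minimal placeholder; a working graph
# must also define every component named in "downstream", such as "Answer")
dsl = {
    "components": {
        "begin": {"downstream": ["Answer"], "obj": {"component_name": "Begin", "params": {}}}
    },
    "graph": {"nodes": [], "edges": []}
}

# Create the agent
rag.create_agent(title="Custom Research Agent", dsl=dsl)

# Look the agent up by title; depending on the SDK version, create_agent
# may not return the new Agent object
agent = rag.list_agents(title="Custom Research Agent")[0]

# Start an agent session
agent_session = agent.create_session()
for msg in agent_session.ask("Explain the pricing model.", stream=False):
    print(msg.content)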

The DSL structure defines components and their relationships, allowing you to construct sophisticated reasoning chains beyond standard RAG retrieval.
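Since components refer to each other by name, a quick sanity check before submitting a DSL can catch links to components that were never defined. The function below is a hypothetical helper, not part of the SDK, and it assumes links are listed under downstream and upstream keys. Run against the minimal DSL shown above, it reports that begin points to an undefined Answer component.

```python
def find_dangling_links(dsl):
    """Return (component, missing_target) pairs for links to undefined components."""
    components = dsl.get("components", {})
    missing = []
    for name, spec in components.items():
        for key in ("downstream", "upstream"):  # assumed link keys
            for target in spec.get(key, []):
                if target not in components:
                    missing.append((name, target))
    return missing


dsl = {
    "components": {
        "begin": {"downstream": ["Answer"],
                  "obj": {"component_name": "Begin", "params": {}}}
    },
    "graph": {"nodes": [], "edges": []},
}

dangling = find_dangling_links(dsl)
print(dangling)
```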

Persisting Long-Term Memory

For applications requiring cross-session context, the SDK provides memory management through create_memory, add_message, and search_message methods implemented in sdk/python/ragflow_sdk/ragflow.py (lines 94-122).


# Create a memory store
memory = rag.create_memory(
    name="conversation_memory",
    memory_type=["embedding"],
    embd_id="text-embedding-ada-002",
    llm_id="gpt-3.5-turbo"
)

# Store interaction context
rag.add_message(
    memory_id=[memory.id],
    agent_id=agent.id,
    session_id=agent_session.id,
    user_input="What are the payment terms?",
    agent_response="The payment terms are 30 days net."
)

# Retrieve relevant historical context
hits = rag.search_message(
    query="payment terms",
    memory_id=[memory.id],
    top_n=5
)

This persistence layer enables semantic search across previous interactions, allowing your custom RAG applications to maintain context across multiple sessions.

Summary

The RAGFlow Python SDK wraps the REST API in sdk/python/ragflow_sdk/ragflow.py, providing a RAGFlow client class that handles authentication and HTTP operations.
Domain objects like DataSet, Chat, and Agent inherit from a Base class in modules/base.py, automatically mapping JSON responses to Python attributes.
Document workflows use create_dataset, upload_documents, and parse_documents to ingest and process files for retrieval.
Conversational AI is implemented through Chat.create_session and Session.ask, with support for streaming via the stream=True parameter in session.py.
Advanced workflows utilize the Agent class with custom DSL definitions for graph-based reasoning.
Long-term memory persists context across sessions using create_memory and search_message methods.

Frequently Asked Questions

How do I install the RAGFlow Python SDK?

Install the SDK from PyPI. The package name is ragflow-sdk, so pip install ragflow-sdk is enough. The minimum supported Python version is declared in the package metadata, so check it against your interpreter before installing.

What is the difference between Chat and Agent in the RAGFlow SDK?

Chat objects in modules/chat.py provide standard retrieval-augmented generation against fixed datasets, ideal for Q&A bots. Agent objects in modules/agent.py support custom DSL-defined workflows with multiple components and branching logic, enabling complex multi-step reasoning beyond simple document retrieval.

How does the SDK handle authentication and errors?

The RAGFlow class stores your API key and base URL upon initialization, passing the key in the Authorization header for every request. HTTP errors are raised as exceptions by the underlying post, get, put, and delete methods in ragflow.py, allowing you to catch and handle authentication failures or server errors using standard Python try-except blocks.
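The exception type the SDK raises is not specified here, so application code may want its own thin layer. The sketch below assumes a response envelope with code, message, and data fields, where code 0 means success. RAGFlowAPIError, unwrap, and the sample error payload are hypothetical names and values for illustration.

```python
class RAGFlowAPIError(Exception):
    """Hypothetical error type carrying the server's code and message."""

    def __init__(self, code, message):
        super().__init__(f"[{code}] {message}")
        self.code = code
        self.message = message


def unwrap(payload: dict):
    """Return payload['data'] on success; raise RAGFlowAPIError otherwise."""
    code = payload.get("code")
    if code == 0:  # assumed success marker
        return payload.get("data")
    raise RAGFlowAPIError(code, payload.get("message", "unknown error"))


ok = unwrap({"code": 0, "data": {"id": "abc123"}})

failure = None
try:
    # Hypothetical error payload, not copied from a real server response
    unwrap({"code": 109, "message": "API key is invalid"})
except RAGFlowAPIError as exc:
    failure = exc

print(ok)
print(failure)
```

Wrapping failures in a dedicated type lets callers tell API-level errors apart from network problems or bugs in their own code.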

Can I use custom embedding models and LLMs with the SDK?

Yes. When creating datasets via create_dataset, specify any embedding model available in your RAGFlow instance using the embedding_model parameter. Similarly, create_memory accepts embd_id and llm_id parameters to configure specific models for vectorization and generation, allowing full customization of your RAG pipeline's underlying models.
