# How to Use the RAGFlow Python SDK to Build Custom RAG Applications

> Build custom RAG applications with the RAGFlow Python SDK. Programmatically manage datasets, parse docs, and create chat sessions effortlessly without HTTP boilerplate.

- Repository: [InfiniFlow/ragflow](https://github.com/infiniflow/ragflow)
- Tags: how-to-guide
- Published: 2026-02-23

---

**The RAGFlow Python SDK provides an object-oriented wrapper around the RAGFlow REST API, enabling developers to programmatically manage datasets, parse documents, create chat sessions, and execute retrieval-augmented generation without writing HTTP boilerplate.**

The RAGFlow Python SDK, available in the `infiniflow/ragflow` repository, offers a thin yet expressive interface for building custom RAG applications. It abstracts complex REST endpoints into intuitive domain objects like `DataSet`, `Chat`, and `Agent`, allowing you to construct production-ready retrieval pipelines using native Python syntax. The SDK architecture centers on three layers: the **Client** (`RAGFlow` class), **Domain Objects** (datasets, chats, agents), and **Session/Message** handlers for inference.

## Architecture of the RAGFlow Python SDK

The SDK follows a layered design that maps directly to REST endpoints while hiding implementation details. At the foundation lies the **Client** layer in [`sdk/python/ragflow_sdk/ragflow.py`](https://github.com/infiniflow/ragflow/blob/main/sdk/python/ragflow_sdk/ragflow.py), where the `RAGFlow` class manages API keys, base URLs, and low-level HTTP helpers (`post`, `get`, `put`, `delete`).

Domain objects such as `DataSet`, `Chat`, `Agent`, and `Memory` reside in `sdk/python/ragflow_sdk/modules/` and inherit from a **Base** class defined in [`base.py`](https://github.com/infiniflow/ragflow/blob/main/sdk/python/ragflow_sdk/modules/base.py). This base class automatically maps JSON responses to Python attributes and provides convenience methods like `to_json`. For interactive inference, the `Session` and `Message` classes in [`session.py`](https://github.com/infiniflow/ragflow/blob/main/sdk/python/ragflow_sdk/modules/session.py) handle both synchronous and streaming execution.
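To make the attribute mapping concrete, here is a minimal sketch of the pattern, not the SDK's actual `Base` implementation. The class names `SketchBase` and `SketchDataSet` are hypothetical, and the real class may treat nested objects and internal fields differently.

```python
class SketchBase:
    """Hypothetical stand-in for the SDK's Base class (illustration only)."""

    def __init__(self, client, payload: dict):
        # Keep a handle to the client so subclasses could issue follow-up requests.
        self._client = client
        # Copy every JSON key onto the instance as a plain attribute.
        for key, value in payload.items():
            setattr(self, key, value)

    def to_json(self) -> dict:
        # Return public attributes only; names starting with "_" are internal.
        return {k: v for k, v in vars(self).items() if not k.startswith("_")}


class SketchDataSet(SketchBase):
    """Hypothetical domain object built from a dataset-shaped JSON response."""


record = SketchDataSet(client=None, payload={"id": "ds_1", "name": "legal_contracts"})
summary = record.to_json()
```

After construction, `record.name` reads like any other attribute, and `to_json()` round-trips the public fields back into a dictionary.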

## Initializing the SDK Client

To begin building custom RAG applications, instantiate the `RAGFlow` client with your API credentials and server endpoint. The constructor is defined in [`sdk/python/ragflow_sdk/ragflow.py`](https://github.com/infiniflow/ragflow/blob/main/sdk/python/ragflow_sdk/ragflow.py).

```python
from ragflow_sdk import RAGFlow

# Initialize the client
rag = RAGFlow(
    api_key="YOUR_RAGFLOW_API_KEY",
    base_url="http://127.0.0.1",  # Your RAGFlow server address
)
```

This client instance serves as the entry point for all subsequent operations, including dataset management, chat creation, and agent deployment.
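Hard-coding credentials is rarely desirable outside a demo. The sketch below shows one way to assemble the constructor arguments from environment variables. The helper name `load_ragflow_settings` and the variable names `RAGFLOW_API_KEY` and `RAGFLOW_BASE_URL` are assumptions made for this example; the SDK does not read them on its own.

```python
import os


def load_ragflow_settings(env=None) -> dict:
    """Collect RAGFlow constructor arguments from environment variables.

    RAGFLOW_API_KEY and RAGFLOW_BASE_URL are hypothetical variable names
    chosen for this sketch.
    """
    env = os.environ if env is None else env
    api_key = env.get("RAGFLOW_API_KEY", "").strip()
    if not api_key:
        raise ValueError("RAGFLOW_API_KEY is not set")
    # Fall back to the local address used in this guide and drop any trailing slash.
    base_url = env.get("RAGFLOW_BASE_URL", "http://127.0.0.1").strip().rstrip("/")
    return {"api_key": api_key, "base_url": base_url}


# Pass an explicit mapping here so the sketch runs without touching os.environ.
settings = load_ragflow_settings(
    {"RAGFLOW_API_KEY": "demo-key", "RAGFLOW_BASE_URL": "http://127.0.0.1/"}
)
```

The resulting dictionary can be unpacked straight into the constructor, for example `RAGFlow(**load_ragflow_settings())`.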

## Creating and Managing Datasets

Datasets represent collections of documents that power your retrieval system. The `DataSet` class in [`sdk/python/ragflow_sdk/modules/dataset.py`](https://github.com/infiniflow/ragflow/blob/main/sdk/python/ragflow_sdk/modules/dataset.py) provides methods for creation, document upload, and asynchronous parsing.

```python
from pathlib import Path

# Create a dataset with a specific embedding model
dataset = rag.create_dataset(
    name="legal_contracts",
    description="Contract documents for Q&A",
    embedding_model="text-embedding-ada-002",
)

# Upload binary files; read_bytes() closes each file after reading
files = [
    {"display_name": name, "blob": Path(name).read_bytes()}
    for name in ("contract1.pdf", "contract2.pdf")
]
documents = dataset.upload_documents(files)

# Trigger async parsing and wait for completion
doc_ids = [doc.id for doc in documents]
dataset.parse_documents(doc_ids)
```

The `upload_documents` method returns a list of `Document` objects, while `parse_documents` initiates the chunking and embedding process asynchronously.
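Whether a parse call blocks until processing finishes can depend on the SDK version and on which method you use (the SDK also exposes `async_parse_documents`). If the call returns before parsing completes, a polling loop along these lines is one way to wait. The helper `wait_for_parsing`, the `fetch_states` callback, and the state names are all assumptions of this sketch rather than SDK features.

```python
import time


def wait_for_parsing(fetch_states, doc_ids, timeout=300.0, interval=2.0,
                     terminal=("DONE", "FAIL", "CANCEL"), sleep=time.sleep):
    """Poll until every document reaches a terminal state or the timeout expires.

    fetch_states is a caller-supplied function mapping a list of ids to a
    {doc_id: state} dict. The state names are assumptions for this sketch.
    """
    deadline = time.monotonic() + timeout
    while True:
        states = fetch_states(doc_ids)
        if all(states.get(doc_id) in terminal for doc_id in doc_ids):
            return states
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Parsing still pending: {states}")
        sleep(interval)


# Simulated backend: the first poll reports RUNNING, the second reports DONE.
_polls = iter([{"d1": "RUNNING"}, {"d1": "DONE"}])
final_states = wait_for_parsing(lambda ids: next(_polls), ["d1"], sleep=lambda s: None)
```

In a real application, `fetch_states` would wrap whatever status lookup your SDK version offers, for example listing the dataset's documents and reading their status field.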

## Building Chat-Based RAG Assistants

For conversational interfaces, the SDK provides the `Chat` class in [`sdk/python/ragflow_sdk/modules/chat.py`](https://github.com/infiniflow/ragflow/blob/main/sdk/python/ragflow_sdk/modules/chat.py). You associate datasets with a chat to ground responses in your uploaded documents.

```python
# Create a chat assistant linked to your dataset
chat = rag.create_chat(
    name="Legal Assistant",
    dataset_ids=[dataset.id]
)

# Initialize a conversation session
session = chat.create_session(name="User Query Session")

# Execute a non-streaming query
for message in session.ask("What is the termination clause in contract1?"):
    print(message.content)
```

The `Session.ask` method in [`sdk/python/ragflow_sdk/modules/session.py`](https://github.com/infiniflow/ragflow/blob/main/sdk/python/ragflow_sdk/modules/session.py) yields `Message` objects containing the generated response and relevant citations.
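When presenting an answer to end users, it often helps to list the cited sources beneath it. The following sketch formats a response string with a de-duplicated source list. It works on plain dictionaries; the `document_name` key is an assumption about the citation payload, so check the fields your SDK version actually returns.

```python
def format_answer(content, references):
    """Append a numbered source list to an answer string.

    references is expected to be a list of dicts; the "document_name" key is
    an assumption about the citation payload, so missing keys are tolerated.
    """
    seen = []
    for ref in references or []:
        name = ref.get("document_name", "unknown source")
        if name not in seen:  # keep first-seen order, drop duplicates
            seen.append(name)
    if not seen:
        return content
    sources = "\n".join(f"[{i}] {name}" for i, name in enumerate(seen, start=1))
    return f"{content}\n\nSources:\n{sources}"


rendered = format_answer(
    "Either party may terminate with 30 days' notice.",
    [{"document_name": "contract1.pdf"}, {"document_name": "contract1.pdf"}],
)
```

With the SDK, this might be called as `format_answer(message.content, getattr(message, "reference", None))`, assuming the message exposes its citations under that attribute.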

## Implementing Streaming Responses

For real-time user interfaces, enable streaming to receive tokens as they are generated. Set `stream=True` when calling `Session.ask`:

```python
# Stream tokens for interactive UIs
for msg in session.ask("Summarize the liability section.", stream=True):
    print(msg.content, end="", flush=True)
```

The SDK handles Server-Sent Events (SSE) internally, yielding `Message` instances as each chunk arrives from the server. This implementation resides in the `Session` class within [`session.py`](https://github.com/infiniflow/ragflow/blob/main/session.py).
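For readers curious about what SSE handling involves, here is a minimal sketch of decoding `data:` lines into JSON payloads. It illustrates the general technique only: the function name `iter_sse_payloads` is hypothetical, the `answer` field is an illustrative placeholder, and the SDK's real parsing code and wire format may differ.

```python
import json


def iter_sse_payloads(lines):
    """Yield decoded JSON payloads from an iterable of SSE text lines.

    Only "data:" lines are considered; blank keep-alive lines, comments, and
    lines whose body is not valid JSON are skipped.
    """
    for line in lines:
        if not line or not line.startswith("data:"):
            continue
        body = line[len("data:"):].strip()
        try:
            yield json.loads(body)
        except json.JSONDecodeError:
            continue  # e.g. a "[DONE]" style sentinel, ignored in this sketch


raw_lines = [
    'data: {"answer": "The liability"}',
    "",
    'data: {"answer": "The liability cap is 1x fees."}',
    "data: [DONE]",
]
chunks = list(iter_sse_payloads(raw_lines))
```

Because the function is a generator, each payload becomes available as soon as its line arrives, which is the same property that lets a streaming loop update a UI incrementally.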

## Deploying Custom Agents with DSL

Beyond simple chats, the RAGFlow Python SDK supports graph-based **Agent** workflows defined via Domain Specific Language (DSL). The `Agent` class in [`sdk/python/ragflow_sdk/modules/agent.py`](https://github.com/infiniflow/ragflow/blob/main/sdk/python/ragflow_sdk/modules/agent.py) enables complex multi-step reasoning.

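```python
# Define a custom workflow using DSL.
# This is a skeleton: "begin" points at an "Answer" component that a
# complete workflow would also need to define.
dsl = {
    "components": {
        "begin": {"downstream": ["Answer"], "obj": {"component_name": "Begin", "params": {}}}
    },
    "graph": {"nodes": [], "edges": []}
}

# Create the agent. The SDK reference documents create_agent as returning
# nothing, so fetch the new agent by title afterwards.
rag.create_agent(title="Custom Research Agent", dsl=dsl)
agent = rag.list_agents(title="Custom Research Agent")[0]

# Start an agent session
agent_session = agent.create_session()
for msg in agent_session.ask("Explain the pricing model.", stream=False):
    print(msg.content)
```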

The DSL structure defines components and their relationships, allowing you to construct sophisticated reasoning chains beyond standard RAG retrieval.
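Because a DSL is a plain dictionary, it is easy to sanity-check before sending it to the server. The sketch below looks for `downstream` entries that point at components that are never defined. The helper `find_dangling_edges` is hypothetical, and it assumes that `downstream` names refer to keys under `components`.

```python
def find_dangling_edges(dsl):
    """Return (source, target) pairs whose target is not a defined component.

    Assumes each entry under "components" may list successor names in a
    "downstream" list and that those names refer to other component keys.
    """
    components = dsl.get("components", {})
    dangling = []
    for name, spec in components.items():
        for target in spec.get("downstream", []):
            if target not in components:
                dangling.append((name, target))
    return dangling


example_dsl = {
    "components": {
        "begin": {"downstream": ["Answer"], "obj": {"component_name": "Begin", "params": {}}}
    },
    "graph": {"nodes": [], "edges": []},
}
problems = find_dangling_edges(example_dsl)
```

For this minimal DSL the check reports that `begin` points at an `Answer` component with no definition, which suggests the example is a skeleton rather than a complete workflow.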

## Persisting Long-Term Memory

For applications requiring cross-session context, the SDK provides memory management through `create_memory`, `add_message`, and `search_message` methods implemented in [`sdk/python/ragflow_sdk/ragflow.py`](https://github.com/infiniflow/ragflow/blob/main/sdk/python/ragflow_sdk/ragflow.py) (lines 94-122).

```python
# Create a memory store
memory = rag.create_memory(
    name="conversation_memory",
    memory_type=["embedding"],
    embd_id="text-embedding-ada-002",
    llm_id="gpt-3.5-turbo"
)

# Store interaction context
rag.add_message(
    memory_id=[memory.id],
    agent_id=agent.id,
    session_id=agent_session.id,
    user_input="What are the payment terms?",
    agent_response="The payment terms are 30 days net."
)

# Retrieve relevant historical context
hits = rag.search_message(
    query="payment terms",
    memory_id=[memory.id],
    top_n=5
)
```

This persistence layer enables semantic search across previous interactions, allowing your custom RAG applications to maintain context across multiple sessions.

## Summary

- **The RAGFlow Python SDK** wraps the REST API in [`sdk/python/ragflow_sdk/ragflow.py`](https://github.com/infiniflow/ragflow/blob/main/sdk/python/ragflow_sdk/ragflow.py), providing a `RAGFlow` client class that handles authentication and HTTP operations.
- **Domain objects** like `DataSet`, `Chat`, and `Agent` inherit from a `Base` class in [`modules/base.py`](https://github.com/infiniflow/ragflow/blob/main/sdk/python/ragflow_sdk/modules/base.py), automatically mapping JSON responses to Python attributes.
- **Document workflows** use `create_dataset`, `upload_documents`, and `parse_documents` to ingest and process files for retrieval.
- **Conversational AI** is implemented through `Chat.create_session` and `Session.ask`, with support for streaming via the `stream=True` parameter in [`session.py`](https://github.com/infiniflow/ragflow/blob/main/sdk/python/ragflow_sdk/modules/session.py).
- **Advanced workflows** utilize the `Agent` class with custom DSL definitions for graph-based reasoning.
- **Long-term memory** persists context across sessions using `create_memory` and `search_message` methods.

## Frequently Asked Questions

### How do I install the RAGFlow Python SDK?

Install the SDK from PyPI with `pip install ragflow-sdk`; its source lives under `sdk/python` in the `infiniflow/ragflow` repository. Check the package's PyPI metadata for the Python versions it supports. The streaming interface used in this guide is a regular synchronous generator, so no `asyncio` setup is needed.

### What is the difference between Chat and Agent in the RAGFlow SDK?

**Chat** objects in [`modules/chat.py`](https://github.com/infiniflow/ragflow/blob/main/sdk/python/ragflow_sdk/modules/chat.py) provide standard retrieval-augmented generation against fixed datasets, ideal for Q&A bots. **Agent** objects in [`modules/agent.py`](https://github.com/infiniflow/ragflow/blob/main/sdk/python/ragflow_sdk/modules/agent.py) support custom DSL-defined workflows with multiple components and branching logic, enabling complex multi-step reasoning beyond simple document retrieval.

### How does the SDK handle authentication and errors?

The `RAGFlow` class stores your API key and base URL upon initialization, passing the key in the Authorization header for every request. The `post`, `get`, `put`, and `delete` helpers in [`ragflow.py`](https://github.com/infiniflow/ragflow/blob/main/sdk/python/ragflow_sdk/ragflow.py) issue the requests, and failures surface as Python exceptions, so you can handle authentication failures or server errors with standard try-except blocks.
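As an illustration of that try-except pattern, the sketch below wraps any callable in a small fixed-delay retry loop. The helper `call_with_retry` is hypothetical and not part of the SDK. It catches the broad `Exception` type on the assumption that failures arrive as generic exceptions, which you may want to narrow for your SDK version.

```python
import time


def call_with_retry(operation, attempts=3, delay=1.0, sleep=time.sleep):
    """Run operation(), retrying on any Exception with a fixed delay.

    Catching the broad Exception type is deliberate here: this sketch assumes
    failures surface as generic exceptions. Narrow it if your SDK version
    raises more specific types.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception as exc:
            last_error = exc
            if attempt < attempts:
                sleep(delay)
    raise RuntimeError(f"Gave up after {attempts} attempts") from last_error


# Simulated flaky call: fails once, then succeeds.
_outcomes = iter([ConnectionError("temporary"), "ok"])


def _flaky():
    item = next(_outcomes)
    if isinstance(item, Exception):
        raise item
    return item


result = call_with_retry(_flaky, sleep=lambda s: None)
```

A call such as `call_with_retry(lambda: rag.list_datasets())` would apply the same loop to a real request. Retrying only helps with transient failures; an invalid API key will fail on every attempt and is better reported immediately.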

### Can I use custom embedding models and LLMs with the SDK?

Yes. When creating datasets via `create_dataset`, specify any embedding model available in your RAGFlow instance using the `embedding_model` parameter. Similarly, `create_memory` accepts `embd_id` and `llm_id` parameters to configure specific models for vectorization and generation, allowing full customization of your RAG pipeline's underlying models.