# Debugging RAGFlow Applications: 7 Essential Techniques for Micro-Service Troubleshooting

> Master RAGFlow application debugging with 7 essential techniques. Learn to use centralized logging, control verbosity, and attach remote debuggers for efficient micro-service troubleshooting.

- Repository: [InfiniFlow/ragflow](https://github.com/infiniflow/ragflow)
- Tags: best-practices
- Published: 2026-02-23

---

**Effective debugging of RAGFlow applications relies on leveraging the centralized `init_root_logger` system, controlling verbosity dynamically via the `LOG_LEVELS` environment variable, and attaching remote debuggers using the built-in `debugpy` integration when step-through inspection is required.**

RAGFlow is a complex, micro-service-based Retrieval-Augmented Generation engine maintained by InfiniFlow. When failures occur across its distributed architecture, from the API server to background task executors, systematic debugging practices grounded in the source code are essential for rapid resolution. This guide covers the debugging primitives implemented in the repository.

## Initialize Centralized Logging with init_root_logger

All long-running processes in RAGFlow must establish consistent logging early in their lifecycle. In [`api/ragflow_server.py`](https://github.com/infiniflow/ragflow/blob/main/api/ragflow_server.py) at line 78, the server initializes logging by calling:

```python
init_root_logger("ragflow_server")
```

This function, defined in [`common/log_utils.py`](https://github.com/infiniflow/ragflow/blob/main/common/log_utils.py) (lines 25-44), configures a **rotating file handler** that writes to `logs/ragflow_server.log` alongside a console stream handler. Each entry includes timestamps, process IDs, and severity levels, enabling you to trace request flows across the API server, admin service, and background workers. Centralizing logs in a single location per service eliminates fragmentation when correlating events across the micro-service architecture.
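As a rough illustration of what such a setup involves, the sketch below attaches a rotating file handler and a console handler to the root logger using only the standard library. The function name `setup_root_logger_sketch`, the size limit, the backup count, and the format string are assumptions chosen for illustration; they are not taken from RAGFlow's implementation.

```python
import logging
import os
from logging.handlers import RotatingFileHandler


def setup_root_logger_sketch(name: str, log_dir: str = "logs") -> str:
    """Attach a rotating file handler and a console handler to the root logger.

    Hypothetical helper: the limits and format below are illustrative only.
    Returns the absolute path of the log file.
    """
    os.makedirs(log_dir, exist_ok=True)
    log_path = os.path.abspath(os.path.join(log_dir, f"{name}.log"))

    root = logging.getLogger()
    # Drop existing handlers so repeated calls do not duplicate output.
    for handler in list(root.handlers):
        root.removeHandler(handler)
        handler.close()

    # Timestamp, severity, and process ID on every line.
    formatter = logging.Formatter("%(asctime)s %(levelname)-8s %(process)d %(message)s")

    # Rotate at roughly 10 MB and keep five old files (assumed values).
    file_handler = RotatingFileHandler(log_path, maxBytes=10 * 1024 * 1024, backupCount=5)
    file_handler.setFormatter(formatter)
    root.addHandler(file_handler)

    console_handler = logging.StreamHandler()
    console_handler.setFormatter(formatter)
    root.addHandler(console_handler)

    root.setLevel(logging.INFO)
    return log_path
```

Because both handlers share one formatter, a line seen on the console can be matched exactly against the file when correlating events after the fact.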

## Control Log Verbosity Dynamically via LOG_LEVELS

Rather than modifying source code to adjust granularity, use the `LOG_LEVELS` environment variable to set package-specific verbosity at runtime. As implemented in [`common/log_utils.py`](https://github.com/infiniflow/ragflow/blob/main/common/log_utils.py) (lines 48-67), this variable accepts comma-separated package assignments:

```bash
export LOG_LEVELS=peewee=DEBUG,rag=INFO,root=INFO
```

Setting `peewee=DEBUG` surfaces all database queries without flooding logs with root-level noise. Configure this in your `.env` file or `docker-compose` environment to debug connection issues or slow queries in production without redeploying code.
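To make the format concrete, here is a minimal sketch of how a `package=LEVEL` list could be parsed and applied with the standard `logging` module. The helper names `parse_log_levels` and `apply_log_levels` are hypothetical, and the decision to skip malformed entries silently is an assumption; RAGFlow's own parser may behave differently.

```python
import logging


def parse_log_levels(spec: str) -> dict[str, int]:
    """Parse 'pkg=LEVEL,pkg=LEVEL' into {package: numeric level}.

    Hypothetical helper; malformed or unknown entries are skipped.
    """
    levels: dict[str, int] = {}
    for item in spec.split(","):
        package, sep, level_name = item.partition("=")
        package, level_name = package.strip(), level_name.strip().upper()
        if not sep or not package:
            continue
        level = logging.getLevelName(level_name)
        # getLevelName returns an int only for recognized level names.
        if isinstance(level, int):
            levels[package] = level
    return levels


def apply_log_levels(spec: str) -> None:
    """Set each named logger to its requested level."""
    for package, level in parse_log_levels(spec).items():
        # Treat the name "root" as the root logger.
        logger = logging.getLogger() if package == "root" else logging.getLogger(package)
        logger.setLevel(level)
```

Calling `apply_log_levels("peewee=DEBUG,rag=INFO")` would raise the detail of one package while leaving every other logger at its current level.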

## Capture Full Exception Tracebacks

RAGFlow's exception-handling patterns preserve full stack traces. The `log_exception` helper appears throughout the codebase: it is used in [`tools/firecrawl/firecrawl_connector.py`](https://github.com/infiniflow/ragflow/blob/main/tools/firecrawl/firecrawl_connector.py) (line 236) and imported in [`rag/svr/task_executor.py`](https://github.com/infiniflow/ragflow/blob/main/rag/svr/task_executor.py) (line 41).

When called from inside an `except` block, `logging.exception` logs the message at `ERROR` level and appends the traceback of the exception being handled. Custom context objects may include a `text` attribute for additional debugging metadata. Implement this pattern in your own extensions:

```python
import logging
from common.log_utils import init_root_logger

init_root_logger("my_plugin")
logger = logging.getLogger(__name__)

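def retrieve_chunks():
    """Hypothetical placeholder for your plugin's retrieval logic."""
    raise NotImplementedError

def process_document():
    try:
        # Critical retrieval logic
        retrieve_chunks()
    except Exception:
        # Logs at ERROR level and appends the full traceback.
        logger.exception("Unhandled error during chunk retrieval")
        raise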
```
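To see what `logger.exception` actually emits, the following self-contained sketch captures the log output in memory instead of a file. The logger name `traceback_demo` and the failing dictionary lookup are stand-ins invented for this demonstration.

```python
import io
import logging


def capture_exception_log() -> str:
    """Log a handled exception and return the text that was emitted."""
    buffer = io.StringIO()
    handler = logging.StreamHandler(buffer)
    logger = logging.getLogger("traceback_demo")  # hypothetical logger name
    logger.addHandler(handler)
    logger.propagate = False  # keep the demo output out of other handlers
    try:
        try:
            {}["missing_chunk"]  # stand-in for a failing retrieval step
        except KeyError:
            # Logs at ERROR level and appends the active traceback.
            logger.exception("Unhandled error during chunk retrieval")
    finally:
        logger.removeHandler(handler)
    return buffer.getvalue()


output = capture_exception_log()
print(output)
```

The captured text contains the message followed by a `Traceback (most recent call last):` section ending in the `KeyError`, which is the detail a plain `logger.error` call would omit.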

## Attach Remote Debuggers Using debugpy

For scenarios where log analysis is insufficient, RAGFlow includes built-in support for remote debugging. When the environment variable `RAGFLOW_DEBUGPY_LISTEN` is set to a non-zero port, the server initializes a `debugpy` listener before starting the main loop (see [`api/ragflow_server.py`](https://github.com/infiniflow/ragflow/blob/main/api/ragflow_server.py), lines 97-101):

```bash
export RAGFLOW_DEBUGPY_LISTEN=5678
# Run from the repository root
python api/ragflow_server.py
```

Attach VS Code using this configuration:

```json
{
  "name": "Attach to RAGFlow",
  "type": "python",
  "request": "attach",
  "connect": { "host": "localhost", "port": 5678 }
}
```

This technique is essential for diagnosing race conditions in the task executor or stepping through complex retrieval logic in the RAG pipeline.
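The environment-variable gate can be reproduced in your own entry points. The sketch below is one way to do it, assuming `debugpy` is installed; the function name `maybe_start_debugger` and the choice to treat non-numeric values as "disabled" are assumptions, not a copy of the code in `api/ragflow_server.py`.

```python
import os


def maybe_start_debugger(env_var: str = "RAGFLOW_DEBUGPY_LISTEN") -> int:
    """Start a debugpy listener if env_var holds a non-zero port.

    Returns the port in use, or 0 when debugging is disabled.
    """
    try:
        port = int(os.environ.get(env_var, "0"))
    except ValueError:
        port = 0  # treat non-numeric values as "disabled"
    if port <= 0:
        return 0

    import debugpy  # third-party; imported lazily so it stays optional

    # Bind on all interfaces so a debugger outside a container can attach.
    debugpy.listen(("0.0.0.0", port))
    return port
```

Binding to `0.0.0.0` exposes the debug port to the network, so on a shared host it may be safer to bind to `127.0.0.1` and reach it through a tunnel or an explicit port mapping.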

## Verify Runtime Configuration at Startup

Configuration errors often manifest as cryptic failures in downstream components. RAGFlow dumps runtime settings during initialization to aid debugging. In [`api/ragflow_server.py`](https://github.com/infiniflow/ragflow/blob/main/api/ragflow_server.py) (lines 94-96), the application emits the full configuration loaded from [`common/settings.py`](https://github.com/infiniflow/ragflow/blob/main/common/settings.py) when launched with the `--debug` flag:

```bash
# Run from the repository root
python api/ragflow_server.py --debug
```

This output includes host IPs, port bindings, Redis connection parameters, and feature toggles. Verify these values against your environment variables when services fail to communicate or connect to external dependencies.
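If you want a similar dump in your own service or plugin, a sketch along these lines may help. It logs the upper-case attributes of a settings object and masks anything that looks like a secret. The function name `dump_settings`, the marker list, and the example values are all hypothetical and do not describe what RAGFlow prints.

```python
import logging
from types import SimpleNamespace

# Assumed naming convention for values that should not reach the logs.
SENSITIVE_MARKERS = ("PASSWORD", "SECRET", "TOKEN", "KEY")


def dump_settings(settings: object) -> dict[str, str]:
    """Return upper-case attributes of a settings object, masking secrets."""
    dumped: dict[str, str] = {}
    for name in sorted(vars(settings)):
        if not name.isupper():
            continue  # skip helpers and private names
        value = getattr(settings, name)
        if any(marker in name for marker in SENSITIVE_MARKERS):
            value = "***"
        dumped[name] = str(value)
        logging.info("%s: %s", name, dumped[name])
    return dumped


# Hypothetical values standing in for a real settings module.
example = SimpleNamespace(HOST_IP="0.0.0.0", HOST_PORT=9380, REDIS_PASSWORD="hunter2", helper=None)
snapshot = dump_settings(example)
```

Since `vars()` also works on modules, the same function can be pointed at an imported settings module, and the returned dictionary can be diffed against the values you expect from your environment.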

## Monitor Background Workers and Signal Handling

RAGFlow relies on background workers for document processing and indexing. The `update_progress` thread acquires distributed Redis locks and logs failures via `logging.exception` (referenced at line 60 in the implementation). Check logs for "update_progress" entries to diagnose Redis connectivity issues or document indexing failures.

Additionally, verify the signal handlers to ensure graceful shutdowns. In [`api/ragflow_server.py`](https://github.com/infiniflow/ragflow/blob/main/api/ragflow_server.py) (lines 69-74), handlers for `SIGINT` and `SIGTERM` close MCP (Model Context Protocol) sessions and halt background threads. When debugging startup crashes, confirm these handlers are not suppressing errors: `logging.info` calls within the handlers indicate that the shutdown path was triggered.
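The general shape of such a handler is sketched below: it logs that it ran, then sets a `threading.Event` that background threads poll. The names `stop_event`, `handle_shutdown`, `background_worker`, and `install_signal_handlers` are illustrative assumptions, and the sketch omits the session cleanup that the real handlers perform.

```python
import logging
import signal
import threading

stop_event = threading.Event()


def handle_shutdown(signum, frame) -> None:
    """Record which signal arrived and ask background threads to stop."""
    logging.info("Received signal %d, shutting down", signum)
    stop_event.set()


def background_worker(interval: float = 0.1) -> None:
    """Loop until shutdown is requested; wait() doubles as the sleep."""
    while not stop_event.wait(interval):
        pass  # periodic work would go here


def install_signal_handlers() -> None:
    """Register the handler; must be called from the main thread."""
    signal.signal(signal.SIGINT, handle_shutdown)
    signal.signal(signal.SIGTERM, handle_shutdown)
```

Using `Event.wait` instead of `time.sleep` lets the worker exit as soon as the signal arrives, and the log line in the handler gives you a marker to search for when deciding whether a process exited through the shutdown path or crashed.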

## Leverage the Test Suite as a Debugging Harness

The `test/` directory contains an extensive pytest suite that serves as an isolated reproduction environment. When investigating bugs, run specific tests with verbose logging enabled:

```bash
uv run pytest test/testcases/test_sdk_api/test_session_management/test_create_session_with_chat_assistant.py::test_create_session_failure -vv --log-level=DEBUG
```

The `-vv` flag provides detailed assertion output, while `--log-level=DEBUG` sets the minimum level of log records that pytest captures and reports for failing tests. Add temporary `logging.debug` statements to narrow the failure scope without restarting the entire application stack.

## Summary

- **Initialize `init_root_logger`** early in every service entry point to ensure consistent log formatting and rotation to `logs/`
- **Control verbosity dynamically** via the `LOG_LEVELS` environment variable (e.g., `peewee=DEBUG`) rather than hardcoding values
- **Capture full tracebacks** using `logger.exception()` in exception handlers to preserve stack context
- **Attach remote debuggers** by setting `RAGFLOW_DEBUGPY_LISTEN` for step-through debugging of race conditions
- **Verify configuration** using the `--debug` startup flag to dump runtime settings from [`common/settings.py`](https://github.com/infiniflow/ragflow/blob/main/common/settings.py)
- **Monitor background workers** through `update_progress` thread logs and Redis lock status indicators
- **Utilize the pytest suite** with verbose flags for isolated bug reproduction without full system restarts

## Frequently Asked Questions

### How do I enable debug logging for database queries in RAGFlow?

Set the environment variable `LOG_LEVELS=peewee=DEBUG` before starting the server. This configures the peewee ORM to log all SQL queries to the rotating log file without modifying source code, as parsed in [`common/log_utils.py`](https://github.com/infiniflow/ragflow/blob/main/common/log_utils.py) (lines 48-67).

### Where does RAGFlow store its log files?

By default, logs are written to the `logs/` directory relative to the project root, with filenames derived from the service name (e.g., `logs/ragflow_server.log`). The rotating file handler defined in [`common/log_utils.py`](https://github.com/infiniflow/ragflow/blob/main/common/log_utils.py) prevents disk space exhaustion by automatically rotating files when they reach size limits.

### Can I debug a RAGFlow instance running inside a Docker container?

Yes. Expose port 5678 (or your chosen port) in your Docker Compose configuration, set the environment variable `RAGFLOW_DEBUGPY_LISTEN=5678`, and attach your debugger to the container's exposed port. Ensure your development machine has network access to the container and that the port mapping is correctly configured in [`docker-compose.yml`](https://github.com/infiniflow/ragflow/blob/main/docker-compose.yml).

### How should I handle exceptions in custom RAGFlow plugins?

Import `init_root_logger` from `common.log_utils` and initialize it with your plugin name at module load time. Wrap critical operations in try-except blocks and use `logger.exception()` to capture full stack traces before re-raising. This ensures your plugin's errors appear in the centralized logs with the same formatting and rotation rules as core RAGFlow components.