# How RAGFlow Implements Grounded Citations for Retrieved Information: A Technical Deep Dive

> Discover how RAGFlow ensures traceable answers with grounded citations. Learn about its three-stage pipeline: prompt templates, lexical chunk matching, and LLM enrichment.

- Repository: [InfiniFlow/ragflow](https://github.com/infiniflow/ragflow)
- Tags: deep-dive
- Published: 2026-02-23

---

**RAGFlow implements grounded citations through a three-stage pipeline that combines static prompt templates, lexical chunk matching via `insert_citations()`, and LLM-driven citation enrichment to ensure every answer is traceable to its source documents.**

Grounded citations transform opaque AI responses into verifiable, trustworthy outputs by anchoring every claim to its original source. In the `infiniflow/ragflow` open-source repository, the implementation of grounded citations for retrieved information relies on a tightly coupled architecture that binds retrieval results to generated answers through explicit source markers and LLM formatting instructions.

## The Three-Stage Citation Pipeline

RAGFlow’s citation system operates as a sequential pipeline that bridges retrieval and generation. The goal is that each sentence in the final answer can be mapped back to specific document chunks or knowledge graph entities; a sentence with no matching source receives no marker.

### Stage 1: Citation Prompt Definition

The foundation of RAGFlow’s citation system rests on **static prompt templates** that enforce strict formatting rules. The files [`rag/prompts/citation_prompt.md`](https://github.com/infiniflow/ragflow/blob/main/rag/prompts/citation_prompt.md) and [`rag/prompts/citation_plus.md`](https://github.com/infiniflow/ragflow/blob/main/rag/prompts/citation_plus.md) define the exact citation format the LLM must follow, including instructions to place citations at the end of sentences and limit them to a maximum of four per sentence.
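
The per-sentence limit is easy to check mechanically. The following is a minimal sketch, not part of RAGFlow: it assumes final citations are numeric markers such as `[1]`, uses a naive punctuation-based sentence splitter, and the function name `over_cited_sentences` is hypothetical.

```python
import re

MAX_PER_SENTENCE = 4  # the limit stated in the prompt rules described above
CITATION = re.compile(r"\[\d+\]")  # assumed final format: [1], [2], ...


def over_cited_sentences(text: str, limit: int = MAX_PER_SENTENCE) -> list:
    """Return the sentences that carry more than `limit` citation markers."""
    # Naive split: break on whitespace that follows '.', '!' or '?'.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if len(CITATION.findall(s)) > limit]


sample = "Claim one [1][2]. Claim two [1][2][3][4][5]."
violations = over_cited_sentences(sample)
print(violations)
```

A check like this could run on generated answers as a regression test for the prompt, since a prompt rule alone does not force the model to comply.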

These templates are loaded by the prompt generator in [`rag/prompts/generator.py`](https://github.com/infiniflow/ragflow/blob/main/rag/prompts/generator.py) and exposed through two helper functions: `citation_prompt()` and `citation_plus()` (lines 159-191). The `citation_prompt()` function retrieves the base style guide, while `citation_plus()` loads the instruction set that asks the LLM to replace placeholder markers with properly formatted citations.
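
To make the template-loading idea concrete, here is a minimal sketch of a helper that reads a markdown prompt from disk. It is an illustration under stated assumptions, not the code in `generator.py`: the name `load_prompt`, its parameters, and the directory layout are all hypothetical.

```python
from pathlib import Path


def load_prompt(name: str, prompt_dir) -> str:
    """Read `<name>.md` from `prompt_dir` and return its text, stripped.

    Hypothetical helper: RAGFlow's own loader may differ in name and behavior.
    """
    path = Path(prompt_dir) / f"{name}.md"
    return path.read_text(encoding="utf-8").strip()


def build_system_prompt(base: str, prompt_dir) -> str:
    """Append the citation style guide to a base system prompt."""
    return base.rstrip("\n") + "\n" + load_prompt("citation_prompt", prompt_dir)
```

Keeping the prompt text in markdown files means the citation rules can be reviewed and versioned separately from the Python code that loads them.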

### Stage 2: Retrieval and Chunk Matching

When a user query enters the system, the `Retriever` class—implemented in [`rag/nlp/search.py`](https://github.com/infiniflow/ragflow/blob/main/rag/nlp/search.py)—returns the most relevant text chunks alongside their vector similarity scores. After the LLM generates a raw answer, the method `insert_citations()` (lines 177-190) processes the text sentence-by-sentence.

This method performs **lexical overlap analysis** to match each sentence against the retrieved chunks. When a match is detected, `insert_citations()` injects a citation marker directly into the text. The system uses two distinct marker types:
- **[DC]** markers for document chunks, formatted as `[DC] <file_path>`
- **[KG]** markers for knowledge graph entities, formatted as `[KG] <entity_id>`

These markers serve as temporary placeholders that survive the initial generation phase but require further processing to become human-readable citations.
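
The matching step can be sketched as follows. This is a simplified illustration of lexical overlap, not the body of `insert_citations()`: the token-set scoring, the `0.5` threshold, the marker placement after the sentence, and the names `tokens`, `overlap`, and `mark_sentences` are all assumptions, and only `[DC]` markers are handled.

```python
import re


def tokens(text: str) -> set:
    """Lowercased word tokens of a text."""
    return set(re.findall(r"\w+", text.lower()))


def overlap(sentence: str, chunk: str) -> float:
    """Fraction of the sentence's tokens that also occur in the chunk."""
    s = tokens(sentence)
    return len(s & tokens(chunk)) / len(s) if s else 0.0


def mark_sentences(answer: str, chunks: dict, threshold: float = 0.5) -> str:
    """Append `[DC] <file_path>` to sentences whose best chunk clears the threshold.

    `chunks` maps a file path to the chunk text retrieved from that file.
    """
    marked = []
    for sentence in re.split(r"(?<=[.!?])\s+", answer.strip()):
        if not sentence:
            continue
        best_path, best_score = None, 0.0
        for path, text in chunks.items():
            score = overlap(sentence, text)
            if score > best_score:
                best_path, best_score = path, score
        if best_path is not None and best_score >= threshold:
            sentence = f"{sentence} [DC] {best_path}"
        marked.append(sentence)
    return " ".join(marked)


chunks = {
    "docs/a.txt": "RAGFlow is an open-source RAG engine.",
    "docs/b.txt": "Bananas are yellow fruit.",
}
answer = "RAGFlow is an open-source engine. The weather is nice today."
print(mark_sentences(answer, chunks))
# RAGFlow is an open-source engine. [DC] docs/a.txt The weather is nice today.
```

The second sentence shares only one token with any chunk, so it falls below the threshold and stays unmarked.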

### Stage 3: LLM-Driven Citation Enrichment

The raw answer containing placeholder markers is not returned directly to the user. Instead, the system performs a second LLM invocation for **citation formatting**. This step is orchestrated in [`api/db/services/dialog_service.py`](https://github.com/infiniflow/ragflow/blob/main/api/db/services/dialog_service.py) (lines 607-631): the service concatenates the system prompt with the output of `citation_prompt()` for the initial generation, then streams a second response through the LLM for enrichment.

During this phase, the `citation_plus` prompt instructs the model to replace all `[DC]` and `[KG]` placeholders with numbered citations (e.g., `[1]`, `[2]`) and append a "References" section listing the concrete sources. The final streamed response contains the fully formatted answer with clickable references to supporting documents, as documented in the project’s README (lines 121-124).
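
To show the shape of the transformation the model is asked to perform, here is a deterministic sketch of the same rewrite. RAGFlow delegates this step to the LLM, so the code below is only an illustration of the expected output; the function name `number_citations` and the exact layout of the "References" section are assumptions.

```python
import re

# Matches "[DC] <file_path>" or "[KG] <entity_id>"; the source is taken to be
# the next run of non-whitespace characters (so trailing punctuation would be
# captured too, which a real implementation would need to handle).
MARKER = re.compile(r"\[(DC|KG)\]\s+(\S+)")


def number_citations(text: str) -> str:
    """Replace placeholder markers with [n] and append a References list."""
    sources = []  # (kind, source) pairs in order of first appearance

    def replace(match):
        key = (match.group(1), match.group(2))
        if key not in sources:
            sources.append(key)
        return f"[{sources.index(key) + 1}]"

    body = MARKER.sub(replace, text)
    if not sources:
        return body
    refs = "\n".join(
        f"[{i}] {kind}: {src}" for i, (kind, src) in enumerate(sources, 1)
    )
    return f"{body}\n\nReferences\n{refs}"


marked = (
    "RAGFlow is open source. [DC] docs/a.txt "
    "It links entities. [KG] entity_42 "
    "It is open. [DC] docs/a.txt"
)
print(number_citations(marked))
```

Repeated sources reuse the same number, so the first and third sentences above both receive `[1]`.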

## Implementation Details and Code Example

The following simplified Python pattern sketches the citation workflow described above. `MyRetriever` and `llm` are placeholders for a retriever and an LLM client, not RAGFlow classes:

```python
from rag.prompts.generator import citation_prompt, citation_plus

# NOTE: `MyRetriever` and `llm` are placeholders for your own retriever and
# LLM client; they are not RAGFlow classes.
query = "How does RAGFlow cite its sources?"

# 1. Build the system prompt with the citation guidelines
system_prompt = "You are a helpful assistant.\n"
system_prompt += citation_prompt()

# 2. Retrieve relevant chunks (simplified)
retriever = MyRetriever(...)
chunks, vectors = retriever.search(query)

# 3. Generate a raw answer (LLM call simplified)
raw_answer = llm.generate(system_prompt, query)

# 4. Insert placeholder citations based on chunk overlap
answer_with_marks, idx = retriever.insert_citations(
    raw_answer, chunks, vectors
)

# 5. Ask the LLM to replace placeholders with formatted citations.
#    `answer_with_marks` is assumed to be a single string here.
final_answer = llm.generate(
    citation_plus(answer_with_marks),  # wraps the marked answer in the "add citations" prompt
    answer_with_marks,
)

print(final_answer)  # answer with "[1] ..." markers and a "References" list at the end
```

## Key Source Files and Their Roles

Understanding the grounded citations implementation requires familiarity with these specific components:

- **[`rag/prompts/citation_prompt.md`](https://github.com/infiniflow/ragflow/blob/main/rag/prompts/citation_prompt.md)** – Defines the base citation style guide that constrains how the LLM formats source references.

- **[`rag/prompts/citation_plus.md`](https://github.com/infiniflow/ragflow/blob/main/rag/prompts/citation_plus.md)** – Contains the instruction template that explicitly asks the LLM to replace placeholder markers with final citation numbers and generate the reference list.

- **[`rag/prompts/generator.py`](https://github.com/infiniflow/ragflow/blob/main/rag/prompts/generator.py)** – Houses the `citation_prompt()` and `citation_plus()` functions (lines 159-191) that load markdown templates and expose them to the retrieval pipeline.

- **[`rag/nlp/search.py`](https://github.com/infiniflow/ragflow/blob/main/rag/nlp/search.py)** – Implements the `insert_citations()` method (lines 177-190) that performs lexical matching between answer sentences and retrieved chunks to inject `[DC]` or `[KG]` markers.

- **[`api/db/services/dialog_service.py`](https://github.com/infiniflow/ragflow/blob/main/api/db/services/dialog_service.py)** – Orchestrates the end-to-end citation injection during chat sessions (lines 607-631), managing the transition from raw generation to citation-enriched output.

- **[`README.md`](https://github.com/infiniflow/ragflow/blob/main/README.md)** – Documents the user-facing benefits of traceable citations and the "quick view" functionality for source documents (lines 121-124).

## Summary

RAGFlow’s approach to grounded citations makes AI responses verifiable and more resistant to hallucination through a three-stage pipeline:

- **Centralized format control** via version-controlled markdown templates that standardize citation appearance across all outputs.
- **Lexical chunk matching** through `insert_citations()`, which binds specific sentences to their supporting evidence using `[DC]` and `[KG]` markers.
- **Explicit LLM formatting** using the `citation_plus` prompt to convert technical placeholders into readable reference numbers and bibliography entries.

This architecture attaches traceable provenance to generated answers, which supports user trust and enables direct verification against source materials.

## Frequently Asked Questions

### What is the difference between citation_prompt.md and citation_plus.md?

**[`citation_prompt.md`](https://github.com/infiniflow/ragflow/blob/main/rag/prompts/citation_prompt.md)** defines the static style guidelines that constrain how citations should appear in the final output, such as placement rules and maximum counts per sentence. **[`citation_plus.md`](https://github.com/infiniflow/ragflow/blob/main/rag/prompts/citation_plus.md)** contains the active instruction set that tells the LLM to perform the replacement operation: converting placeholder markers like `[DC]` into numbered citations such as `[1]` and generating the "References" section. The former is used during initial generation; the latter drives the enrichment phase.

### How does RAGFlow match answer sentences to specific chunks?

The system uses the `insert_citations()` method in [`rag/nlp/search.py`](https://github.com/infiniflow/ragflow/blob/main/rag/nlp/search.py) (lines 177-190) to perform **lexical overlap analysis**. The method iterates through the raw answer sentence-by-sentence, compares the text against the content of retrieved chunks, and injects `[DC]` or `[KG]` markers where the overlap indicates a supporting relationship. Because markers are derived from the retrieved content, citations align with the actual source documents rather than being invented by the LLM.

### What do the [DC] and [KG] citation markers represent?

**[DC]** stands for "Document Chunk" and is formatted as `[DC] <file_path>`, linking the citation to a specific segment of an uploaded document. **[KG]** stands for "Knowledge Graph" and appears as `[KG] <entity_id>`, connecting the citation to structured entities within the system’s knowledge graph. These markers serve as temporary placeholders during the generation pipeline before the LLM converts them into sequential numbered citations.

### Which component orchestrates the final citation formatting?

The **[`dialog_service.py`](https://github.com/infiniflow/ragflow/blob/main/api/db/services/dialog_service.py)** file in `api/db/services/` (lines 607-631) manages the orchestration. This service handles the conversation state, concatenates the system prompt with citation instructions, and invokes the LLM stream twice: once for the initial answer generation and again for the citation enrichment phase using the `citation_plus()` prompt template. This centralization keeps citation handling consistent across chat interactions in the RAGFlow system.