# LiteParse DPI Settings for OCR: Recommended Configurations by Document Type

> Discover recommended LiteParse DPI settings for optimal OCR performance across various document types. Optimize your scanned documents with these expert configurations.

- Repository: [run-llama/liteparse](https://github.com/run-llama/liteparse)
- Tags: best-practices
- Published: 2026-05-31

---

**LiteParse defaults to 150 DPI for standard PDF processing, recommends 300 DPI for scanned documents requiring OCR, and can be raised to 400–600 DPI for low-quality scans or densely formatted layouts.**

LiteParse is an open-source PDF parsing library that converts PDF pages to raster images before optical character recognition (OCR). The **DPI setting** (dots per inch) determines the pixel density of these rendered bitmaps, directly influencing both recognition accuracy and system resource consumption. Choosing a DPI value suited to each document type helps balance text extraction quality against processing cost.

## How DPI Controls OCR Quality in LiteParse

LiteParse's rendering pipeline relies on DPI to calculate the scale factor for PDFium rasterization. In [`crates/liteparse/src/config.rs`](https://github.com/run-llama/liteparse/blob/main/crates/liteparse/src/config.rs), the `LiteParseConfig` struct defines a default DPI of **150** at lines 19-49, which provides a baseline balance between processing speed and visual quality.

When OCR is triggered, [`crates/liteparse/src/ocr_merge.rs`](https://github.com/run-llama/liteparse/blob/main/crates/liteparse/src/ocr_merge.rs) calls `render_pages_for_ocr`, forwarding the configured DPI to the rendering layer. The actual rasterization occurs in [`crates/liteparse/src/render.rs`](https://github.com/run-llama/liteparse/blob/main/crates/liteparse/src/render.rs) (lines 20-24), where the scale factor is computed as `dpi / 72.0` (PDF page dimensions are expressed in points, at 72 points per inch). Higher DPI values therefore increase the pixel dimensions of the generated bitmap proportionally, giving OCR engines more detail for character recognition. The same setting also affects screenshot API output.
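To make the `dpi / 72.0` relationship concrete, the following Python sketch computes the bitmap dimensions for a US Letter page (612 × 792 points) at several DPI values. The helper names `scale_factor` and `bitmap_size` are illustrative and not part of LiteParse, and rounding to the nearest pixel is an assumption; the actual renderer may round differently.

```python
POINTS_PER_INCH = 72.0  # PDF page sizes are expressed in points


def scale_factor(dpi: float) -> float:
    """Scale applied to page dimensions in points, as described above."""
    return dpi / POINTS_PER_INCH


def bitmap_size(width_pt: float, height_pt: float, dpi: float) -> tuple[int, int]:
    """Approximate bitmap size in pixels (rounding is an assumption)."""
    s = scale_factor(dpi)
    return round(width_pt * s), round(height_pt * s)


# US Letter: 8.5 x 11 inches = 612 x 792 points
for dpi in (150, 300, 600):
    print(dpi, bitmap_size(612, 792, dpi))
# 150 (1275, 1650)
# 300 (2550, 3300)
# 600 (5100, 6600)
```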

## Recommended DPI Settings for Different OCR Scenarios

LiteParse recommends adjusting DPI based on document characteristics and accuracy requirements:

### Standard PDFs with Selectable Text: 150 DPI

For PDFs containing native text layers where OCR is rarely needed, the default **150 DPI** (as documented in [`docs/src/content/docs/liteparse/cli-reference.md`](https://github.com/run-llama/liteparse/blob/main/docs/src/content/docs/liteparse/cli-reference.md), lines 37-39) provides sufficient resolution for screenshots and occasional OCR while minimizing memory and CPU usage.

### Scanned Documents with Clear, Medium-Size Fonts: 300 DPI

When processing scanned documents, **300 DPI** doubles the linear resolution of the 150 DPI default (four times the pixel count) and significantly improves character recognition rates. The CLI reference at [`docs/src/content/docs/liteparse/cli-reference.md`](https://github.com/run-llama/liteparse/blob/main/docs/src/content/docs/liteparse/cli-reference.md) (line 54) demonstrates this configuration in a high-DPI rendering example with French OCR. Additionally, the library usage guide at `docs/src/content/docs/liteparse/guides/library-usage.mdx` (lines 65, 240, and 322) consistently shows `dpi: 300` for OCR-intensive workflows.

### Low-Resolution Scans, Tiny Fonts, or Dense Tables: 400–600 DPI

For challenging documents with small text, noisy scans, or densely packed tables, **400–600 DPI** captures finer details that reduce character misrecognition. This setting trades increased RAM consumption and longer processing times for maximum accuracy in difficult OCR scenarios.

### High-Quality Screenshots for Visual Citations: 300+ DPI

When generating presentation-grade images for visual citations, **300 DPI** or higher provides crisp output while maintaining reasonable file sizes. This practice is reflected in the agent-skill guide at [`docs/src/content/docs/liteparse/guides/agent-skill.md`](https://github.com/run-llama/liteparse/blob/main/docs/src/content/docs/liteparse/guides/agent-skill.md) (line 28), which demonstrates screenshotting pages at 300 DPI.

### Batch Processing Large Archives: 150 DPI or Adaptive

For high-volume processing where speed is critical, maintain the **150 DPI** default or implement adaptive logic that starts at 150 DPI and escalates to 300 DPI only for pages where native text extraction fails.
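The scenarios in this section can be collected into a small lookup that an application consults before constructing a parser. This is only a sketch: the scenario keys and the `recommended_dpi` helper are hypothetical names, and the ranges simply restate the values recommended above.

```python
# Scenario labels are hypothetical; (low, high) ranges restate the guidance above.
DPI_PROFILES: dict[str, tuple[int, int]] = {
    "native_text": (150, 150),      # selectable text, OCR rarely needed
    "clean_scan": (300, 300),       # scanned pages with clear, medium-size fonts
    "difficult_scan": (400, 600),   # low-resolution scans, tiny fonts, dense tables
    "visual_citation": (300, 300),  # 300 or higher for presentation-grade images
    "batch_archive": (150, 300),    # start at 150, escalate only when needed
}


def recommended_dpi(scenario: str, prefer_accuracy: bool = False) -> int:
    """Return the low end of the range, or the high end when accuracy matters most."""
    low, high = DPI_PROFILES.get(scenario, (150, 150))  # unknown scenarios fall back to 150
    return high if prefer_accuracy else low


print(recommended_dpi("difficult_scan"))                        # 400
print(recommended_dpi("difficult_scan", prefer_accuracy=True))  # 600
print(recommended_dpi("unknown"))                               # 150
```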

## Performance Implications of DPI Settings

The pixel count of a rendered image grows with the square of the DPI. Moving from **150 DPI to 300 DPI** doubles each image dimension, so bitmap memory increases by approximately 4×, and OCR processing time can grow by a similar factor. Production deployments should tune this value per project to prevent resource exhaustion, particularly when processing large documents or high page counts.
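A rough back-of-the-envelope estimate shows how quickly per-page bitmap memory grows. The sketch below assumes an uncompressed bitmap at 4 bytes per pixel (for example RGBA or BGRA) and a US Letter page; the pixel format LiteParse actually uses is not confirmed here, and the helper name `bitmap_bytes` is illustrative.

```python
def bitmap_bytes(width_in: float, height_in: float, dpi: int, bytes_per_pixel: int = 4) -> int:
    """Estimated size of one uncompressed page bitmap (4 bytes per pixel is an assumption)."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bytes_per_pixel


baseline = bitmap_bytes(8.5, 11, 150)  # US Letter at the 150 DPI default
for dpi in (150, 300, 400, 600):
    size = bitmap_bytes(8.5, 11, dpi)
    print(f"{dpi} DPI: {size / 1_000_000:.1f} MB ({size / baseline:.1f}x the default)")
# 150 DPI: 8.4 MB (1.0x the default)
# 300 DPI: 33.7 MB (4.0x the default)
# 400 DPI: 59.8 MB (7.1x the default)
# 600 DPI: 134.6 MB (16.0x the default)
```

These figures cover a single page bitmap only; OCR engines and intermediate buffers add their own overhead on top.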

## Implementation Examples

### Node.js (npm wrapper)

The JavaScript wrapper in [`packages/node/src/lib.ts`](https://github.com/run-llama/liteparse/blob/main/packages/node/src/lib.ts) (line 98) applies the default with `resolved.dpi ?? 150`:

```javascript
import { LiteParse } from "liteparse";

// Default DPI (150) for quick parsing
const parser = new LiteParse({ outputFormat: "json" });
await parser.parse("document.pdf");

// High resolution for scanned documents
const highResParser = new LiteParse({ outputFormat: "json", dpi: 300 });
await highResParser.parse("scanned.pdf");
```

### Python (PyO3 wrapper)

The Python wrapper in [`packages/python/liteparse/parser.py`](https://github.com/run-llama/liteparse/blob/main/packages/python/liteparse/parser.py) (lines 90-111) forwards the DPI parameter to the Rust core:

```python
from liteparse import LiteParse

# Default configuration (150 DPI)
parser = LiteParse()
result = parser.parse("report.pdf")

# High-DPI OCR for French-language documents
parser_high = LiteParse(dpi=300, ocr_language="fra")
result_high = parser_high.parse("french_scanned.pdf")
```

### CLI

The CLI supports DPI configuration via the `--dpi` flag, documented in [`docs/src/content/docs/liteparse/cli-reference.md`](https://github.com/run-llama/liteparse/blob/main/docs/src/content/docs/liteparse/cli-reference.md):

```bash
# Quick parse with default 150 DPI
lit parse invoice.pdf

# High-resolution OCR for complex layouts
lit parse report.pdf --dpi 300 --ocr-language eng
```

## Summary

- **LiteParse defaults to 150 DPI** in [`crates/liteparse/src/config.rs`](https://github.com/run-llama/liteparse/blob/main/crates/liteparse/src/config.rs), suitable for standard PDFs with selectable text.
- **300 DPI is recommended for scanned documents** and appears throughout the documentation as the standard for OCR-intensive tasks.
- **400–600 DPI addresses edge cases** with tiny fonts or dense tables, at the cost of roughly 7× to 16× the pixel count (and bitmap memory) of the 150 DPI default.
- The rendering pipeline computes scale factors as `dpi / 72.0` in [`crates/liteparse/src/render.rs`](https://github.com/run-llama/liteparse/blob/main/crates/liteparse/src/render.rs), affecting both OCR and screenshot quality.
- All language wrappers (Node.js, Python) and the CLI respect the same DPI configuration parameter.

## Frequently Asked Questions

### What is the default DPI setting in LiteParse?

LiteParse uses a default DPI of **150**, defined in the `LiteParseConfig` struct within [`crates/liteparse/src/config.rs`](https://github.com/run-llama/liteparse/blob/main/crates/liteparse/src/config.rs) (lines 19-49). This value is applied across the Node.js and Python wrappers when no specific DPI is provided, ensuring consistent behavior between library usage and CLI execution.

### How does DPI affect OCR accuracy in LiteParse?

Higher DPI settings increase the pixel density of rendered bitmaps, providing OCR engines with more detail for character recognition. According to the rendering implementation in [`crates/liteparse/src/render.rs`](https://github.com/run-llama/liteparse/blob/main/crates/liteparse/src/render.rs), DPI directly determines the scale factor (calculated as `dpi / 72.0`), meaning 300 DPI produces images with twice the linear resolution and four times the pixel count of 150 DPI, significantly improving recognition accuracy on scanned documents.

### Can I use different DPI settings for different pages in the same document?

LiteParse applies DPI at the parser configuration level, meaning a single DPI value is used for an entire processing session. For adaptive processing, implement logic at the application layer that retries failed pages with higher DPI values, starting with the default 150 DPI and escalating to 300 DPI or higher only when native text extraction or initial OCR fails.
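One way to sketch this application-layer escalation is shown below. The function `parse_with_escalation`, the `parse_at_dpi` callable, and the `min_chars` threshold are all hypothetical; in a real application `parse_at_dpi` would wrap something like `LiteParse(dpi=dpi).parse(path)` and extract the text from its result, whose exact shape is not assumed here. Because a single DPI applies to a whole session, this sketch retries the entire document rather than individual pages.

```python
from typing import Callable, Sequence


def parse_with_escalation(
    path: str,
    parse_at_dpi: Callable[[str, int], str],  # hypothetical: returns extracted text
    dpi_ladder: Sequence[int] = (150, 300, 600),
    min_chars: int = 20,  # hypothetical threshold for "extraction succeeded"
) -> tuple[int, str]:
    """Try each DPI in order and stop at the first one that yields enough text."""
    used_dpi, text = dpi_ladder[0], ""
    for dpi in dpi_ladder:
        used_dpi = dpi
        text = parse_at_dpi(path, dpi)
        if len(text.strip()) >= min_chars:
            break  # good enough, avoid the cost of a higher DPI
    return used_dpi, text


# Stand-in parser for demonstration: pretends OCR only succeeds at 300 DPI or above.
def fake_parse(path: str, dpi: int) -> str:
    return "Invoice total: 1,234.00 EUR" if dpi >= 300 else ""


print(parse_with_escalation("scanned.pdf", fake_parse))
# (300, 'Invoice total: 1,234.00 EUR')
```

A character-count threshold is a deliberately simple success check; a production system might instead use OCR confidence scores if they are available.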

### Where is the DPI setting stored in the LiteParse source code?

The DPI configuration resides in [`crates/liteparse/src/config.rs`](https://github.com/run-llama/liteparse/blob/main/crates/liteparse/src/config.rs) as a field on the `LiteParseConfig` struct, with the default value of 150 defined at lines 19-49. This value propagates through [`crates/liteparse/src/ocr_merge.rs`](https://github.com/run-llama/liteparse/blob/main/crates/liteparse/src/ocr_merge.rs) to the rendering functions in [`crates/liteparse/src/render.rs`](https://github.com/run-llama/liteparse/blob/main/crates/liteparse/src/render.rs), where it is converted to a scaling factor for PDFium rasterization.