How LiteParse Processes Rotated Text: Canonical Snapping and Coordinate Transformation

LiteParse normalizes rotated text to upright orientation using a two-stage pipeline that first snaps angles to cardinal directions and then transforms coordinates to preserve visual placement while zeroing out rotation values.

LiteParse, the Rust-based PDF parsing library maintained in the run-llama/liteparse repository, extracts text from PDFs at any angle and converts it to a standardized upright format. Understanding how LiteParse processes rotated text is essential for developers building layout-aware document processing pipelines that require consistent reading order regardless of original text orientation.

Canonical Rotation Snapping

Before coordinate transformation begins, LiteParse normalizes raw rotation values using the canonical_rotation function in crates/liteparse/src/projection.rs (lines 89-110).

This function reduces noise from PDF extraction by converting any rotation angle to the nearest cardinal direction—0°, 90°, 180°, or 270°—when the deviation is within 2 degrees. Angles near 360° wrap back to 0°, ensuring that slight measurement inconsistencies do not create spurious rotation groups. The raw rotation value originates from TextItem.rotation defined in crates/liteparse/src/types.rs (lines 23-25). If the angular distance exceeds the 2-degree threshold, the function preserves the original angle (e.g., 45° remains unchanged).
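The snapping rule can be sketched in a few lines of Python. This is an illustrative reimplementation, not the code in projection.rs: the float return type, the inclusive comparison at exactly 2 degrees, and the normalization of negative angles are assumptions, since the text above specifies only the tolerance and the wrap-around behavior.

```python
CARDINALS = (0, 90, 180, 270)
TOLERANCE_DEG = 2.0  # tolerance stated in the article


def canonical_rotation(angle: float) -> float:
    """Snap an angle to the nearest cardinal direction when it is close enough."""
    a = angle % 360.0  # assumption: normalize into [0, 360) first
    for cardinal in CARDINALS:
        diff = abs(a - cardinal)
        # Circular distance, so 359 degrees is 1 degree away from 0.
        dist = min(diff, 360.0 - diff)
        if dist <= TOLERANCE_DEG:  # assumption: the boundary is inclusive
            return float(cardinal)
    return a  # outside the tolerance: keep the angle as-is


if __name__ == "__main__":
    for raw in (91.0, 269.2, 359.0, 45.0):
        print(raw, "->", canonical_rotation(raw))
```

Measuring the distance on the circle rather than on the number line is what lets angles just below 360° fall into the 0° bucket.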

Transformation Logic for Rotated Text

After canonical snapping, the handle_rotation_reading_order function in crates/liteparse/src/projection.rs (lines 113-210) orchestrates the transformation. This stage groups items by their canonical rotation, clusters vertically separated text, and rewrites coordinates to simulate upright text while setting each item’s rotation field to 0°.

Grouping and Clustering Strategy

The function first checks if any item requires processing (canonical_rotation != 0). It constructs a HashMap<i32, Vec<usize>> called groups_by_rotation to index all items sharing the same canonical angle.
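As a rough Python analogue of that index, the sketch below maps each canonical angle to the positions of the items that share it. The helper names are hypothetical, and leaving upright items out of the map is an assumption; the article does not say whether the 0° bucket is stored.

```python
from collections import defaultdict


def canonical_rotation(angle: float) -> int:
    """Simplified snapping: the nearest cardinal within 2 degrees, else the rounded angle."""
    a = angle % 360.0
    for cardinal in (0, 90, 180, 270):
        diff = abs(a - cardinal)
        if min(diff, 360.0 - diff) <= 2.0:
            return cardinal
    return round(a)


def group_by_rotation(rotations: list[float]) -> dict[int, list[int]]:
    """Map each non-zero canonical angle to the indices of the items that have it."""
    keys = [canonical_rotation(r) for r in rotations]
    if all(k == 0 for k in keys):
        return {}  # early exit: nothing on the page needs processing
    groups: dict[int, list[int]] = defaultdict(list)
    for idx, key in enumerate(keys):
        if key != 0:  # assumption: upright items are not indexed
            groups[key].append(idx)
    return dict(groups)


if __name__ == "__main__":
    print(group_by_rotation([0.0, 90.5, 0.3, 269.0, 89.0, 180.0]))
```

Storing indices rather than copies of the items mirrors the Vec<usize> values in the Rust map and lets later stages rewrite the items in place.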

For vertical rotations (90° and 270°), the algorithm sorts items by their y-coordinate and applies a gap threshold of three times the tallest item's height. When the vertical gap between consecutive items exceeds this threshold, the function splits them into separate clusters. This prevents unrelated labels—such as top-leg and bottom-leg annotations on diagrams—from merging into a single column.
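The gap-based split can be illustrated with a small sketch. The Item type and function name are hypothetical, and measuring the gap from the bottom edge of one item to the top edge of the next is an assumption; the article gives only the sort key and the 3x threshold.

```python
from dataclasses import dataclass


@dataclass
class Item:
    text: str
    y: float       # top edge
    height: float  # vertical extent


def cluster_by_vertical_gap(items: list[Item]) -> list[list[Item]]:
    """Split y-sorted items wherever the gap exceeds 3x the tallest item's height."""
    if not items:
        return []
    ordered = sorted(items, key=lambda it: it.y)
    threshold = 3.0 * max(it.height for it in ordered)
    clusters = [[ordered[0]]]
    for prev, cur in zip(ordered, ordered[1:]):
        # Assumption: the gap runs from the previous bottom edge to the current top edge.
        gap = cur.y - (prev.y + prev.height)
        if gap > threshold:
            clusters.append([cur])   # too far apart: start a new cluster
        else:
            clusters[-1].append(cur)
    return clusters


if __name__ == "__main__":
    labels = [Item("top-1", 10, 10), Item("top-2", 25, 10), Item("bottom", 200, 10)]
    for cluster in cluster_by_vertical_gap(labels):
        print([it.text for it in cluster])
```

With a tallest height of 10 the threshold is 30, so the first two labels stay together and the third, 165 units further down, lands in its own cluster.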

Inline vs. Separate Rendering Decisions

Each cluster undergoes overlap detection against non-rotated content:

  • Inline clusters: When rotated text visually overlaps with non-rotated items, the cluster receives a common y-coordinate calculated as the average vertical midpoint. The width becomes the new height, and the rotation field resets to 0° (lines 184-214).

  • Separate clusters: Non-overlapping clusters receive a vertical offset (delta_y) calculated from the preceding group's bottom and the page height, creating distinct reading order lines.
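The inline-versus-separate decision can be sketched as a bounding-box overlap test followed by a midpoint average. Everything here is illustrative: the Box type, the strict rectangle-intersection test, and averaging over the cluster's own items are assumptions. The delta_y computation for separate clusters is left out because the article does not give its formula.

```python
from dataclasses import dataclass


@dataclass
class Box:
    x: float
    y: float
    width: float
    height: float


def overlaps(a: Box, b: Box) -> bool:
    """Axis-aligned rectangle intersection (assumed overlap criterion)."""
    return (a.x < b.x + b.width and b.x < a.x + a.width
            and a.y < b.y + b.height and b.y < a.y + a.height)


def classify_cluster(cluster: list[Box], upright: list[Box]) -> str:
    """Return "inline" if any rotated box touches upright content, else "separate"."""
    if any(overlaps(r, u) for r in cluster for u in upright):
        return "inline"
    return "separate"


def common_inline_y(cluster: list[Box]) -> float:
    """Average vertical midpoint of the cluster, used as the shared y."""
    mids = [b.y + b.height / 2.0 for b in cluster]
    return sum(mids) / len(mids)


if __name__ == "__main__":
    rotated = [Box(100, 50, 10, 40), Box(100, 100, 10, 20)]
    body = [Box(90, 60, 50, 12)]
    print(classify_cluster(rotated, body), common_inline_y(rotated))
```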

Coordinate Swapping by Rotation Angle

The specific transformation logic varies by canonical rotation angle:

90° Rotation: The function swaps the axes. The new x is the original y rounded to the nearest integer, the new y is the original x plus delta_y, and width and height are exchanged (lines 263-274).

270° Rotation: Similar to 90°, but the new x is computed as max_y - y - height to account for the flipped direction (lines 278-292).

180° Rotation: Items keep their coordinates. The group is sorted left to right to preserve local ordering, and the rotation is set to 0° (lines 311-327).

After all groups are processed, the entire items slice is sorted by the new y-coordinate to guarantee top-to-bottom reading order (lines 329-330). Each transformed item has its rotated flag set to true on the ProjectedTextItem struct, while the original values persist in orig_* fields.
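The per-angle rules and the final sort can be sketched as follows. This is a minimal illustration under several assumptions: the Item type and to_upright name are hypothetical, the 270° case is assumed to use the same new y as the 90° case, max_y is taken as a caller-supplied bound, and 180° items are assumed to get the rotated flag as well. Python's round is used for the rounding step and may differ from Rust's rounding at exact .5 values.

```python
from dataclasses import dataclass, replace


@dataclass
class Item:
    text: str
    x: float
    y: float
    width: float
    height: float
    rotation: int
    rotated: bool = False


def to_upright(item: Item, delta_y: float = 0.0, max_y: float = 0.0) -> Item:
    """Rewrite one item's geometry as if it had been drawn upright."""
    if item.rotation == 90:
        return replace(item, x=float(round(item.y)), y=item.x + delta_y,
                       width=item.height, height=item.width,
                       rotation=0, rotated=True)
    if item.rotation == 270:
        # Assumption: new y matches the 90-degree rule; only x is mirrored.
        return replace(item, x=max_y - item.y - item.height, y=item.x + delta_y,
                       width=item.height, height=item.width,
                       rotation=0, rotated=True)
    if item.rotation == 180:
        # Coordinates stay put; only the rotation is cleared.
        return replace(item, rotation=0, rotated=True)
    return item  # upright or non-cardinal: left untouched in this sketch


if __name__ == "__main__":
    items = [
        Item("side label", 50.0, 120.4, 8.0, 40.0, 90),
        Item("body text", 10.0, 20.0, 200.0, 12.0, 0),
    ]
    upright = sorted((to_upright(it, delta_y=10.0) for it in items),
                     key=lambda it: it.y)  # final top-to-bottom ordering
    for it in upright:
        print(it)
```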

Implementation Architecture

Several source files collaborate to handle rotated text processing:

  • crates/liteparse/src/types.rs: Defines TextItem.rotation and the rotated flag on ProjectedTextItem that records whether transformation occurred.

  • crates/liteparse/src/projection.rs: Contains canonical_rotation and handle_rotation_reading_order, implementing the full normalization, clustering, and coordinate rewriting pipeline.

  • crates/liteparse/src/parser.rs: Orchestrates the parsing pipeline, calling projection::project_pages_to_grid after OCR processing to invoke the rotation logic.

Detecting Rotated Text in Practice

The following examples demonstrate how to identify originally rotated text using LiteParse bindings.

Python Example

Use the liteparse-python package to detect items with original rotation values:

from liteparse import LiteParse

parser = LiteParse()
result = parser.parse("samples/rotated_diagram.pdf")

for page in result.pages:
    print(f"--- Page {page.page_number} ---")
    for itm in page.text_items:
        if itm.rotation != 0:
            print(
                f"Rotated ({itm.rotation}°) → "
                f"x={itm.x:.1f}, y={itm.y:.1f}, "
                f"text='{itm.text}'"
            )

This script outputs the original rotation angle alongside normalized coordinates.

Node.js Example

Filter for rotated items using the liteparse npm package:

import { LiteParse } from "liteparse";

(async () => {
  const lp = new LiteParse();
  const { pages } = await lp.parse("rotated.pdf");

  for (const page of pages) {
    console.log(`Page ${page.pageNumber}`);
    for (const item of page.textItems) {
      if (item.rotation !== 0) {
        console.log(
          `⟳ ${item.rotation}° → (${item.x.toFixed(1)}, ${item.y.toFixed(1)}) "${item.text}"`
        );
      }
    }
  }
})();

The rotation field holds the raw PDF angle, which the pipeline resets to 0° during normalization. To tell whether an item was transformed, check the rotated boolean flag on ProjectedTextItem.

Rust Integration

For direct library usage:

use liteparse::LiteParse;
use liteparse::config::LiteParseConfig;

#[tokio::main]
async fn main() -> Result<(), liteparse::error::LiteParseError> {
    let cfg = LiteParseConfig::default();
    let parser = LiteParse::new(cfg);
    let result = parser.parse_input(
        liteparse::types::PdfInput::Path("rotated.pdf".into())
    ).await?;
    
    for page in result.pages {
        for itm in page.text_items {
            if itm.rotation != 0.0 {
                println!("Rotated {}° at ({}, {}) → \"{}\"",
                         itm.rotation, itm.x, itm.y, itm.text);
            }
        }
    }
    Ok(())
}

Summary

  • Two-stage pipeline: LiteParse first applies canonical_rotation in projection.rs to snap angles to cardinal directions within a 2-degree tolerance, then processes items through handle_rotation_reading_order to normalize coordinates.

  • Preserved visual placement: The transformation logic swaps x and y coordinates, adjusts offsets, and exchanges width and height values so that text appears in its original location despite having a zero rotation value.

  • Clustering protection: A gap threshold of three times the maximum item height prevents unrelated vertical text groups from merging during layout analysis.

  • Debugging support: Original coordinates remain accessible in orig_* fields, while the rotated flag on ProjectedTextItem marks transformed items for downstream processing.

Frequently Asked Questions

How does LiteParse handle slight variations in rotation angles like 91° or 269°?

LiteParse applies a 2-degree tolerance threshold in the canonical_rotation function. Any angle within 2 degrees of a cardinal direction (0°, 90°, 180°, 270°) snaps to that cardinal. Angles outside this tolerance remain unchanged, though they still undergo coordinate transformation if they represent valid rotated text.

What happens to text rotated at 45 degrees?

Text at 45° receives no special case handling in the canonical rotation stage because it falls outside the 2-degree tolerance for cardinal snapping. However, the handle_rotation_reading_order logic still processes these items, treating them as non-cardinal rotations and attempting to preserve their placement relative to other content while normalizing to upright orientation.

How can I identify which text items were originally rotated after parsing?

Each ProjectedTextItem includes a rotated boolean flag (defined in crates/liteparse/src/types.rs) that indicates whether the item underwent transformation. Additionally, original coordinates persist in orig_* fields, allowing you to compare pre-transformation and post-transformation values for debugging or specialized layout requirements.

Does LiteParse support mixed-direction text on the same page?

Yes. The handle_rotation_reading_order function specifically handles inline rotated text that overlaps with non-rotated content by assigning a common y-coordinate and adjusting dimensions. For spatially separated groups, it calculates vertical offsets (delta_y) to maintain distinct reading order lines while integrating all text into a single top-to-bottom flow.
