# How LiteParse Handles Rotated Text in PDFs (90°, 180°, 270°): A Technical Deep Dive

> Learn how LiteParse normalizes rotated text in PDFs by transforming coordinates so that rotated boxes become axis-aligned. Discover its technical approach to handling 90°, 180°, and 270° rotations for accurate data extraction.

- Repository: [run-llama/liteparse](https://github.com/run-llama/liteparse)
- Tags: deep-dive
- Published: 2026-05-30

---

**LiteParse normalizes rotated text during the grid-projection phase by snapping angles to cardinal directions, transforming coordinates so rotated boxes become axis-aligned, and marking items as `rotated` so later pipeline stages treat them as floating objects rather than column anchors.**

Extracting clean, reading-order-correct text from PDFs is challenging when documents contain rotated elements like vertical axis labels, sideways page numbers, or upside-down headers. The `run-llama/liteparse` library solves this automatically during its projection pipeline, ensuring that text appears in logical order regardless of how glyphs are stored in the source file.

## The Rotation Normalization Pipeline

LiteParse processes rotation in the `handle_rotation_reading_order` function within [`crates/liteparse/src/projection.rs`](https://github.com/run-llama/liteparse/blob/main/crates/liteparse/src/projection.rs). This stage runs after text extraction but before final layout analysis, ensuring all coordinate systems are compatible with the grid-based reading-order algorithm.

### Canonical Angle Detection

The first step maps raw rotation angles to cardinal directions. The `canonical_rotation` function (lines 89–111 in [`projection.rs`](https://github.com/run-llama/liteparse/blob/main/crates/liteparse/src/projection.rs)) computes the circular distance to the nearest multiple of 90°. If an angle falls within a **±2° tolerance**, it snaps to 0°, 90°, 180°, or 270°. Angles outside this tolerance keep their original value and do not receive the cardinal-angle transformations described later in this article.

Before any of this work happens, `handle_rotation_reading_order` returns early when no item is rotated:

```rust
// Early exit optimization (projection.rs#L13-L19)
if !items.iter().any(|i| i.rotation != 0) {
    return; // No rotation handling needed
}
```
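The snapping rule described in this section can be sketched as follows. This is a minimal illustration, not the library's code: the `f64` signature, the constant name `SNAP_TOLERANCE_DEG`, and the choice to return unsnapped angles normalized into `[0, 360)` are all assumptions made for the example.

```rust
/// Tolerance in degrees for snapping to a cardinal direction (value taken from the article).
const SNAP_TOLERANCE_DEG: f64 = 2.0;

/// Hypothetical sketch: snap `angle_deg` to 0, 90, 180, or 270 when it lies
/// within the tolerance; otherwise return the angle normalized into [0, 360).
fn canonical_rotation(angle_deg: f64) -> f64 {
    // Normalize into [0, 360) so negative angles are handled as well.
    let normalized = angle_deg.rem_euclid(360.0);
    // Nearest multiple of 90; this can be 360, which wraps back to 0 below.
    let nearest = (normalized / 90.0).round() * 90.0;
    // Because `nearest` may be 360, a plain difference already measures
    // the circular distance for angles just below a full turn.
    if (normalized - nearest).abs() <= SNAP_TOLERANCE_DEG {
        nearest.rem_euclid(360.0)
    } else {
        normalized
    }
}
```

Under these assumptions, an input of `91.5` snaps to `90.0`, `359.0` snaps to `0.0`, and `45.0` is returned unchanged.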

### Grouping and Spatial Clustering

Once canonical angles are determined, the algorithm creates a `HashMap<i32, Vec<usize>>` keyed by rotation value (lines 22–27). For items rotated at **90° or 270°**, the system applies additional spatial clustering to prevent merging unrelated labels that happen to share the same orientation.
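A grouping step of this shape could look like the sketch below. The `Item` struct is a hypothetical stand-in with only the field the example needs, and leaving unrotated items out of the map is an assumption of this sketch rather than something the source states.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a text item; only the field used here.
struct Item {
    /// Canonical rotation in degrees (0, 90, 180, 270, or an unsnapped value).
    rotation: i32,
}

/// Group item indices by rotation value.
/// Assumption: items with rotation 0 are not added to the map.
fn group_by_rotation(items: &[Item]) -> HashMap<i32, Vec<usize>> {
    let mut groups: HashMap<i32, Vec<usize>> = HashMap::new();
    for (idx, item) in items.iter().enumerate() {
        if item.rotation != 0 {
            // Indices are pushed in input order, so each group stays ordered.
            groups.entry(item.rotation).or_default().push(idx);
        }
    }
    groups
}
```

Storing indices rather than items lets later stages mutate the original slice in place without cloning.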

The clustering logic (lines 30–48) sorts items by their Y-coordinate and splits them into separate groups whenever the vertical gap exceeds **3× the maximum item height** in that group. This ensures that a vertical label at the top of a diagram remains distinct from one at the bottom.
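The gap-based split can be illustrated with a small sketch. Several details here are assumptions, since the article does not spell them out: the `BBox` type is hypothetical, `y` is treated as the top edge with values growing downward, the gap is measured from the bottom of one box to the top of the next, and the maximum height is taken over the whole rotation group.

```rust
/// Hypothetical bounding box; `y` is the top edge and grows downward.
#[derive(Debug, Clone, PartialEq)]
struct BBox {
    y: f64,
    height: f64,
}

/// Split one rotation group into clusters separated by large vertical gaps.
/// Returns clusters of indices into `boxes`, ordered top to bottom.
fn cluster_by_vertical_gap(boxes: &[BBox]) -> Vec<Vec<usize>> {
    if boxes.is_empty() {
        return Vec::new();
    }
    // Sort indices by Y so neighbours in `order` are vertical neighbours.
    let mut order: Vec<usize> = (0..boxes.len()).collect();
    order.sort_by(|&a, &b| boxes[a].y.total_cmp(&boxes[b].y));

    // Threshold from the article: 3x the maximum item height in the group.
    let max_height = boxes.iter().map(|b| b.height).fold(0.0_f64, f64::max);
    let threshold = 3.0 * max_height;

    let mut clusters = vec![vec![order[0]]];
    for pair in order.windows(2) {
        let (prev, next) = (&boxes[pair[0]], &boxes[pair[1]]);
        // Assumed gap definition: top of the next box minus bottom of the previous one.
        let gap = next.y - (prev.y + prev.height);
        if gap > threshold {
            clusters.push(Vec::new());
        }
        // `clusters` always holds at least one cluster, so this cannot fail.
        clusters.last_mut().unwrap().push(pair[1]);
    }
    clusters
}
```

With three boxes of height 10 at `y = 10`, `y = 25`, and `y = 200`, the threshold is 30, so the first two stay together and the third starts its own cluster.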

## Coordinate Transformation Strategies

After grouping, LiteParse applies geometric transformations to make rotated text axis-aligned while preserving visual relationships.

### Handling 90° and 270° Rotations

For **90° rotations**, the library swaps each item’s width and height, then repositions coordinates so the text reads left-to-right. The new X-coordinate becomes the old Y-coordinate (rounded), while the new Y-coordinate derives from the old X plus a running offset (`delta_y`) to maintain relative spacing.

For **270° rotations**, the transformation is similar but calculates the new X-coordinate from the group's bottom edge to preserve natural left-to-right orientation (lines 77–95).

```rust
// Simplified view of the 90° transformation block (projection.rs#L63-L75)
if group_rotation == 90 {
    // Swap dimensions and reposition
    new_x = old_y.round();
    new_y = old_x + delta_y;
    item.rotated = true; // Mark for downstream processing
}
```
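For readers who want something they can compile, the 90° mapping can be written as a self-contained sketch. The `Item` struct and the `normalize_90` name are hypothetical, and `delta_y` is passed in by the caller because the article does not describe how the running offset is computed.

```rust
/// Hypothetical item with a top-left-origin bounding box.
#[derive(Debug, Clone, PartialEq)]
struct Item {
    x: f64,
    y: f64,
    width: f64,
    height: f64,
    rotated: bool,
}

/// Apply the 90° mapping described above to a single item.
/// `delta_y` is the running offset, supplied by the caller in this sketch.
fn normalize_90(item: &mut Item, delta_y: f64) {
    let (old_x, old_y) = (item.x, item.y);
    // A box lying on its side becomes a horizontal box: swap the dimensions.
    std::mem::swap(&mut item.width, &mut item.height);
    // New X is the old Y (rounded); new Y is the old X plus the offset.
    item.x = old_y.round();
    item.y = old_x + delta_y;
    // Mark the item so later stages can treat it as a floating object.
    item.rotated = true;
}
```

For example, an item at `(40.0, 120.6)` with size `8 × 60` and `delta_y = 5.0` ends up at `(121.0, 45.0)` with size `60 × 8`.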

### Simplifying 180° Rotations

**180° rotations** require no geometric transformation because a flipped line maintains the same relative positions. The algorithm simply reorders items by their X-coordinate (ascending) and clears the rotation flag (lines 111–124). This corrects the reading direction without altering bounding boxes.
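A minimal sketch of this step is shown below. The `Item` struct and `normalize_180` name are hypothetical, and "clearing the rotation flag" is interpreted here as resetting the rotation value to 0, which is an assumption.

```rust
/// Hypothetical item; only the fields this step touches.
#[derive(Debug, Clone, PartialEq)]
struct Item {
    text: String,
    x: f64,
    rotation: i32,
}

/// Reorder a 180° group left to right and reset its rotation.
/// Bounding boxes are left untouched.
fn normalize_180(group: &mut [Item]) {
    // Ascending X restores left-to-right reading order.
    group.sort_by(|a, b| a.x.total_cmp(&b.x));
    for item in group.iter_mut() {
        // Assumed meaning of "clears the rotation flag".
        item.rotation = 0;
    }
}
```

`f64::total_cmp` gives a total order over floats, so the sort never panics on unusual coordinate values such as NaN.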

### Inline Overlap Detection

When rotated items visually overlap non-rotated text (such as vertical tick marks inline with paragraph text), LiteParse **flattens** the rotated group onto a common baseline calculated from the average vertical center (lines 84–128). It marks these items with `item.rotated = true` to signal that they should be treated as floating objects, excluded from column-anchor calculations used in standard text flow detection.
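The flattening part of this step can be sketched as follows; the overlap test itself is not covered because the article does not specify it. The `Item` struct and `flatten_to_baseline` name are hypothetical, and the "vertical center" of an item is assumed to be `y + height / 2`.

```rust
/// Hypothetical item; `y` is the top edge of the bounding box.
#[derive(Debug, Clone, PartialEq)]
struct Item {
    y: f64,
    height: f64,
    rotated: bool,
}

/// Move every item in the group onto a shared baseline derived from the
/// average vertical center, and mark each one as rotated.
fn flatten_to_baseline(group: &mut [Item]) {
    if group.is_empty() {
        return;
    }
    // Average of the vertical centers across the group.
    let mean_center = group
        .iter()
        .map(|i| i.y + i.height / 2.0)
        .sum::<f64>()
        / group.len() as f64;
    for item in group.iter_mut() {
        // Re-center each box on the shared line, keeping its height.
        item.y = mean_center - item.height / 2.0;
        item.rotated = true;
    }
}
```

Two items of height 10 at `y = 10` and `y = 20` have centers 15 and 25, so both land at `y = 15` after flattening.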

## Integration with the Parsing Pipeline

Rotation handling runs as part of the main extraction workflow. The `handle_rotation_reading_order` function is invoked from `project_to_grid`, which is part of the public `project_pages_to_grid` pipeline called by `LiteParse::parse_input` in [`crates/liteparse/src/parser.rs`](https://github.com/run-llama/liteparse/blob/main/crates/liteparse/src/parser.rs) (around lines 65–70).

The data structures in [`crates/liteparse/src/types.rs`](https://github.com/run-llama/liteparse/blob/main/crates/liteparse/src/types.rs) support this workflow through the `TextItem` struct (holding the raw `rotation` field) and the `ProjectedTextItem` struct (adding the boolean `rotated` flag). After all transformations complete, the system performs a final sort by the new Y-coordinate to ensure proper top-to-bottom reading order (lines 129–130):

```rust
items.sort_by(|a, b| a.item.y.total_cmp(&b.item.y));
```

## Code Examples

### Rust (Core Library)

```rust
use liteparse::LiteParse;
use liteparse::config::LiteParseConfig;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let cfg = LiteParseConfig::default();
    let parser = LiteParse::new(cfg);
    
    // Parses document with mixed rotations automatically
    let result = parser.parse_input(
        liteparse::PdfInput::Path("diagram_with_labels.pdf".into())
    ).await?;
    
    println!("{}", result.text); // Reading-order correct
    Ok(())
}
```

### Node.js

```typescript
import { LiteParse } from "liteparse-node";

const lp = new LiteParse();
const result = await lp.parse("rotated.pdf");
console.log(result.text); // Vertical labels appear inline
```

### Python

```python
from liteparse import LiteParse

lp = LiteParse()
result = lp.parse("rotated.pdf")
print(result.text)  # 180° headers normalized
```

All language bindings delegate to the same Rust core in [`projection.rs`](https://github.com/run-llama/liteparse/blob/main/crates/liteparse/src/projection.rs), ensuring consistent rotation handling across environments.

## Summary

- **Canonical snapping**: Angles within ±2° of 0°, 90°, 180°, or 270° snap to that cardinal direction via `canonical_rotation` in [`projection.rs`](https://github.com/run-llama/liteparse/blob/main/crates/liteparse/src/projection.rs).
- **Spatial clustering**: 90°/270° items split into separate groups when vertical gaps exceed 3× the maximum item height.
- **Geometric transformation**: 90° and 270° rotations swap coordinates and reposition boxes to read left-to-right; 180° rotations simply reorder by X-coordinate.
- **Floating object marking**: The `rotated` boolean flag excludes transformed items from column-anchor calculations in downstream layout analysis.
- **Automatic execution**: Rotation handling runs automatically in `project_pages_to_grid` during every parse operation, requiring no configuration flags.

## Frequently Asked Questions

### How does LiteParse detect the rotation angle of text in a PDF?

LiteParse reads the rotation metadata embedded in each PDF text object. During the projection phase, the `canonical_rotation` function in [`crates/liteparse/src/projection.rs`](https://github.com/run-llama/liteparse/blob/main/crates/liteparse/src/projection.rs) (lines 89–111) normalizes these angles by calculating the circular distance to the nearest multiple of 90°. If the raw angle falls within a ±2° tolerance of a cardinal direction, it snaps to exactly 0°, 90°, 180°, or 270°.

### What happens to text rotated at 45 degrees or other non-cardinal angles?

Text rotated at angles such as 45°, which fall outside the ±2° tolerance for cardinal directions, retains its original angle value. While the system groups these items separately, they do not receive the specialized coordinate transformations applied to 90°, 180°, or 270° rotations. They remain in their original geometric positions without axis-alignment normalization.

### Does rotation handling significantly impact parsing performance?

Rotation handling adds minimal overhead due to early-exit optimizations. The `handle_rotation_reading_order` function includes a guard clause (lines 13–19) that returns immediately if no items require rotation. Additionally, the algorithm uses efficient HashMap grouping and operates only on bounding box coordinates rather than pixel data, keeping the transformation cost negligible compared to text extraction itself.

### How does the `rotated` flag impact downstream text processing?

When an item is marked with `rotated = true` (set during transformation blocks in [`projection.rs`](https://github.com/run-llama/liteparse/blob/main/crates/liteparse/src/projection.rs)), subsequent pipeline stages in [`parser.rs`](https://github.com/run-llama/liteparse/blob/main/crates/liteparse/src/parser.rs) treat it as a floating object rather than a column anchor. This prevents vertical labels from disrupting the detection of text columns and ensures that flowing-text algorithms skip these items when calculating paragraph boundaries and reading order.