# How to Handle Vertical Text and Complex Layouts in Tesseract OCR

> Master Tesseract OCR for vertical text and complex layouts. Learn how Tesseract internally rotates text, detects orientation, and segments tables and columns using advanced algorithms.

- Repository: [tesseract-ocr/tesseract](https://github.com/tesseract-ocr/tesseract)
- Tags: how-to-guide
- Published: 2026-03-02

---

**Tesseract OCR handles vertical text and complex layouts by internally rotating vertical blocks 90° counter-clockwise during layout analysis, detecting text orientation via gradient projections in [`textlineprojection.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/textord/textlineprojection.cpp), and using vector-based algorithms in [`tabfind.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/textord/tabfind.cpp) to segment tables and columns.**

Tesseract OCR, the open-source optical character recognition engine maintained in the `tesseract-ocr/tesseract` repository, supports scripts written vertically (such as Japanese, Chinese, and Mongolian) and performs page analysis for complex document structures. Using these capabilities well requires knowledge of the **PageSegMode** enum, the internal rotation mechanisms, and the layout analysis pipeline implemented across the `src/textord/` module.

## Detecting Vertical Text Orientation

Tesseract identifies vertical text blocks through a combination of gradient analysis and explicit block typing. In [`include/tesseract/publictypes.h`](https://github.com/tesseract-ocr/tesseract/blob/main/include/tesseract/publictypes.h), the `PolyBlockType` enum defines `PT_VERTICAL_TEXT` to mark vertically oriented text blocks. The layout engine evaluates orientation using horizontal and vertical gradient projections in [`src/textord/textlineprojection.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/textord/textlineprojection.cpp) and [`src/textord/textlineprojection.h`](https://github.com/tesseract-ocr/tesseract/blob/main/src/textord/textlineprojection.h).

When processing a page, the projection code computes both horizontal and vertical gradients. A negative gradient score indicates a vertical line, so the same algorithmic path handles rotated text without separate logic branches. This detection runs automatically when the `textord_tabfind_vertical_text` variable is enabled (default: true), though you can force vertical handling explicitly for specific use cases.
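To make the idea of a signed orientation score concrete, here is a minimal sketch that compares row and column ink profiles of a tiny binary image. It is a simplified stand-in, not Tesseract's gradient computation: the names `OrientationScore` and `LooksVertical`, the `'#'` ink convention, and the variance-based score are all assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Population variance of a profile.
static double Variance(const std::vector<double>& v) {
  if (v.empty()) return 0.0;
  double mean = 0.0;
  for (double x : v) mean += x;
  mean /= static_cast<double>(v.size());
  double var = 0.0;
  for (double x : v) var += (x - mean) * (x - mean);
  return var / static_cast<double>(v.size());
}

// Hypothetical score: variance of the row ink profile minus variance of the
// column ink profile. '#' marks an ink pixel; all rows must be equally long.
// Horizontal text lines make the row profile vary strongly (positive score);
// vertical text lines make the column profile vary strongly (negative score).
double OrientationScore(const std::vector<std::string>& image) {
  if (image.empty() || image[0].empty()) return 0.0;
  const std::size_t h = image.size();
  const std::size_t w = image[0].size();
  std::vector<double> rows(h, 0.0), cols(w, 0.0);
  for (std::size_t y = 0; y < h; ++y) {
    assert(image[y].size() == w);  // rectangular input is assumed
    for (std::size_t x = 0; x < w; ++x) {
      if (image[y][x] == '#') {
        rows[y] += 1.0 / static_cast<double>(w);  // ink fraction per row
        cols[x] += 1.0 / static_cast<double>(h);  // ink fraction per column
      }
    }
  }
  return Variance(rows) - Variance(cols);
}

// In this sketch, a negative score is read as "vertical text".
bool LooksVertical(const std::vector<std::string>& image) {
  return OrientationScore(image) < 0.0;
}
```

Calling `LooksVertical` on an image whose ink forms stacked horizontal stripes returns `false`, while the transposed image (side-by-side vertical stripes) returns `true`. Real pages need far more robust evidence than a global variance, which is why this is only a sketch.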

## Forcing Vertical Block Processing with PageSegMode

For documents containing purely vertical text, Tesseract exposes `PSM_SINGLE_BLOCK_VERT_TEXT`, which you select through the `TessBaseAPI::SetPageSegMode` method declared in [`include/tesseract/baseapi.h`](https://github.com/tesseract-ocr/tesseract/blob/main/include/tesseract/baseapi.h). When this mode is active, the engine performs the rotational transformations implemented in [`src/textord/textord.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/textord/textord.cpp) (lines 200–217).

The process works as follows:

1. Each `TO_BLOCK` is wrapped in a `POLY_BLOCK` of type `PT_VERTICAL_TEXT`
2. The block is rotated 90° counter-clockwise using `rotate(anticlockwise90)`
3. Standard layout analysis (`make_rows`, `BaselineDetect`, `make_words`) executes on the rotated image
4. Re-rotation fields (`set_re_rotation`, `set_classify_rotation`) restore original geometry for classification

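```cpp
#include <cstdio>

#include <leptonica/allheaders.h>
#include <tesseract/baseapi.h>

int main() {
  tesseract::TessBaseAPI api;
  // Init returns non-zero on failure (for example, missing "jpn" traineddata).
  if (api.Init(nullptr, "jpn") != 0) return 1;

  // Force single vertical block processing
  api.SetPageSegMode(tesseract::PSM_SINGLE_BLOCK_VERT_TEXT);

  Pix *pix = pixRead("vertical_page.png");
  if (pix == nullptr) {  // image missing or unreadable
    api.End();
    return 1;
  }
  api.SetImage(pix);

  // GetUTF8Text returns a heap-allocated string; the caller must delete[] it.
  char *out = api.GetUTF8Text();
  if (out != nullptr) {
    printf("%s\n", out);
    delete[] out;
  }

  api.End();
  pixDestroy(&pix);
  return 0;
}
```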
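The rotate-then-restore idea in steps 2 and 4 can be pictured with plain integer geometry. The sketch below is an illustration only: `Point`, `Box`, `RotateCcw90`, `RotateCw90`, and `RotateBox` are hypothetical names, and Tesseract's own coordinate and block types are not reproduced here.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical minimal geometry types (y grows upward).
struct Point { int x; int y; };
struct Box { int left; int bottom; int right; int top; };

// 90° counter-clockwise rotation about the origin: (x, y) -> (-y, x).
Point RotateCcw90(Point p) { return {-p.y, p.x}; }

// Inverse rotation, 90° clockwise: (x, y) -> (y, -x).
Point RotateCw90(Point p) { return {p.y, -p.x}; }

// Rotates a box by rotating two opposite corners and re-normalizing so that
// left <= right and bottom <= top still hold afterwards.
Box RotateBox(const Box& b, Point (*rot)(Point)) {
  assert(b.left <= b.right && b.bottom <= b.top);
  const Point a = rot({b.left, b.bottom});
  const Point c = rot({b.right, b.top});
  return {std::min(a.x, c.x), std::min(a.y, c.y),
          std::max(a.x, c.x), std::max(a.y, c.y)};
}
```

A tall, narrow box such as `{10, 0, 20, 100}` becomes wide and short after `RotateBox(box, RotateCcw90)`, so line-finding code written for horizontal text can run on it unchanged. Applying `RotateBox(rotated, RotateCw90)` afterwards recovers the original coordinates.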

## Analyzing Complex Layouts and Tables

Tesseract handles complex page structures—such as tables, multi-column documents, and mixed-orientation pages—through the `TabFind` class in [`src/textord/tabfind.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/textord/tabfind.cpp) and `TabVector` management in [`src/textord/tabvector.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/textord/tabvector.cpp). The algorithm discovers near-vertical tab-stop vectors, merges similar vectors using `TabVector::MergeSimilarTabVectors`, and uses these to split pages into rows and columns.

The system adapts to both horizontal and vertical tables because vector orientation derives from the actual data rather than from preconceived layout assumptions. The `textord_tabvector_vertical_gap_fraction` parameter bounds the gaps tolerated when tab vectors are built for vertical text, expressed as a fraction of mean blob width.

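```cpp
// Sparse text mode: find as much text as possible without assuming a
// reading order (the enum value is PSM_SPARSE_TEXT).
api.SetPageSegMode(tesseract::PSM_SPARSE_TEXT);
// Gap tolerance used in this guide; tune it per document.
api.SetVariable("textord_tabvector_vertical_gap_fraction", "0.5");
```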
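As a rough illustration of merging similar tab-stop candidates into column boundaries, the sketch below clusters the left-edge x positions of text lines in one dimension. It is a toy model under stated assumptions, not the `TabVector` implementation: `TabStop`, `MergeSimilarStops`, `tolerance`, and `min_support` are hypothetical names, and real tab vectors are 2-D line segments rather than single coordinates.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical merged tab stop: mean x position and number of supporting edges.
struct TabStop { double x; int support; };

// Groups sorted edge positions whose distance from the first edge of the
// current group is at most `tolerance`, and keeps groups that have at least
// `min_support` members.
std::vector<TabStop> MergeSimilarStops(std::vector<int> edges, int tolerance,
                                       int min_support) {
  assert(tolerance >= 0 && min_support >= 1);
  std::sort(edges.begin(), edges.end());
  std::vector<TabStop> stops;
  std::size_t i = 0;
  while (i < edges.size()) {
    std::size_t j = i;
    long sum = 0;
    // Extend the group while the next edge stays close to the group's start.
    while (j < edges.size() && edges[j] - edges[i] <= tolerance) {
      sum += edges[j];
      ++j;
    }
    const int count = static_cast<int>(j - i);
    if (count >= min_support) {
      stops.push_back({static_cast<double>(sum) / count, count});
    }
    i = j;  // continue with the first edge outside this group
  }
  return stops;
}
```

With edges `{50, 52, 51, 300, 302, 299, 175}`, a tolerance of 5, and a minimum support of 2, the sketch yields two stops near x = 51 and x = 300 and drops the isolated edge at 175. That mirrors the intuition that a column boundary needs several aligned lines to be credible.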

For mixed-orientation documents (e.g., horizontal body text with vertical captions), the pipeline in [`src/textord/textord.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/textord/textord.cpp) extracts connected components via `find_components` and `filter_blobs`, then groups them into `TO_BLOCK` structures. Each block maintains its own rotation flag, allowing simultaneous processing of differently-oriented regions without manual intervention.
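The component-extraction step can be pictured with a generic flood fill over a binary image. This is a textbook connected-components pass offered as a sketch, not the code behind `find_components`: `BlobBox` and `FindBlobs` are hypothetical names, `'#'` is an assumed ink marker, and 4-connectivity is an arbitrary choice.

```cpp
#include <algorithm>
#include <cassert>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical bounding box in inclusive pixel coordinates (y grows downward).
struct BlobBox { int left; int top; int right; int bottom; };

// Returns one bounding box per 4-connected group of '#' pixels, in row-major
// order of each group's first pixel. All rows must have the same length.
std::vector<BlobBox> FindBlobs(const std::vector<std::string>& image) {
  std::vector<BlobBox> blobs;
  if (image.empty()) return blobs;
  const int h = static_cast<int>(image.size());
  const int w = static_cast<int>(image[0].size());
  std::vector<std::vector<bool>> seen(h, std::vector<bool>(w, false));
  const int dx[] = {1, -1, 0, 0};
  const int dy[] = {0, 0, 1, -1};
  for (int y = 0; y < h; ++y) {
    assert(static_cast<int>(image[y].size()) == w);  // rectangular input
    for (int x = 0; x < w; ++x) {
      if (image[y][x] != '#' || seen[y][x]) continue;
      BlobBox box{x, y, x, y};
      std::queue<std::pair<int, int>> pending;
      pending.push({x, y});
      seen[y][x] = true;
      while (!pending.empty()) {
        const std::pair<int, int> p = pending.front();
        pending.pop();
        // Grow the bounding box to include the current pixel.
        box.left = std::min(box.left, p.first);
        box.right = std::max(box.right, p.first);
        box.top = std::min(box.top, p.second);
        box.bottom = std::max(box.bottom, p.second);
        for (int k = 0; k < 4; ++k) {
          const int nx = p.first + dx[k];
          const int ny = p.second + dy[k];
          if (nx < 0 || ny < 0 || nx >= w || ny >= h) continue;
          if (image[ny][nx] != '#' || seen[ny][nx]) continue;
          seen[ny][nx] = true;
          pending.push({nx, ny});
        }
      }
      blobs.push_back(box);
    }
  }
  return blobs;
}
```

Each returned box could then be assigned to a block and tagged with that block's orientation. Grouping boxes into blocks is a separate step that this sketch does not attempt.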

## Handling Vertical Underlines and Baselines

Vertical scripts require specialized underline detection implemented in [`src/textord/underlin.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/textord/underlin.cpp). The `vertical_cunderline_projection` function projects underline outlines vertically to establish baselines for vertical writing systems, ensuring proper character alignment during the textline formation phase.

## Generating Synthetic Training Data

When training custom models for vertical scripts, [`src/training/text2image.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/training/text2image.cpp) supports vertical text rendering through the `render.set_vertical_text(true)` method. Running `text2image` with `--writing_mode vertical` produces training images rotated 90° with corresponding `.box` files containing correctly transformed coordinates.

The [`src/training/pango/boxchar.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/training/pango/boxchar.cpp) module includes `MostlyVertical` logic to analyze line orientation during ground-truth generation, inserting appropriate line breaks and spaces for vertical text flow.

```cpp
// Simplified from the writing-mode handling in text2image.cpp
bool vertical = (FLAGS_writing_mode == "vertical");
render.set_vertical_text(vertical);
```
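The "mostly vertical" check mentioned above can be approximated by comparing how far consecutive character boxes move in x versus y. The following is a simplified sketch in that spirit, not a copy of the logic in `boxchar.cpp`: `BoxPos` and `MostlyVerticalSketch` are hypothetical names, and the real function's exact criteria are not reproduced.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference corner of one character box.
struct BoxPos { int x; int y; };

// Returns true when consecutive boxes move more in y than in x overall.
// Squared steps are summed so that direction (up or down) does not matter.
bool MostlyVerticalSketch(const std::vector<BoxPos>& boxes) {
  std::int64_t total_dx = 0;
  std::int64_t total_dy = 0;
  for (std::size_t i = 1; i < boxes.size(); ++i) {
    const std::int64_t dx = boxes[i].x - boxes[i - 1].x;
    const std::int64_t dy = boxes[i].y - boxes[i - 1].y;
    total_dx += dx * dx;
    total_dy += dy * dy;
  }
  return total_dy > total_dx;
}
```

A column of boxes that keeps roughly the same x while y advances is classified as vertical; a row of boxes that advances in x is not. Fewer than two boxes gives `false`, since there is no step to measure.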

## Key Configuration Variables

Tesseract exposes several variables to tune vertical text and layout handling:

- **`textord_tabfind_vertical_text`**: Boolean (default true) enabling automatic vertical text detection during layout analysis
- **`textord_tabvector_vertical_gap_fraction`**: Float (0.5 in the example above) bounding the gaps tolerated in vertical text, as a fraction of mean blob width
- **PageSegMode options**: `PSM_AUTO` for automatic detection, `PSM_SINGLE_BLOCK_VERT_TEXT` for forced vertical processing

## Summary

- **Tesseract detects vertical text** using gradient projections in [`textlineprojection.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/textord/textlineprojection.cpp) and marks blocks with `PT_VERTICAL_TEXT`, defined in [`publictypes.h`](https://github.com/tesseract-ocr/tesseract/blob/main/include/tesseract/publictypes.h)
- **Force vertical processing** by setting `PSM_SINGLE_BLOCK_VERT_TEXT`, which triggers 90° rotation logic in [`textord.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/textord/textord.cpp) lines 200–217
- **Complex layouts** are parsed using `TabFind` and `TabVector` classes that detect column and table structures via vertical gap analysis
- **Training data** for vertical scripts is generated using [`text2image.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/training/text2image.cpp) with `--writing_mode vertical`, which calls `set_vertical_text(true)` on the renderer
- **Mixed-orientation pages** are supported through per-block rotation flags set during the `TO_BLOCK` creation phase

## Frequently Asked Questions

### How do I enable automatic vertical text detection in Tesseract OCR?

Automatic vertical text detection is enabled by default via the `textord_tabfind_vertical_text` variable. When using `PSM_AUTO`, the engine evaluates vertical gradients in [`src/textord/textlineprojection.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/textord/textlineprojection.cpp) and automatically rotates blocks that meet the vertical threshold. No additional API calls are required unless you need to force specific behavior.

### What is the difference between `PSM_AUTO` and `PSM_SINGLE_BLOCK_VERT_TEXT`?

`PSM_AUTO` analyzes the entire page and detects orientation per-block using gradient analysis, while `PSM_SINGLE_BLOCK_VERT_TEXT` treats the entire input as a single vertical block, forcing a 90° counter-clockwise rotation in [`src/textord/textord.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/textord/textord.cpp) before processing. Use the latter for pure vertical documents like traditional Japanese manuscripts.

### How does Tesseract handle tables with both horizontal and vertical text?

Tesseract uses the `TabFind` class in [`src/textord/tabfind.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/textord/tabfind.cpp) to detect near-vertical tab vectors that define column boundaries. Each `TO_BLOCK` maintains independent rotation state, allowing the engine to process horizontal and vertical regions within the same table. Adjust `textord_tabvector_vertical_gap_fraction` to tune detection sensitivity for complex grid layouts.

### Can I train Tesseract on custom vertical fonts?

Yes. Use [`src/training/text2image.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/training/text2image.cpp) with the `--writing_mode vertical` flag, which sets `render.set_vertical_text(true)`. This generates rotated training images and corresponding box files via [`src/training/pango/boxchar.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/training/pango/boxchar.cpp), which detects "mostly vertical" lines to ensure proper coordinate mapping during ground-truth generation.