# PolyBlock Types in Tesseract: A Complete Guide to Layout Analysis

> Understand Tesseract PolyBlock Types for effective OCR layout analysis. Classify page regions like text, images, and tables to optimize processing.

- Repository: [tesseract-ocr/tesseract](https://github.com/tesseract-ocr/tesseract)
- Tags: deep-dive
- Published: 2026-03-02

---

**PolyBlock Types are an enumeration that classifies detected page regions—such as text columns, headings, images, tables, and equations—allowing Tesseract to perform layout analysis and apply region-specific processing during OCR.**

The tesseract-ocr/tesseract engine represents each contiguous region of a document as a **POLY_BLOCK** with a specific **PolyBlockType**. This type system, defined in [`include/tesseract/publictypes.h`](https://github.com/tesseract-ocr/tesseract/blob/main/include/tesseract/publictypes.h), drives the layout analysis pipeline by categorizing content so that text extraction, image isolation, or table detection can be handled appropriately.

## Understanding the PolyBlockType Enumeration

At the core of Tesseract’s layout analysis is the `PolyBlockType` enum declared in [`include/tesseract/publictypes.h`](https://github.com/tesseract-ocr/tesseract/blob/main/include/tesseract/publictypes.h). This enumeration assigns a logical kind to every block detected during page segmentation.

```cpp
// include/tesseract/publictypes.h
enum PolyBlockType {
  PT_UNKNOWN,          // not yet classified
  PT_FLOWING_TEXT,     // normal column-wise text
  PT_HEADING_TEXT,     // text spanning multiple columns
  PT_PULLOUT_TEXT,     // cross-column pull-out text
  PT_EQUATION,         // block belonging to an equation region
  PT_INLINE_EQUATION,  // inline equation inside text
  PT_TABLE,            // table region
  PT_VERTICAL_TEXT,    // vertically-oriented text lines
  PT_CAPTION_TEXT,     // text belonging to an image
  PT_FLOWING_IMAGE,    // image inside a column
  PT_HEADING_IMAGE,    // image spanning columns
  PT_PULLOUT_IMAGE,    // pull-out image
  PT_HORZ_LINE,        // horizontal line
  PT_VERT_LINE,        // vertical line
  PT_NOISE,            // stray marks outside any column
  PT_COUNT
};
```

Rather than comparing raw integers, use the inline predicates also defined in [`include/tesseract/publictypes.h`](https://github.com/tesseract-ocr/tesseract/blob/main/include/tesseract/publictypes.h) to test block categories:

- **`PTIsTextType()`** – Returns true for flowing text, headings, pull-out text, tables, vertical text, captions, and inline equations. `PT_EQUATION` itself is not included.
- **`PTIsImageType()`** – Identifies flowing, heading, and pull-out images.
- **`PTIsLineType()`** – Matches horizontal and vertical lines.
- **`PTIsPulloutType()`** – Matches cross-column pull-out text and pull-out images.
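
To make the membership of each predicate concrete, the sketch below mirrors the enum and the four predicates locally so that it compiles without Tesseract installed. The `sketch` namespace and `PrintPredicateTable()` are assumptions made for illustration; the predicate bodies follow `publictypes.h` as of Tesseract 5.x, so verify them against your installed header. In real code, include `<tesseract/publictypes.h>` and call `tesseract::PTIsTextType()` and friends directly.

```cpp
#include <cstdio>

namespace sketch {  // local mirror for illustration, not the real header

enum PolyBlockType {
  PT_UNKNOWN, PT_FLOWING_TEXT, PT_HEADING_TEXT, PT_PULLOUT_TEXT,
  PT_EQUATION, PT_INLINE_EQUATION, PT_TABLE, PT_VERTICAL_TEXT,
  PT_CAPTION_TEXT, PT_FLOWING_IMAGE, PT_HEADING_IMAGE, PT_PULLOUT_IMAGE,
  PT_HORZ_LINE, PT_VERT_LINE, PT_NOISE, PT_COUNT
};

constexpr bool PTIsLineType(PolyBlockType t) {
  return t == PT_HORZ_LINE || t == PT_VERT_LINE;
}

constexpr bool PTIsImageType(PolyBlockType t) {
  return t == PT_FLOWING_IMAGE || t == PT_HEADING_IMAGE ||
         t == PT_PULLOUT_IMAGE;
}

// Tables and inline equations count as text; PT_EQUATION does not.
constexpr bool PTIsTextType(PolyBlockType t) {
  return t == PT_FLOWING_TEXT || t == PT_HEADING_TEXT ||
         t == PT_PULLOUT_TEXT || t == PT_TABLE || t == PT_VERTICAL_TEXT ||
         t == PT_CAPTION_TEXT || t == PT_INLINE_EQUATION;
}

constexpr bool PTIsPulloutType(PolyBlockType t) {
  return t == PT_PULLOUT_IMAGE || t == PT_PULLOUT_TEXT;
}

// Prints one row per enum value showing which predicates match it.
inline void PrintPredicateTable() {
  for (int i = 0; i < PT_COUNT; ++i) {
    const PolyBlockType t = static_cast<PolyBlockType>(i);
    std::printf("%2d text=%d image=%d line=%d pullout=%d\n", i,
                PTIsTextType(t), PTIsImageType(t), PTIsLineType(t),
                PTIsPulloutType(t));
  }
}

}  // namespace sketch
```

Calling `sketch::PrintPredicateTable()` from `main()` lists every enum value with its predicate results, which makes it easy to spot that `PT_TABLE` is a text type while `PT_EQUATION` and `PT_NOISE` match none of the four predicates.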

## How Tesseract Determines Block Types During Layout Analysis

During page segmentation, the engine groups connected components into **`ColPartition`** objects (see [`src/textord/colpartition.h`](https://github.com/tesseract-ocr/tesseract/blob/main/src/textord/colpartition.h) and [`src/textord/colpartition.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/textord/colpartition.cpp)). Each partition is classified from geometric properties such as column spanning, height, width, and neighbor relationships.
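
The idea of combining what a partition contains with how it sits relative to the detected columns can be sketched in a few lines. This is a deliberately simplified illustration and not the logic in `ColPartition`: `ContentKind`, `ColumnSpan`, and `ClassifyPartition()` are hypothetical names, the enum is a local mirror of `PolyBlockType`, and the real implementation weighs far more evidence.

```cpp
namespace sketch {  // hypothetical types; not Tesseract's ColPartition code

enum PolyBlockType {
  PT_UNKNOWN, PT_FLOWING_TEXT, PT_HEADING_TEXT, PT_PULLOUT_TEXT,
  PT_EQUATION, PT_INLINE_EQUATION, PT_TABLE, PT_VERTICAL_TEXT,
  PT_CAPTION_TEXT, PT_FLOWING_IMAGE, PT_HEADING_IMAGE, PT_PULLOUT_IMAGE,
  PT_HORZ_LINE, PT_VERT_LINE, PT_NOISE, PT_COUNT
};

// What the partition appears to contain (hypothetical).
enum class ContentKind {
  kText, kVerticalText, kImage, kHorzLine, kVertLine, kNoise
};

// How the partition sits relative to the detected columns (hypothetical).
enum class ColumnSpan {
  kInsideColumn, kSpansColumns, kPullout, kOutsideColumns
};

// Maps (content, span) to a block type. Lines, vertical text, and noise
// ignore the span; text and images are split by it.
inline PolyBlockType ClassifyPartition(ContentKind kind, ColumnSpan span) {
  switch (kind) {
    case ContentKind::kNoise:        return PT_NOISE;
    case ContentKind::kHorzLine:     return PT_HORZ_LINE;
    case ContentKind::kVertLine:     return PT_VERT_LINE;
    case ContentKind::kVerticalText: return PT_VERTICAL_TEXT;
    case ContentKind::kImage:
      switch (span) {
        case ColumnSpan::kSpansColumns: return PT_HEADING_IMAGE;
        case ColumnSpan::kPullout:      return PT_PULLOUT_IMAGE;
        default:                        return PT_FLOWING_IMAGE;
      }
    case ContentKind::kText:
      switch (span) {
        case ColumnSpan::kInsideColumn:   return PT_FLOWING_TEXT;
        case ColumnSpan::kSpansColumns:   return PT_HEADING_TEXT;
        case ColumnSpan::kPullout:        return PT_PULLOUT_TEXT;
        case ColumnSpan::kOutsideColumns: return PT_NOISE;
      }
  }
  return PT_UNKNOWN;  // not reached for valid inputs
}

}  // namespace sketch
```

In this sketch, text that spans several columns becomes `PT_HEADING_TEXT`, while text that fits no column at all is treated as `PT_NOISE`; the choice to keep out-of-column images as `PT_FLOWING_IMAGE` is an assumption of the example.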

After partitioning, each `ColPartition` instantiates a **`POLY_BLOCK`** (defined in [`src/ccstruct/polyblk.h`](https://github.com/tesseract-ocr/tesseract/blob/main/src/ccstruct/polyblk.h) and implemented in [`src/ccstruct/polyblk.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/ccstruct/polyblk.cpp)). This object stores the block’s bounding polygon and the assigned `PolyBlockType`. The type assignment drives downstream processing: for example, OCR is executed only on text-bearing types like `PT_FLOWING_TEXT`, `PT_HEADING_TEXT`, and `PT_TABLE`, while image blocks can be extracted for separate handling.

For visual debugging, `POLY_BLOCK::ColorForPolyBlockType()` in [`src/ccstruct/polyblk.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/ccstruct/polyblk.cpp) maps each enum value to a `ScrollView` color, so layout visualization tools draw every category in a distinct color. In the 5.x sources, for example, flowing text is blue, headings are cyan, and tables are magenta.

## Accessing PolyBlock Types via the C++ API

The C++ API exposes block types through `PageIterator::BlockType()`, implemented in [`src/ccmain/pageiterator.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/ccmain/pageiterator.cpp). `ResultIterator`, which `TessBaseAPI::GetIterator()` returns after recognition, inherits this method and adds text access through `GetUTF8Text()`. The following example enumerates blocks and handles each type appropriately:

```cpp
#include <cstdio>

#include <leptonica/allheaders.h>
#include <tesseract/baseapi.h>
#include <tesseract/publictypes.h>
#include <tesseract/resultiterator.h>

int main() {
  tesseract::TessBaseAPI api;
  if (api.Init(nullptr, "eng") != 0) {       // non-zero means failure
    std::fprintf(stderr, "Could not initialise Tesseract\n");
    return 1;
  }

  Pix* image = pixRead("sample.png");        // SetImage() takes a Pix*, not a path
  if (!image) {
    std::fprintf(stderr, "Could not read sample.png\n");
    api.End();
    return 1;
  }
  api.SetImage(image);
  api.Recognize(nullptr);                    // run layout analysis + OCR

  // GetIterator() returns a ResultIterator: it inherits BlockType() from
  // PageIterator and adds GetUTF8Text().
  tesseract::ResultIterator* it = api.GetIterator();
  if (it) {
    do {
      const tesseract::PolyBlockType blk = it->BlockType();
      const int id = static_cast<int>(blk);

      if (blk == tesseract::PT_TABLE) {
        // Test tables first: PTIsTextType() also returns true for PT_TABLE.
        std::printf("TABLE block detected\n");
      } else if (tesseract::PTIsTextType(blk)) {
        // Text block: extract the text (caller owns the buffer).
        char* txt = it->GetUTF8Text(tesseract::RIL_BLOCK);
        std::printf("TEXT (%d): %s\n", id, txt ? txt : "");
        delete[] txt;
      } else if (tesseract::PTIsImageType(blk)) {
        // Image block: you could save the image region.
        std::printf("IMAGE block (type %d)\n", id);
      } else if (tesseract::PTIsLineType(blk)) {
        std::printf("LINE block (type %d)\n", id);
      } else {
        std::printf("OTHER block (type %d)\n", id);
      }
    } while (it->Next(tesseract::RIL_BLOCK));  // advance to next block
    delete it;
  }

  api.End();                                 // release engine resources
  pixDestroy(&image);                        // release the input image
  return 0;
}
```

This pattern leverages the predicate helpers to avoid verbose switch statements when filtering for specific layout elements.

## Using PolyBlock Types with the C API

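The C API provides equivalent functionality through `TessPageIteratorBlockType`, declared in [`include/tesseract/capi.h`](https://github.com/tesseract-ocr/tesseract/blob/main/include/tesseract/capi.h). This wrapper returns the same enumeration values for use in C-based applications. Because `TessBaseAPIGetIterator()` returns a `TessResultIterator*`, the example obtains the page-iterator view of it with `TessResultIteratorGetPageIterator()`:

```c
#include <stdio.h>

#include <leptonica/allheaders.h>
#include <tesseract/capi.h>

int main(void) {
  TessBaseAPI* api = TessBaseAPICreate();
  if (TessBaseAPIInit3(api, NULL, "eng") != 0) {  /* non-zero means failure */
    TessBaseAPIDelete(api);
    return 1;
  }

  PIX* image = pixRead("sample.png");             /* Leptonica image loader */
  if (!image) {
    TessBaseAPIEnd(api);
    TessBaseAPIDelete(api);
    return 1;
  }
  TessBaseAPISetImage2(api, image);
  TessBaseAPIRecognize(api, NULL);

  /* The result iterator gives text; its page-iterator view gives block types.
     Both handles refer to the same underlying iterator. */
  TessResultIterator* ri = TessBaseAPIGetIterator(api);
  if (ri) {
    TessPageIterator* pi = TessResultIteratorGetPageIterator(ri);
    do {
      TessPolyBlockType blk = TessPageIteratorBlockType(pi);
      if (blk == PT_FLOWING_TEXT || blk == PT_HEADING_TEXT) {
        char* txt = TessResultIteratorGetUTF8Text(ri, RIL_BLOCK);
        printf("TEXT (%d): %s\n", (int)blk, txt ? txt : "");
        TessDeleteText(txt);
      } else if (blk == PT_FLOWING_IMAGE) {
        printf("IMAGE block (%d)\n", (int)blk);
      }
    } while (TessPageIteratorNext(pi, RIL_BLOCK));
    TessResultIteratorDelete(ri);                 /* also invalidates pi */
  }

  TessBaseAPIEnd(api);
  TessBaseAPIDelete(api);
  pixDestroy(&image);
  return 0;
}
```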

Both APIs provide access to the same underlying layout data structures, allowing you to build custom pipelines that process tables, ignore noise regions, or extract images based on their classified types.
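
A custom pipeline of this kind often starts by routing blocks into buckets. The sketch below assumes the block types and bounding boxes have already been copied out of the iterator (for example with `BlockType()` and `BoundingBox()`) into a plain struct; `BlockInfo`, `Routed`, and `RouteBlocks()` are hypothetical names, and the enum is a local mirror so the example compiles on its own.

```cpp
#include <vector>

namespace sketch {  // hypothetical helper types for illustration

enum PolyBlockType {
  PT_UNKNOWN, PT_FLOWING_TEXT, PT_HEADING_TEXT, PT_PULLOUT_TEXT,
  PT_EQUATION, PT_INLINE_EQUATION, PT_TABLE, PT_VERTICAL_TEXT,
  PT_CAPTION_TEXT, PT_FLOWING_IMAGE, PT_HEADING_IMAGE, PT_PULLOUT_IMAGE,
  PT_HORZ_LINE, PT_VERT_LINE, PT_NOISE, PT_COUNT
};

// One block as copied out of the iterator: type plus bounding box in pixels.
struct BlockInfo {
  PolyBlockType type;
  int left, top, right, bottom;
};

struct Routed {
  std::vector<BlockInfo> tables;
  std::vector<BlockInfo> images;
  std::vector<BlockInfo> text;
};

// Sorts blocks into tables, images, and other text; everything else
// (noise, lines, unknown, equation regions) is dropped.
inline Routed RouteBlocks(const std::vector<BlockInfo>& blocks) {
  Routed out;
  for (const BlockInfo& b : blocks) {
    switch (b.type) {
      case PT_TABLE:
        out.tables.push_back(b);
        break;
      case PT_FLOWING_IMAGE:
      case PT_HEADING_IMAGE:
      case PT_PULLOUT_IMAGE:
        out.images.push_back(b);
        break;
      case PT_FLOWING_TEXT:
      case PT_HEADING_TEXT:
      case PT_PULLOUT_TEXT:
      case PT_VERTICAL_TEXT:
      case PT_CAPTION_TEXT:
      case PT_INLINE_EQUATION:
        out.text.push_back(b);
        break;
      default:
        break;  // ignored on purpose
    }
  }
  return out;
}

}  // namespace sketch
```

Keeping tables in their own bucket, rather than with the other text types, lets a later stage apply table-specific parsing while the image boxes can be cropped from the source image.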

## Visualizing Block Types for Debugging

Tesseract's debug views color-code blocks according to their `PolyBlockType`. The mapping occurs in `POLY_BLOCK::ColorForPolyBlockType()` in [`src/ccstruct/polyblk.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/ccstruct/polyblk.cpp), which returns a `ScrollView::Color` for each enum value. The excerpt below is abridged from the 5.x sources; consult the file for the full table:

```cpp
// src/ccstruct/polyblk.cpp (colour mapping, abridged)
ScrollView::Color POLY_BLOCK::ColorForPolyBlockType(PolyBlockType type) {
  // Kept in sync with the PolyBlockType enum order.
  const ScrollView::Color kPBColors[PT_COUNT] = {
    ScrollView::WHITE,      // PT_UNKNOWN
    ScrollView::BLUE,       // PT_FLOWING_TEXT
    ScrollView::CYAN,       // PT_HEADING_TEXT
    // … (one entry per remaining enum value)
  };
  if (type >= 0 && type < PT_COUNT) {
    return kPBColors[type];
  }
  return ScrollView::WHITE;  // fallback for out-of-range values
}
```

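One way to see these colors, assuming a Tesseract build with graphics enabled (not compiled with `GRAPHICS_DISABLED`) and the ScrollView viewer reachable through the `SCROLLVIEW_PATH` environment variable, is to switch on a layout debug parameter from the command line:

```bash
tesseract sample.png out -c textord_tabfind_show_blocks=1
```

ScrollView then opens a window that outlines the detected blocks in the colors returned by `ColorForPolyBlockType()`, allowing you to verify that Tesseract’s layout analysis correctly identified document regions. Note that the `debug_file` parameter only redirects Tesseract's textual debug output to a file; it does not produce an image.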

## Summary

- **PolyBlockType** (defined in [`include/tesseract/publictypes.h`](https://github.com/tesseract-ocr/tesseract/blob/main/include/tesseract/publictypes.h)) categorizes every detected region during Tesseract’s layout analysis, including text, images, tables, equations, and lines.
- The **`ColPartition`** class determines block types based on geometry and content, storing results in **`POLY_BLOCK`** objects.
- Use predicate helpers like **`PTIsTextType()`** and **`PTIsImageType()`** to filter blocks efficiently without manual enum comparisons.
- Access block types programmatically via **`PageIterator::BlockType()`** (C++) or **`TessPageIteratorBlockType()`** (C API).
- Visual debugging maps each type to a specific color via **`POLY_BLOCK::ColorForPolyBlockType()`**, aiding in layout verification.

## Frequently Asked Questions

### What is the difference between PT_FLOWING_TEXT and PT_HEADING_TEXT?

**`PT_FLOWING_TEXT`** represents standard text constrained to a single column, while **`PT_HEADING_TEXT`** indicates text that spans multiple columns, such as section headers or titles. The `ColPartition` logic in [`src/textord/colpartition.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/textord/colpartition.cpp) distinguishes these based on horizontal span relative to detected column boundaries.

### How do I programmatically check if a block contains text or images?

Use the inline predicates defined in [`include/tesseract/publictypes.h`](https://github.com/tesseract-ocr/tesseract/blob/main/include/tesseract/publictypes.h). Call **`PTIsTextType(blk)`** to match all text variants (flowing, heading, pull-out, table, vertical, caption, and inline equation), or **`PTIsImageType(blk)`** to identify image regions. Because `PTIsTextType()` also returns true for `PT_TABLE`, test for tables first if you want to handle them separately. These functions return boolean values without requiring you to list every enum constant manually.

### Can I customize how Tesseract assigns PolyBlock Types?

The type assignment is hardcoded in the **`ColPartition`** and **`POLY_BLOCK`** logic within the Tesseract source. While you cannot override the classifier through the public API without modifying the source, you can post-process the iterator results and reclassify blocks based on custom heuristics after `AnalyseLayout()` or `Recognize()` completes.
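
Such post-processing might look like the following sketch, which works on a copy of the iterator results and leaves Tesseract's internal state untouched. Everything here is hypothetical: the `Block` struct, the `ReclassifyCaptions()` function, the "starts with `Figure ` or `Fig. `" rule, and the 200-character limit are illustrative choices, and the enum is a local mirror of `PolyBlockType`.

```cpp
#include <cstddef>
#include <string>
#include <vector>

namespace sketch {  // hypothetical post-processing helper

enum PolyBlockType {
  PT_UNKNOWN, PT_FLOWING_TEXT, PT_HEADING_TEXT, PT_PULLOUT_TEXT,
  PT_EQUATION, PT_INLINE_EQUATION, PT_TABLE, PT_VERTICAL_TEXT,
  PT_CAPTION_TEXT, PT_FLOWING_IMAGE, PT_HEADING_IMAGE, PT_PULLOUT_IMAGE,
  PT_HORZ_LINE, PT_VERT_LINE, PT_NOISE, PT_COUNT
};

// A block copied out of the iterator: its type and recognized text.
struct Block {
  PolyBlockType type;
  std::string text;
};

inline bool StartsWith(const std::string& s, const std::string& prefix) {
  return s.compare(0, prefix.size(), prefix) == 0;
}

// Custom heuristic: a short flowing-text block that begins like a figure
// label is relabelled as a caption. Other blocks are left unchanged.
inline void ReclassifyCaptions(std::vector<Block>& blocks,
                               std::size_t max_len = 200) {
  for (Block& b : blocks) {
    if (b.type != PT_FLOWING_TEXT || b.text.size() > max_len) {
      continue;
    }
    if (StartsWith(b.text, "Figure ") || StartsWith(b.text, "Fig. ")) {
      b.type = PT_CAPTION_TEXT;
    }
  }
}

}  // namespace sketch
```

Only the application's own copy of the type changes; subsequent calls to `BlockType()` on the iterator still report what the layout analysis originally assigned.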

### Which block types does Tesseract actually run OCR on?

Recognition runs on the block types for which `PTIsTextType()` returns true: **`PT_FLOWING_TEXT`**, **`PT_HEADING_TEXT`**, **`PT_PULLOUT_TEXT`**, **`PT_TABLE`**, **`PT_VERTICAL_TEXT`**, **`PT_CAPTION_TEXT`**, and **`PT_INLINE_EQUATION`**. Image blocks (`PT_FLOWING_IMAGE`, etc.), line blocks (`PT_HORZ_LINE`, `PT_VERT_LINE`), and `PT_EQUATION` regions fall outside that predicate; they are excluded from text recognition and can be handled separately for image extraction or line detection tasks.