# How to Debug Tesseract Using TessEdit Pageseg Mode and Variable Inspection

> Debug Tesseract effectively by using TessEdit pageseg mode and variable inspection. Learn to isolate layout stages and inspect runtime configurations for faster issue resolution.

- Repository: [tesseract-ocr/tesseract](https://github.com/tesseract-ocr/tesseract)
- Tags: how-to-guide
- Published: 2026-03-02

---

**To debug Tesseract, set `tessedit_pageseg_mode` to isolate specific layout stages, enable debug flags like `textord_debug`, and use `SetVariable`/`GetVariableAsString` or `--print-parameters` to inspect runtime configuration.**

The **tesseract-ocr/tesseract** repository provides extensive runtime parameter inspection capabilities that allow developers to isolate layout-related bugs without modifying source code. By manipulating the **TessEdit page segmentation mode** and inspecting internal variables, you can force the OCR pipeline to execute only specific stages while capturing detailed diagnostic output. This approach leverages the parameter-to-algorithm bridge implemented across [`src/api/baseapi.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/api/baseapi.cpp) and `src/ccmain/` to debug segmentation behavior systematically.

## Architecture of Tesseract's Debug Parameter System

Tesseract's debugging infrastructure connects command-line options and API calls to internal algorithm behavior through a centralized parameter system.

**Key source files forming this bridge include:**

- **[`src/ccmain/tesseractclass.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/ccmain/tesseractclass.cpp)** (lines 77-82): Declares integer parameters including `tessedit_pageseg_mode` and debug flags like `textord_debug_block`
- **[`include/tesseract/publictypes.h`](https://github.com/tesseract-ocr/tesseract/blob/main/include/tesseract/publictypes.h)** (lines 157-176): Defines the `PageSegMode` enum (`PSM_SINGLE_BLOCK`, `PSM_SINGLE_LINE`, `PSM_SINGLE_WORD`, etc.) and helper macros like `PSM_OSD_ENABLED`
- **[`src/ccmain/pagesegmain.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/ccmain/pagesegmain.cpp)** (lines 96-108): Retrieves `tessedit_pageseg_mode` at runtime and branches to appropriate layout algorithms
- **[`src/api/baseapi.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/api/baseapi.cpp)**: Implements `SetVariable` (lines 207-213), `GetVariableAsString` (lines 259-261), and `PrintVariables` (lines 84-86) that forward to `ParamUtils`
- **[`include/tesseract/capi.h`](https://github.com/tesseract-ocr/tesseract/blob/main/include/tesseract/capi.h)** (lines 296-302): Exposes C wrappers including `TessBaseAPISetVariable` and `TessBaseAPIGetVariableAsString`

When you call `SetVariable`, the implementation stores values in the `Tesseract` instance, which processing code queries during layout analysis, text recognition, and output generation.
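As a rough mental model of this flow, the sketch below uses a string-keyed table that parses values on write and is read back by later stages. `ToyParams` and its methods are hypothetical names invented for illustration. This is not Tesseract's `ParamUtils`, and its rejection of non-numeric input is a choice made for this toy, not a claim about the real implementation.

```cpp
#include <cstddef>
#include <exception>
#include <map>
#include <string>

// Toy stand-in for a parameter table (hypothetical; not Tesseract code).
class ToyParams {
 public:
  // Registers an integer parameter with its default value.
  void RegisterInt(const std::string &name, int default_value) {
    ints_[name] = default_value;
  }

  // Parses `value` and stores it. Returns false for unknown names or for
  // input that is not entirely an integer.
  bool SetVariable(const std::string &name, const std::string &value) {
    auto it = ints_.find(name);
    if (it == ints_.end()) return false;
    try {
      std::size_t used = 0;
      const int parsed = std::stoi(value, &used);
      if (used != value.size()) return false;
      it->second = parsed;
      return true;
    } catch (const std::exception &) {
      return false;  // empty, non-numeric, or out-of-range input
    }
  }

  // Reads the current value, as a processing stage would when it starts.
  bool GetInt(const std::string &name, int *out) const {
    auto it = ints_.find(name);
    if (it == ints_.end()) return false;
    *out = it->second;
    return true;
  }

 private:
  std::map<std::string, int> ints_;
};
```

The useful property to carry over is the boolean result: a failed lookup is reported to the caller, so a mistyped parameter name can be detected at the call site.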

## Setting Page Segmentation Mode for Isolated Debugging

The **`tessedit_pageseg_mode`** variable controls which layout analysis algorithms execute, allowing you to bypass stages unrelated to your bug.

### Using the Command Line

The `-c` flag maps directly to `TessBaseAPI::SetVariable`:

```bash
tesseract input.png out \
  -c tessedit_pageseg_mode=7 \
  -c textord_debug=2 \
  -c debug_file=debug.txt
```

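This configuration forces **PSM_SINGLE_LINE** mode (value 7), requests layout debug output at level 2, and redirects `tprintf` output to `debug.txt` in the current working directory. If Tesseract reports that it could not set an option, check the parameter name against `tesseract --print-parameters`, since the available debug parameters differ between versions.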
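When a test harness launches the CLI, it can help to assemble the `-c key=value` pairs programmatically so each run records exactly which overrides were used. The helper below is a minimal sketch; `BuildTesseractArgs` is a hypothetical name, and it only builds the argument list without spawning a process.

```cpp
#include <string>
#include <utility>
#include <vector>

// Builds an argument list of the form:
//   tesseract <image> <output_base> -c key=value -c key=value ...
// The function name and signature are illustrative assumptions.
std::vector<std::string> BuildTesseractArgs(
    const std::string &image, const std::string &output_base,
    const std::vector<std::pair<std::string, std::string>> &overrides) {
  std::vector<std::string> args = {"tesseract", image, output_base};
  for (const auto &kv : overrides) {
    args.push_back("-c");
    args.push_back(kv.first + "=" + kv.second);
  }
  return args;
}
```

Keeping the arguments as separate strings, rather than one concatenated command line, avoids shell quoting problems when a value such as a `debug_file` path contains spaces.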

### Using the C++ API

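```cpp
#include <tesseract/baseapi.h>
#include <leptonica/allheaders.h>
#include <cstdio>

int main() {
  tesseract::TessBaseAPI api;
  if (api.Init("/usr/local/share/tessdata", "eng") != 0) {  // 0 means success
    std::fprintf(stderr, "Could not initialize Tesseract\n");
    return 1;
  }

  // Force single-line segmentation
  api.SetVariable("tessedit_pageseg_mode", "7");

  // Enable verbose layout debugging; SetVariable returns false if the
  // parameter name is not recognized
  if (!api.SetVariable("textord_debug", "3")) {
    std::fprintf(stderr, "Could not set textord_debug\n");
  }
  api.SetVariable("debug_file", "layout_debug.txt");

  Pix *image = pixRead("input.png");
  if (image == nullptr) return 1;
  api.SetImage(image);
  char *utf8 = api.GetUTF8Text();
  if (utf8 != nullptr) std::printf("%s", utf8);

  // Free the text buffer and image, then shut down the engine
  delete[] utf8;
  pixDestroy(&image);
  api.End();
  return 0;
}
```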

The `SetVariable` method accepts string arguments and forwards them to `ParamUtils`, converting values to appropriate types internally.

### Using the C API

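```c
#include <tesseract/capi.h>

int main(void) {
  TessBaseAPI *handler = TessBaseAPICreate();
  if (TessBaseAPIInit3(handler, "/usr/local/share/tessdata", "eng") != 0) {
    TessBaseAPIDelete(handler); /* initialization failed */
    return 1;
  }
  TessBaseAPISetVariable(handler, "tessedit_pageseg_mode", "7");
  TessBaseAPISetVariable(handler, "textord_debug", "3");
  TessBaseAPISetVariable(handler, "debug_file", "debug.txt");

  /* Process image... */

  TessBaseAPIEnd(handler);
  TessBaseAPIDelete(handler);
  return 0;
}
```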

The C wrapper `TessBaseAPISetVariable` invokes the same underlying C++ routine defined in [`src/api/baseapi.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/api/baseapi.cpp).

## Inspecting Variables at Runtime

After setting parameters, verify the effective configuration using Tesseract's introspection API.

### Dumping All Parameters

List every available parameter and current value:

```bash
tesseract --print-parameters > all_params.txt
```

This CLI option calls `TessBaseAPI::PrintVariables`, which iterates over `tesseract_->params()` and outputs the complete parameter table to a `FILE*` stream.

### Querying Specific Variables

Retrieve individual values programmatically to confirm they were honored:

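```cpp
#include <iostream>
#include <string>

// Assumes `api` is an initialized tesseract::TessBaseAPI.
// Both calls return false if the parameter name is not found.
std::string val;
if (api.GetVariableAsString("textord_debug", &val)) {
    std::cout << "textord_debug = " << val << '\n';
}

int mode = 0;
if (api.GetIntVariable("tessedit_pageseg_mode", &mode)) {
    std::cout << "Effective mode: " << mode << '\n';
}
```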

`GetVariableAsString` (lines 259-261 in [`src/api/baseapi.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/api/baseapi.cpp)) calls `ParamUtils::GetParamAsString` to convert internal representations to readable text.

### Capturing Debug Output to File

Set the **`debug_file`** parameter to capture all `tprintf` statements, including those inside layout stages, without opening GUI windows:

```cpp
api.SetVariable("debug_file", "tesseract_debug.log");
```

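When combined with a non-zero `textord_debug`, the textual diagnostics from layout analysis go to this file instead of stderr, which gives low-overhead logging suitable for headless environments. `debug_file` itself receives only text; any images that a debug flag produces are written separately.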

## End-to-End Debugging Workflow

Follow this systematic approach to isolate layout-related issues using TessEdit pageseg mode and variable inspection:

1. **Select an isolation mode** that restricts processing to the stage you suspect:
   - **PSM_SINGLE_LINE** (7): Treat the image as a single text line, skipping column detection
   - **PSM_SINGLE_WORD** (8): Treat the image as a single word to isolate word-level recognition
   - **PSM_SINGLE_BLOCK** (6): Treat the image as a single uniform block of text

2. **Enable targeted debug flags** (`textord_debug`, `wordrec_debug`, `tessedit_debug`) with verbosity levels 1-5 to generate intermediate images saved as `*.debug.png`

3. **Execute OCR** via CLI or API with `debug_file` set to capture the complete log

4. **Verify parameter application** using `GetVariableAsString` or `--print-parameters` to confirm the effective `tessedit_pageseg_mode` matches your intent

5. **Analyze outputs**: Review the textual debug log and generated PNG overlays to identify where pipeline behavior diverges from expectations

6. **Iterate**: Adjust the pageseg mode or increase debug verbosity (`textord_debug=5` for maximum detail) until the root cause is exposed
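The verification in step 4 can be automated by comparing the values you requested with the values read back from the engine or from a parameter dump. The sketch below is one way to do that; `FindMismatches` is a hypothetical helper, and both maps are assumed to hold values as plain strings.

```cpp
#include <map>
#include <string>
#include <vector>

// Returns the names whose effective value is missing or differs from the
// requested value. Names are returned in sorted (map) order.
std::vector<std::string> FindMismatches(
    const std::map<std::string, std::string> &requested,
    const std::map<std::string, std::string> &effective) {
  std::vector<std::string> mismatched;
  for (const auto &kv : requested) {
    const auto it = effective.find(kv.first);
    if (it == effective.end() || it->second != kv.second) {
      mismatched.push_back(kv.first);
    }
  }
  return mismatched;
}
```

An empty result means every requested override was honored. A non-empty result usually points to a mistyped name or to a value that was overwritten later in the setup.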

## Summary

- **Set `tessedit_pageseg_mode`** via `-c` CLI flags, C++ `SetVariable`, or C `TessBaseAPISetVariable` to isolate specific layout stages (single line, word, or block)
- **Enable diagnostics** using `textord_debug`, `wordrec_debug`, and `tessedit_debug` variables with values 1-5 to control verbosity
- **Capture output** by setting `debug_file` to route all `tprintf` statements to disk without GUI dependencies
- **Inspect configuration** at runtime using `GetVariableAsString`, `GetIntVariable`, or the `--print-parameters` CLI option
- **Reference implementation details** in [`src/api/baseapi.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/api/baseapi.cpp) for the API layer and [`src/ccmain/pagesegmain.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/ccmain/pagesegmain.cpp) for mode-specific algorithm selection

## Frequently Asked Questions

### How do I find the integer values for different page segmentation modes?

The integer values are defined in [`include/tesseract/publictypes.h`](https://github.com/tesseract-ocr/tesseract/blob/main/include/tesseract/publictypes.h) (lines 157-176) within the `PageSegMode` enum. Common values include **0** (PSM_OSD_ONLY), **4** (PSM_SINGLE_COLUMN), **6** (PSM_SINGLE_BLOCK), **7** (PSM_SINGLE_LINE), and **8** (PSM_SINGLE_WORD). You can also list the modes by running `tesseract --help-psm`, or by running `tesseract --print-parameters` and searching for `tessedit_pageseg_mode`.
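For log messages, it can be convenient to translate the integer back into its enum name. The lookup below is a hand-copied table for illustration only; `PsmName` is a hypothetical helper, and `publictypes.h` remains the source of truth if the enum changes.

```cpp
#include <string>

// Maps a page segmentation mode integer to its PageSegMode enum name.
// Hand-copied for illustration; verify against publictypes.h.
std::string PsmName(int mode) {
  static const char *const kNames[] = {
      "PSM_OSD_ONLY",        "PSM_AUTO_OSD",
      "PSM_AUTO_ONLY",       "PSM_AUTO",
      "PSM_SINGLE_COLUMN",   "PSM_SINGLE_BLOCK_VERT_TEXT",
      "PSM_SINGLE_BLOCK",    "PSM_SINGLE_LINE",
      "PSM_SINGLE_WORD",     "PSM_CIRCLE_WORD",
      "PSM_SINGLE_CHAR",     "PSM_SPARSE_TEXT",
      "PSM_SPARSE_TEXT_OSD", "PSM_RAW_LINE"};
  constexpr int kCount = sizeof(kNames) / sizeof(kNames[0]);
  if (mode < 0 || mode >= kCount) return "UNKNOWN";
  return kNames[mode];
}
```

Printing `PsmName(mode)` next to the raw integer makes debug logs easier to scan when you compare several runs with different modes.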

### What is the difference between `textord_debug` and `tessedit_debug`?

**`textord_debug`** controls output from the text ordering and layout analysis stages (lines, blocks, reading order), while **`tessedit_debug`** governs general OCR engine diagnostics including recognition and output formatting. Set `textord_debug` when investigating column detection or line segmentation issues; use `tessedit_debug` for post-recognition processing analysis. Both accept integer values 0-5, where 0 disables output and 5 provides maximum verbosity. Debug parameter names and ranges vary between Tesseract versions, so confirm them for your build with `tesseract --print-parameters`.

### Can I change the page segmentation mode after calling `SetImage`?

Yes. You can modify `tessedit_pageseg_mode` via `SetVariable` after `SetImage`, provided you do so before calling `Recognize()` or `GetUTF8Text()`. According to the implementation in [`src/ccmain/pagesegmain.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/ccmain/pagesegmain.cpp), the mode is read at the beginning of the layout analysis phase, so a change made after recognition has started does not affect that pass. For predictable results, set this variable immediately after `Init()` and before `SetImage()`.

### Where does Tesseract save debug images when `textord_debug` is enabled?

When `textord_debug` is set to a non-zero value and `debug_file` is specified, Tesseract writes intermediate visualization images to the same directory as the input file or current working directory, using filenames derived from the input image with `.debug.png` suffixes. The exact naming convention depends on the specific layout stage being debugged, as implemented in the various `textord` modules in `src/textord/`.