# How to Use Orientation and Script Detection (OSD) in Tesseract OCR

> Learn to use Orientation and Script Detection (OSD) in Tesseract OCR. Automatically detect image rotation and writing systems for improved text recognition accuracy.

- Repository: [tesseract-ocr/tesseract](https://github.com/tesseract-ocr/tesseract)
- Tags: how-to-guide
- Published: 2026-03-02

---

**Orientation and Script Detection (OSD)** is a lightweight preprocessing stage in the Tesseract OCR engine that determines the rotation angle (0°, 90°, 180°, or 270°) and dominant writing system (e.g., Latin, Cyrillic, Arabic) of text in an image without performing full character recognition.

OSD runs before the main OCR pipeline in the `tesseract-ocr/tesseract` repository, making it fast and ideal for auto-rotating scanned documents or routing multi-language content. Unlike full text recognition, OSD only requires the `osd.traineddata` model to analyze blob statistics and page geometry, operating independently of neural network recognition.


## How OSD Works in Tesseract

The OSD pipeline relies on three core components defined in **[`include/tesseract/osdetect.h`](https://github.com/tesseract-ocr/tesseract/blob/main/include/tesseract/osdetect.h)**: **`OrientationDetector`**, which evaluates the rotation of each connected component; **`ScriptDetector`**, which scores possible writing systems for the chosen orientation; and **`OSResults`**, which aggregates scores across four possible orientations.

The algorithm accumulates evidence in the `OSResults` structure, storing orientation scores in `orientations[4]` and per-script scores in `scripts_na[4][kMaxNumberOfScripts]`. After processing all blobs, `OSResults::update_best_orientation()` selects the rotation with highest confidence, while `OSResults::get_best_script()` identifies the dominant script for that orientation. According to the source in **[`src/api/baseapi.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/api/baseapi.cpp)**, this logic executes within the **`DetectOS`** pipeline, which is automatically invoked when specific page segmentation modes are enabled.
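The accumulate-then-select idea can be sketched in a few lines. The following is a simplified, hypothetical stand-in (`ToyOrientationScores` is not a Tesseract type) that keeps one score per orientation and picks the highest. Using the margin over the runner-up as the confidence is an assumption of this sketch, not a statement about Tesseract's exact formula.

```cpp
#include <array>
#include <cstddef>

// Hypothetical, simplified stand-in for the orientation part of OSResults.
struct ToyOrientationScores {
  std::array<float, 4> orientations{};  // evidence for index 0..3 (0, 90, 180, 270 degrees)
  int best_orientation_id = 0;
  float confidence = 0.0f;

  // Add one blob's evidence for a given orientation index (0-3).
  void Accumulate(std::size_t orientation_id, float score) {
    orientations.at(orientation_id) += score;
  }

  // Select the highest-scoring orientation; confidence is the margin
  // between the best score and the runner-up (sketch assumption).
  void UpdateBest() {
    std::size_t best = 0;
    for (std::size_t i = 1; i < orientations.size(); ++i) {
      if (orientations[i] > orientations[best]) best = i;
    }
    bool have_second = false;
    float second = 0.0f;
    for (std::size_t i = 0; i < orientations.size(); ++i) {
      if (i == best) continue;
      if (!have_second || orientations[i] > second) {
        second = orientations[i];
        have_second = true;
      }
    }
    best_orientation_id = static_cast<int>(best);
    confidence = orientations[best] - second;
  }
};
```

A caller would invoke `Accumulate` once per blob and orientation, then `UpdateBest` once after all blobs have been processed.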


## Page Segmentation Modes for OSD

OSD is controlled through **Page Segmentation Modes (PSM)**, defined in **[`include/tesseract/publictypes.h`](https://github.com/tesseract-ocr/tesseract/blob/main/include/tesseract/publictypes.h)**:

```cpp
enum PageSegMode {
  PSM_OSD_ONLY        = 0,   // Orientation & script detection only
  PSM_AUTO_OSD        = 1,   // Automatic layout + OSD
  // … other modes …
};
```

The engine decides whether to run OSD via the **`PSM_OSD_ENABLED`** inline function (also in [`publictypes.h`](https://github.com/tesseract-ocr/tesseract/blob/main/include/tesseract/publictypes.h)), which returns true when `pageseg_mode <= PSM_AUTO_OSD` or when the mode is `PSM_SPARSE_TEXT_OSD`:

```cpp
inline bool PSM_OSD_ENABLED(int pageseg_mode) {
  return pageseg_mode <= PSM_AUTO_OSD || pageseg_mode == PSM_SPARSE_TEXT_OSD;
}
```

Use **`PSM_OSD_ONLY`** (value `0`) when you need only rotation and script data without text recognition. Use **`PSM_AUTO_OSD`** (value `1`) to combine automatic page layout analysis with OSD before performing OCR.
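To see which `--psm` values satisfy that predicate, here is a small self-contained sketch. It mirrors the three OSD-related enum values locally under hypothetical `TOY_` names so it compiles without the Tesseract headers. The value `12` for the sparse-text OSD mode matches the CLI's mode list, and the number of modes is passed in by the caller rather than assumed. Real code should include `<tesseract/publictypes.h>` and call `PSM_OSD_ENABLED` directly.

```cpp
#include <vector>

// Local mirror of the OSD-related PageSegMode values, for illustration only.
enum ToyPageSegMode {
  TOY_PSM_OSD_ONLY = 0,
  TOY_PSM_AUTO_OSD = 1,
  TOY_PSM_SPARSE_TEXT_OSD = 12
};

// Same condition as the predicate shown above, on the local mirror.
inline bool ToyOsdEnabled(int mode) {
  return mode <= TOY_PSM_AUTO_OSD || mode == TOY_PSM_SPARSE_TEXT_OSD;
}

// Collect every mode in [0, mode_count) for which OSD would run.
std::vector<int> OsdEnabledModes(int mode_count) {
  std::vector<int> enabled;
  for (int mode = 0; mode < mode_count; ++mode) {
    if (ToyOsdEnabled(mode)) enabled.push_back(mode);
  }
  return enabled;
}
```

With 14 modes (0 through 13), this yields modes 0, 1, and 12.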


## Detecting Orientation and Script via the C++ API

The high-level entry point is **`TessBaseAPI::DetectOrientationScript`**, implemented in **[`src/api/baseapi.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/api/baseapi.cpp)**. This method populates an `OSResults` instance, translates the orientation index (0-3) to degrees, and looks up the human-readable script name via the `UNICHARSET`:

```cpp
bool TessBaseAPI::DetectOrientationScript(int *orient_deg,
                                          float *orient_conf,
                                          const char **script_name,
                                          float *script_conf);
```

For convenience, **`TessBaseAPI::GetOsdText`** formats these results as a plain-text report. Here is a complete example:

```cpp
#include <tesseract/baseapi.h>
#include <leptonica/allheaders.h>
#include <iostream>

int main() {
  tesseract::TessBaseAPI api;

  // Load the OSD model from the default datapath. osd.traineddata is a
  // separate file from eng.traineddata and must be present in tessdata.
  if (api.Init(nullptr, "osd")) {
    std::cerr << "Could not initialize tesseract.\n";
    return 1;
  }

  // Request OSD only mode
  api.SetPageSegMode(tesseract::PSM_OSD_ONLY);

  Pix *image = pixRead("sample.png");
  if (image == nullptr) {
    std::cerr << "Could not read sample.png.\n";
    api.End();
    return 1;
  }
  api.SetImage(image);

  // Retrieve formatted OSD report (the argument is the page number to print)
  char *osd = api.GetOsdText(0);
  if (osd) {
    std::cout << osd << "\n";
    delete[] osd;
  } else {
    std::cerr << "OSD detection failed.\n";
  }

  pixDestroy(&image);
  api.End();
  return 0;
}
```

Alternatively, access raw values directly:

```cpp
int degrees = 0;
float orient_conf = 0.0f, script_conf = 0.0f;
const char *script_name = nullptr;

// degrees is the detected orientation (0, 90, 180, or 270)
if (api.DetectOrientationScript(&degrees, &orient_conf,
                                &script_name, &script_conf)) {
  std::cout << "Orientation: " << degrees << " degrees, script: "
            << script_name << "\n";
}
```
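The raw orientation value is not the same thing as the rotation you apply to fix the page. The hypothetical helper below converts one to the other, assuming the convention behind the `Rotate:` field of the text report in recent Tesseract releases: the clockwise correction is `(360 - orientation) mod 360`. Treat that convention as an assumption and verify it against a known test image for your version.

```cpp
// Clockwise rotation (in degrees) that would bring a page with the given
// detected orientation back to upright.
// Assumption: correction = (360 - orientation) mod 360.
int ClockwiseRotationToUpright(int orientation_deg) {
  // Normalize into [0, 360) so negative or oversized inputs are handled.
  const int normalized = ((orientation_deg % 360) + 360) % 360;
  return (360 - normalized) % 360;
}
```

Under this convention a detected orientation of 90 maps to a 270-degree clockwise correction, and 180 maps to itself.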


## Using the C API

The C API wrappers in **[`include/tesseract/capi.h`](https://github.com/tesseract-ocr/tesseract/blob/main/include/tesseract/capi.h)** expose equivalent functionality through **`TessBaseAPISetPageSegMode`** and **`TessBaseAPIGetOsdText`**:

```c
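#include <tesseract/capi.h>
#include <leptonica/allheaders.h>
#include <stdio.h>

int main(void) {
    TessBaseAPI *api = TessBaseAPICreate();
    // OSD needs osd.traineddata in the tessdata directory
    if (TessBaseAPIInit3(api, NULL, "osd")) {
        fprintf(stderr, "Init failed\n");
        TessBaseAPIDelete(api);
        return 1;
    }

    // Switch to OSD only mode
    TessBaseAPISetPageSegMode(api, PSM_OSD_ONLY);

    // Load the image with Leptonica and hand it to Tesseract
    PIX *image = pixRead("sample.png");
    if (!image) {
        fprintf(stderr, "Could not read sample.png\n");
        TessBaseAPIEnd(api);
        TessBaseAPIDelete(api);
        return 1;
    }
    TessBaseAPISetImage2(api, image);

    char *osd = TessBaseAPIGetOsdText(api, 0);
    if (osd) {
        printf("%s\n", osd);
        TessDeleteText(osd);
    } else {
        fprintf(stderr, "OSD detection failed\n");
    }

    pixDestroy(&image);
    TessBaseAPIEnd(api);
    TessBaseAPIDelete(api);
    return 0;
}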
```

For direct value access without string formatting, use **`TessBaseAPIDetectOrientationScript`**, which mirrors the C++ method signature.


## Command-Line OSD Usage

Invoke OSD directly from the command line using the `--psm` flag, which maps to the `PageSegMode` enum values:

```bash
# OSD only (orientation & script detection without OCR)
tesseract sample.png stdout --psm 0

# Automatic page segmentation with OSD
tesseract sample.png stdout --psm 1
```

The output for `--psm 0` resembles:

```
Page number: 0
Orientation in degrees: 90
Rotate: 270
Orientation confidence: 12.34
Script: Latin
Script confidence: 13.56
```

The `Rotate` field gives the clockwise rotation, in degrees, needed to make the text upright: `0` (no rotation), `90`, `180`, or `270`. `Orientation in degrees` is the detected orientation of the page as scanned.
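If you capture this report from the CLI or from `GetOsdText` and want structured values, a small parser is enough. The sketch below is illustrative only: `OsdReport` and `ParseOsdReport` are hypothetical names, and the field labels are taken from the sample report above. Check them against the output of your Tesseract version.

```cpp
#include <exception>
#include <sstream>
#include <string>

// Fields of the plain-text OSD report; -1 / empty means "not found".
struct OsdReport {
  int page_number = -1;
  int orientation_deg = -1;
  int rotate = -1;
  float orientation_conf = 0.0f;
  std::string script;
  float script_conf = 0.0f;
};

// Parse "Key: value" lines; unknown keys and malformed numbers are skipped.
OsdReport ParseOsdReport(const std::string &text) {
  OsdReport report;
  std::istringstream lines(text);
  std::string line;
  while (std::getline(lines, line)) {
    if (!line.empty() && line.back() == '\r') line.pop_back();  // tolerate CRLF
    const std::string::size_type colon = line.find(": ");
    if (colon == std::string::npos) continue;
    const std::string key = line.substr(0, colon);
    const std::string value = line.substr(colon + 2);
    try {
      if (key == "Page number") report.page_number = std::stoi(value);
      else if (key == "Orientation in degrees") report.orientation_deg = std::stoi(value);
      else if (key == "Rotate") report.rotate = std::stoi(value);
      else if (key == "Orientation confidence") report.orientation_conf = std::stof(value);
      else if (key == "Script") report.script = value;
      else if (key == "Script confidence") report.script_conf = std::stof(value);
    } catch (const std::exception &) {
      // Leave the field at its default when the number cannot be parsed.
    }
  }
  return report;
}
```

When you already link against the library, calling `DetectOrientationScript` directly avoids the text round trip. The parser is mainly useful for wrapping the command-line tool.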


## Key Implementation Files

Understanding the OSD architecture requires referencing these specific files in the `tesseract-ocr/tesseract` repository:

- **[`include/tesseract/publictypes.h`](https://github.com/tesseract-ocr/tesseract/blob/main/include/tesseract/publictypes.h)** — Defines `PageSegMode` enum, `PSM_OSD_ONLY`, `PSM_AUTO_OSD`, and `PSM_OSD_ENABLED` logic.
- **[`include/tesseract/osdetect.h`](https://github.com/tesseract-ocr/tesseract/blob/main/include/tesseract/osdetect.h)** — Declares `OrientationDetector`, `ScriptDetector`, and `OSResults` classes that perform the statistical analysis.
- **[`src/api/baseapi.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/api/baseapi.cpp)** — Implements `DetectOS`, `DetectOrientationScript`, and `GetOsdText`, connecting the low-level detectors to the public API (approximately lines 1540–1580 and 1640–1680).
- **[`include/tesseract/capi.h`](https://github.com/tesseract-ocr/tesseract/blob/main/include/tesseract/capi.h)** — Exposes C-compatible functions including `TessBaseAPISetPageSegMode` and `TessBaseAPIDetectOrientationScript`.
- **`unittest/osd_test.cc`** — Contains regression tests verifying OSD accuracy across multiple scripts and rotation angles.

Note that OSD operates on the **legacy OCR engine** (as implemented in [`baseapi.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/api/baseapi.cpp) around line 1560), not the LSTM neural network.


## Summary

- **OSD** determines text rotation (0°/90°/180°/270°) and writing system without performing character recognition.
- Enable via **`PSM_OSD_ONLY`** (mode 0) for detection only, or **`PSM_AUTO_OSD`** (mode 1) to combine with layout analysis.
- The C++ API uses **`SetPageSegMode`** followed by **`GetOsdText`** or **`DetectOrientationScript`**.
- The C API provides **`TessBaseAPISetPageSegMode`** and **`TessBaseAPIGetOsdText`** in [`capi.h`](https://github.com/tesseract-ocr/tesseract/blob/main/include/tesseract/capi.h).
- OSD requires only **`osd.traineddata`**, not full language models, and runs on the legacy engine.


## Frequently Asked Questions

### What is the difference between PSM_OSD_ONLY and PSM_AUTO_OSD?

**PSM_OSD_ONLY** (value 0) runs exclusively the orientation and script detection pipeline, returning only rotation and script metadata without recognizing text. **PSM_AUTO_OSD** (value 1) performs automatic page layout analysis and OSD, then continues with full OCR on the detected text regions. Both modes trigger the `DetectOS` pipeline in [`baseapi.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/api/baseapi.cpp).

### Does OSD require a specific trained data file?

Yes. The `osd.traineddata` file contains the statistical models for script identification and rotation detection. It is a separate file from the per-language models such as `eng.traineddata`, so it must be present in your tessdata directory (the one `TESSDATA_PREFIX` points to) even when you initialize Tesseract with another language. For standalone OSD, initializing with "osd" is typically sufficient, and many distribution packages install the OSD model alongside the language data.

### Which Tesseract engine does OSD use?

OSD runs on the **legacy OCR engine** as implemented in [`src/api/baseapi.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/api/baseapi.cpp), not the LSTM neural network. The detection relies on blob-based statistical analysis rather than deep learning recognition, making it lightweight but dependent on the legacy engine's page layout analysis.

### How do I interpret the confidence scores from OSD?

The **orientation confidence** (`orient_conf`) and **script confidence** (`script_conf`) values represent relative statistical certainty derived from the scoring accumulators in `OSResults`. Higher values indicate stronger evidence, but these scores are not probabilities (0–1); they represent internal confidence metrics used to select the best candidate from the four possible orientations and available scripts.
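Because the scores are relative, a common pattern is to act on the detected orientation only when its confidence clears a threshold you have tuned on your own documents. The sketch below assumes a caller-supplied `min_confidence`. The helper name and any threshold you pick are hypothetical, and Tesseract does not prescribe a value.

```cpp
// Return the orientation to act on: the detected one when the confidence is
// at least min_confidence, otherwise 0 (leave the page as it is).
// min_confidence is a caller-chosen, empirically tuned value.
int OrientationToApply(int detected_orientation_deg, float orientation_conf,
                       float min_confidence) {
  if (orientation_conf < min_confidence) {
    return 0;  // evidence too weak; do not rotate
  }
  return detected_orientation_deg;
}
```

Pages with very little text tend to produce low scores, so falling back to "no rotation" is usually the safer default than trusting a weak result.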