# How to Use Orientation and Script Detection (OSD) in Tesseract OCR

> Learn to use Orientation and Script Detection (OSD) in Tesseract OCR. Automatically detect image rotation and writing systems for improved text recognition accuracy.

- Repository: [tesseract-ocr/tesseract](https://github.com/tesseract-ocr/tesseract)
- Tags: how-to-guide
- Published: 2026-03-02

---

**Orientation and Script Detection (OSD)** is a lightweight preprocessing stage in the Tesseract OCR engine that determines the rotation angle (0°, 90°, 180°, or 270°) and dominant writing system (e.g., Latin, Cyrillic, Arabic) of text in an image without performing full character recognition. OSD runs before the main OCR pipeline in the `tesseract-ocr/tesseract` repository and skips character recognition, which makes it fast and well suited to auto-rotating scanned documents or routing multi-language content. Unlike full text recognition, OSD only requires the `osd.traineddata` model to analyze blob statistics and page geometry, operating independently of neural network recognition.

## How OSD Works in Tesseract

The OSD pipeline relies on three core components defined in **[`include/tesseract/osdetect.h`](https://github.com/tesseract-ocr/tesseract/blob/main/include/tesseract/osdetect.h)**:

- **`OrientationDetector`**, which evaluates the rotation of each connected component.
- **`ScriptDetector`**, which scores possible writing systems for the chosen orientation.
- **`OSResults`**, which aggregates scores across the four possible orientations.

The algorithm accumulates evidence in the `OSResults` structure, storing orientation scores in `orientations[4]` and per-script scores in `scripts_na[4][kMaxNumberOfScripts]`. After processing all blobs, `OSResults::update_best_orientation()` selects the rotation with the highest score, while `OSResults::get_best_script()` identifies the dominant script for that orientation.
According to the source in **[`src/api/baseapi.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/api/baseapi.cpp)**, the orientation and script scoring executes within the **`DetectOS`** pipeline, which is automatically invoked when specific page segmentation modes are enabled.

## Page Segmentation Modes for OSD

OSD is controlled through **Page Segmentation Modes (PSM)**, defined in **[`include/tesseract/publictypes.h`](https://github.com/tesseract-ocr/tesseract/blob/main/include/tesseract/publictypes.h)**:

```cpp
enum PageSegMode {
  PSM_OSD_ONLY = 0,  // Orientation & script detection only
  PSM_AUTO_OSD = 1,  // Automatic layout + OSD
  // … other modes …
};
```

The engine determines whether to run OSD via the **`PSM_OSD_ENABLED`** inline function (also in [`publictypes.h`](https://github.com/tesseract-ocr/tesseract/blob/main/include/tesseract/publictypes.h)), which returns true when `pageseg_mode <= PSM_AUTO_OSD` or when the mode is `PSM_SPARSE_TEXT_OSD`:

```cpp
inline bool PSM_OSD_ENABLED(int pageseg_mode) {
  return pageseg_mode <= PSM_AUTO_OSD || pageseg_mode == PSM_SPARSE_TEXT_OSD;
}
```

Use **`PSM_OSD_ONLY`** (value `0`) when you need only rotation and script data without text recognition. Use **`PSM_AUTO_OSD`** (value `1`) to combine automatic page layout analysis with OSD before performing OCR.

## Detecting Orientation and Script via the C++ API

The high-level entry point is **`TessBaseAPI::DetectOrientationScript`**, implemented in **[`src/api/baseapi.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/api/baseapi.cpp)**. This method populates an `OSResults` instance, translates the orientation index (0-3) to degrees, and looks up the human-readable script name via the `UNICHARSET`:

```cpp
bool TessBaseAPI::DetectOrientationScript(int *orient_deg, float *orient_conf,
                                          const char **script_name, float *script_conf);
```

For convenience, **`TessBaseAPI::GetOsdText`** formats these results as a plain-text report.
Here is a complete example:

```cpp
#include <tesseract/baseapi.h>
#include <leptonica/allheaders.h>
#include <iostream>

int main() {
  tesseract::TessBaseAPI api;
  // Initialize with the default datapath; osd.traineddata must be available there
  if (api.Init(nullptr, "osd")) {
    std::cerr << "Could not initialize tesseract.\n";
    return 1;
  }

  // Request OSD only mode
  api.SetPageSegMode(tesseract::PSM_OSD_ONLY);

  Pix *image = pixRead("sample.png");
  if (image == nullptr) {
    std::cerr << "Could not read sample.png.\n";
    api.End();
    return 1;
  }
  api.SetImage(image);

  // Retrieve the formatted OSD report for page 0
  char *osd = api.GetOsdText(0);
  if (osd) {
    std::cout << osd << "\n";
    delete[] osd;  // the caller owns the returned buffer
  } else {
    std::cerr << "OSD detection failed.\n";
  }

  pixDestroy(&image);
  api.End();
  return 0;
}
```

Alternatively, access raw values directly:

```cpp
int degrees;
float orient_conf, script_conf;
const char *script_name;
if (api.DetectOrientationScript(&degrees, &orient_conf, &script_name, &script_conf)) {
  std::cout << "Rotate " << degrees << " degrees, script: " << script_name << "\n";
}
```

## Using the C API

The C API wrappers in **[`include/tesseract/capi.h`](https://github.com/tesseract-ocr/tesseract/blob/main/include/tesseract/capi.h)** expose equivalent functionality through **`TessBaseAPISetPageSegMode`** and **`TessBaseAPIGetOsdText`**:

```c
#include <stdio.h>
#include <leptonica/allheaders.h>
#include <tesseract/capi.h>

int main() {
  TessBaseAPI *api = TessBaseAPICreate();
  /* osd.traineddata must be available in the default datapath */
  if (TessBaseAPIInit3(api, NULL, "osd")) {
    fprintf(stderr, "Init failed\n");
    TessBaseAPIDelete(api);
    return 1;
  }

  /* Switch to OSD only mode */
  TessBaseAPISetPageSegMode(api, PSM_OSD_ONLY);

  /* Load the image with Leptonica and hand it to Tesseract */
  PIX *image = pixRead("sample.png");
  if (image == NULL) {
    fprintf(stderr, "Could not read sample.png\n");
    TessBaseAPIEnd(api);
    TessBaseAPIDelete(api);
    return 1;
  }
  TessBaseAPISetImage2(api, image);

  char *osd = TessBaseAPIGetOsdText(api, 0);
  if (osd) {
    printf("%s\n", osd);
    TessDeleteText(osd);
  } else {
    fprintf(stderr, "OSD detection failed\n");
  }

  pixDestroy(&image);
  TessBaseAPIEnd(api);
  TessBaseAPIDelete(api);
  return 0;
}
```

For direct value access without string formatting, use **`TessBaseAPIDetectOrientationScript`**, which mirrors the C++ method signature.

## Command-Line OSD Usage

Invoke OSD directly from the command line using the `--psm` flag, which maps to the `PageSegMode` enum values:

```bash
# OSD only (orientation & script detection without OCR)
tesseract sample.png stdout --psm 0

# Automatic page segmentation with OSD
tesseract sample.png stdout --psm 1
```

The output for `--psm 0` resembles:

```
Page number: 0
Orientation in degrees: 90
Rotate: 270
Orientation confidence: 12.34
Script: Latin
Script confidence: 13.56
```

`Orientation in degrees` reports the detected orientation of the page, while `Rotate` gives the clockwise rotation in degrees required to make the text upright: `0` (no rotation), `90`, `180`, or `270`.

## Key Implementation Files

The OSD architecture is spread across these files in the `tesseract-ocr/tesseract` repository:

- **[`include/tesseract/publictypes.h`](https://github.com/tesseract-ocr/tesseract/blob/main/include/tesseract/publictypes.h)**: Defines the `PageSegMode` enum, `PSM_OSD_ONLY`, `PSM_AUTO_OSD`, and the `PSM_OSD_ENABLED` logic.
- **[`include/tesseract/osdetect.h`](https://github.com/tesseract-ocr/tesseract/blob/main/include/tesseract/osdetect.h)**: Declares `OrientationDetector`, `ScriptDetector`, and `OSResults`, which perform the statistical analysis.
- **[`src/api/baseapi.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/api/baseapi.cpp)**: Implements `DetectOS`, `DetectOrientationScript`, and `GetOsdText`, connecting the low-level detectors to the public API (approximately lines 1540–1580 and 1640–1680).
- **[`include/tesseract/capi.h`](https://github.com/tesseract-ocr/tesseract/blob/main/include/tesseract/capi.h)**: Exposes C-compatible functions including `TessBaseAPISetPageSegMode` and `TessBaseAPIDetectOrientationScript`.
- **`unittest/osd_test.cc`**: Contains regression tests verifying OSD accuracy across multiple scripts and rotation angles.

Note that OSD operates on the **legacy OCR engine** (as implemented in [`baseapi.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/api/baseapi.cpp) around line 1560), not the LSTM neural network.

## Summary

- **OSD** determines text rotation (0°/90°/180°/270°) and writing system without performing character recognition.
- Enable via **`PSM_OSD_ONLY`** (mode 0) for detection only, or **`PSM_AUTO_OSD`** (mode 1) to combine with layout analysis.
- The C++ API uses **`SetPageSegMode`** followed by **`GetOsdText`** or **`DetectOrientationScript`**.
- The C API provides **`TessBaseAPISetPageSegMode`** and **`TessBaseAPIGetOsdText`** in [`capi.h`](https://github.com/tesseract-ocr/tesseract/blob/main/include/tesseract/capi.h).
- OSD requires only **`osd.traineddata`**, not full language models, and runs on the legacy engine.

## Frequently Asked Questions

### What is the difference between PSM_OSD_ONLY and PSM_AUTO_OSD?

**PSM_OSD_ONLY** (value 0) runs only the orientation and script detection pipeline, returning rotation and script metadata without recognizing text. **PSM_AUTO_OSD** (value 1) performs automatic page layout analysis and OSD, then continues with full OCR on the detected text regions. Both modes trigger the `DetectOS` pipeline in [`baseapi.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/api/baseapi.cpp).

### Does OSD require a specific trained data file?

Yes. You must initialize Tesseract with a language (e.g., "osd"), and the `osd.traineddata` file contains the statistical models for script identification and rotation detection. It is distributed as a separate file alongside the language models, so standalone OSD usage requires at least the OSD model to be available in your `TESSDATA_PREFIX` directory.

### Which Tesseract engine does OSD use?

OSD runs on the **legacy OCR engine** as implemented in [`src/api/baseapi.cpp`](https://github.com/tesseract-ocr/tesseract/blob/main/src/api/baseapi.cpp), not the LSTM neural network.
The detection relies on blob-based statistical analysis rather than deep learning recognition, making it lightweight but dependent on the legacy engine's page layout analysis.

### How do I interpret the confidence scores from OSD?

The **orientation confidence** (`orient_conf`) and **script confidence** (`script_conf`) values represent relative statistical certainty derived from the scoring accumulators in `OSResults`. Higher values indicate stronger evidence, but these scores are not probabilities (0–1); they represent internal confidence metrics used to select the best candidate from the four possible orientations and available scripts.