# How to Implement a Custom OCR Engine in LiteParse with the OcrEngine Trait

> Learn to implement a custom OCR engine in LiteParse using the `OcrEngine` trait. Override the default backends by registering your engine through `LiteParse::with_ocr_engine`.

- Repository: [run-llama/liteparse](https://github.com/run-llama/liteparse)
- Tags: how-to-guide
- Published: 2026-05-30

---

**Implement the `OcrEngine` trait defined in [`crates/liteparse/src/ocr/mod.rs`](https://github.com/run-llama/liteparse/blob/main/crates/liteparse/src/ocr/mod.rs) by providing `name()` and `recognize()` methods, then register your engine via `LiteParse::with_ocr_engine` using `Arc<dyn OcrEngine>` to override the default HTTP or Tesseract backend.**

LiteParse, the document parsing library from the run-llama ecosystem, abstracts OCR functionality behind a trait-based interface that supports both native and WebAssembly targets. By implementing the `OcrEngine` trait, you can integrate custom backends, from proprietary cloud APIs to on-device machine learning models, while preserving the library's core parsing pipeline and configuration system.

## Understanding the OcrEngine Trait

The `OcrEngine` trait is defined in **[`crates/liteparse/src/ocr/mod.rs`](https://github.com/run-llama/liteparse/blob/main/crates/liteparse/src/ocr/mod.rs)** and serves as the abstraction layer for all OCR operations. The trait requires `Send + Sync` bounds, but the future returned by `recognize` has platform-specific constraints:

- On native targets (`#[cfg(not(target_arch = "wasm32"))]`), the future must be `Send` to allow the async runtime to move it across threads.
- On WebAssembly, the trait remains `Send + Sync`, but the future does not require the `Send` bound because the runtime is single-threaded.

```rust
// crates/liteparse/src/ocr/mod.rs
// Native variant shown; per the notes above, the wasm32 variant omits `+ Send` on the future.
pub trait OcrEngine: Send + Sync {
    fn name(&self) -> &str;
    fn recognize<'a, 'b: 'a, 'c: 'a>(
        &'a self,
        image_data: &'c [u8],
        width: u32,
        height: u32,
        options: &'b OcrOptions,
    ) -> Pin<
        Box<
            dyn Future<
                Output = Result<Vec<OcrResult>, Box<dyn std::error::Error + Send + Sync>>
            > + Send + '_,
        >,
    >;
}
```

The `recognize` method receives raw image bytes (typically PNG), pixel dimensions, and `OcrOptions`, returning a pinned future that resolves to a `Vec<OcrResult>` or a boxed error.
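One way to express the two platform variants without duplicating the whole trait is a `cfg`-gated type alias for the returned future. The sketch below is illustrative only: `OcrFuture`, `BoxError`, `SketchEngine`, `BlankEngine`, and `assert_send` are hypothetical names, the `OcrOptions` and `OcrResult` structs are stand-ins whose field types are assumed, and LiteParse itself may arrange its `cfg` attributes differently.

```rust
use std::future::Future;
use std::pin::Pin;

// Stand-in types with assumed shapes; these are NOT LiteParse's definitions.
pub struct OcrOptions {
    pub language: String,
}

pub struct OcrResult {
    pub text: String,
    pub bbox: [f32; 4],
    pub confidence: f32,
}

pub type BoxError = Box<dyn std::error::Error + Send + Sync>;

// Native targets: the boxed future carries a `Send` bound.
#[cfg(not(target_arch = "wasm32"))]
pub type OcrFuture<'a> =
    Pin<Box<dyn Future<Output = Result<Vec<OcrResult>, BoxError>> + Send + 'a>>;

// WebAssembly: same future type, without the `Send` bound.
#[cfg(target_arch = "wasm32")]
pub type OcrFuture<'a> = Pin<Box<dyn Future<Output = Result<Vec<OcrResult>, BoxError>> + 'a>>;

// Hypothetical trait mirroring the shape shown above, written against the alias.
pub trait SketchEngine: Send + Sync {
    fn name(&self) -> &str;
    fn recognize<'a, 'b: 'a, 'c: 'a>(
        &'a self,
        image_data: &'c [u8],
        width: u32,
        height: u32,
        options: &'b OcrOptions,
    ) -> OcrFuture<'a>;
}

// An engine that recognizes nothing; it exists only to exercise the signature.
pub struct BlankEngine;

impl SketchEngine for BlankEngine {
    fn name(&self) -> &str {
        "blank"
    }

    fn recognize<'a, 'b: 'a, 'c: 'a>(
        &'a self,
        _image_data: &'c [u8],
        _width: u32,
        _height: u32,
        _options: &'b OcrOptions,
    ) -> OcrFuture<'a> {
        Box::pin(async { Ok(Vec::new()) })
    }
}

/// Compile-time check: calling this only type-checks when `T` is `Send`.
pub fn assert_send<T: Send>(_value: &T) {}
```

With an alias like this, an implementation names `OcrFuture<'a>` once and picks up whichever bound applies to the active target. On a native build, passing the future returned by `recognize` to `assert_send` compiles, which confirms the `Send` bound is present.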

## Step-by-Step Implementation Guide

Follow these steps to create a production-ready custom OCR engine in LiteParse:

1. **Create a module** inside `crates/liteparse/src/ocr/` (e.g., `my_ocr.rs`), declare it in the parent `mod.rs` with `pub mod my_ocr;`, and import `OcrEngine`, `OcrOptions`, and `OcrResult`.
2. **Define a struct** holding your engine's configuration, such as API endpoints or model handles.
3. **Implement `OcrEngine`** for your struct, providing:
   - **`name()`**: Returns a static string identifier for logging.
   - **`recognize()`**: Returns a pinned future containing your async OCR logic.
4. **Ensure thread safety**: the trait carries `Send + Sync` bounds, so every field of your struct must be `Send + Sync`; wrap mutable state in `Arc<Mutex<T>>` if necessary.
5. **Expose a constructor** (typically `new()`) that initializes the engine.
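The steps above can be sketched as a small stateful engine. This is a minimal illustration under stated assumptions, not LiteParse code: the trait and the `OcrOptions`/`OcrResult` structs are local stand-ins (native variant only, field types assumed), and `MyOcrEngine`, its `endpoint` and `calls` fields, the `calls()` accessor, and the `block_on` helper are hypothetical names introduced here. The "request" is a placeholder that only formats a string.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

// Stand-ins with assumed shapes; in the real module these come from `super`.
pub struct OcrOptions { pub language: String }
pub struct OcrResult { pub text: String, pub bbox: [f32; 4], pub confidence: f32 }
pub type BoxError = Box<dyn std::error::Error + Send + Sync>;
pub type OcrFuture<'a> =
    Pin<Box<dyn Future<Output = Result<Vec<OcrResult>, BoxError>> + Send + 'a>>;

pub trait OcrEngine: Send + Sync {
    fn name(&self) -> &str;
    fn recognize<'a, 'b: 'a, 'c: 'a>(
        &'a self,
        image_data: &'c [u8],
        width: u32,
        height: u32,
        options: &'b OcrOptions,
    ) -> OcrFuture<'a>;
}

// Step 2: a struct holding configuration plus shared mutable state.
pub struct MyOcrEngine {
    endpoint: String,
    calls: Arc<Mutex<u64>>,
}

// Step 5: a constructor, plus an accessor used only for this demonstration.
impl MyOcrEngine {
    pub fn new(endpoint: impl Into<String>) -> Self {
        Self { endpoint: endpoint.into(), calls: Arc::new(Mutex::new(0)) }
    }

    pub fn calls(&self) -> u64 {
        *self.calls.lock().unwrap_or_else(|poisoned| poisoned.into_inner())
    }
}

// Step 3: the trait implementation. The `Send + Sync` supertrait bound is
// checked here, so a non-thread-safe field would fail to compile (step 4).
impl OcrEngine for MyOcrEngine {
    fn name(&self) -> &str {
        "my-ocr"
    }

    fn recognize<'a, 'b: 'a, 'c: 'a>(
        &'a self,
        image_data: &'c [u8],
        width: u32,
        height: u32,
        options: &'b OcrOptions,
    ) -> OcrFuture<'a> {
        Box::pin(async move {
            {
                // Keep the guard in its own scope so it is released before
                // any `.await` a real engine would add below.
                let mut calls = self.calls.lock().unwrap_or_else(|p| p.into_inner());
                *calls += 1;
            }
            // Placeholder for a real request to `self.endpoint`.
            let text = format!(
                "{} bytes for {} (language={})",
                image_data.len(), self.endpoint, options.language
            );
            let bbox = [0.0, 0.0, width as f32, height as f32];
            Ok(vec![OcrResult { text, bbox, confidence: 0.0 }])
        })
    }
}

// Demo-only executor: polls in a loop. Real code would use an async runtime.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

pub fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
        std::thread::yield_now();
    }
}
```

Calling `block_on(engine.recognize(bytes, w, h, &options))` drives the future to completion in a test or small binary; the busy-polling loop is acceptable here only because the placeholder future never actually waits.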

## Complete Custom Engine Example

Here is a minimal "Echo" engine implementation that demonstrates the trait contract without external dependencies:

```rust
// crates/liteparse/src/ocr/my_ocr.rs
use super::{OcrEngine, OcrOptions, OcrResult};
use std::future::Future;
use std::pin::Pin;

/// A very simple "echo" OCR engine used for demonstration.
pub struct EchoEngine;

impl EchoEngine {
    pub fn new() -> Self {
        EchoEngine
    }
}

impl OcrEngine for EchoEngine {
    fn name(&self) -> &str {
        "echo"
    }

    fn recognize<'a, 'b: 'a, 'c: 'a>(
        &'a self,
        _image_data: &'c [u8],
        _width: u32,
        _height: u32,
        options: &'b OcrOptions,
    ) -> Pin<Box<dyn Future<Output = Result<Vec<OcrResult>, Box<dyn std::error::Error + Send + Sync>>> + Send + '_>>
    {
        // In a real engine you would send the image to an OCR service here.
        // This placeholder just returns a single word containing the requested language.
        Box::pin(async move {
            Ok(vec![OcrResult {
                text: format!("language={}", options.language),
                bbox: [0.0, 0.0, 100.0, 20.0],
                confidence: 1.0,
            }])
        })
    }
}
```

This example returns a single result containing the requested language code, illustrating the expected return format: `text` (extracted string), `bbox` ([x1, y1, x2, y2] in pixel coordinates), and `confidence` (0.0 to 1.0).
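A custom engine can sanity-check its own output against this format before returning it. The helper below is a hypothetical sketch: `is_well_formed` is not part of LiteParse, the `OcrResult` struct is a local stand-in with assumed `f32` fields, and the rule that boxes must lie inside the image is an assumption made for illustration.

```rust
// Stand-in with an assumed shape; not LiteParse's definition.
pub struct OcrResult {
    pub text: String,
    pub bbox: [f32; 4],
    pub confidence: f32,
}

/// Returns `true` when the box is ordered as [x1, y1, x2, y2], lies inside a
/// `width` x `height` image, and the confidence is within 0.0..=1.0.
pub fn is_well_formed(result: &OcrResult, width: u32, height: u32) -> bool {
    let [x1, y1, x2, y2] = result.bbox;
    let (w, h) = (width as f32, height as f32);

    let ordered = x1 <= x2 && y1 <= y2;
    let inside = x1 >= 0.0 && y1 >= 0.0 && x2 <= w && y2 <= h;
    let confident = (0.0..=1.0).contains(&result.confidence);

    // Any NaN coordinate or confidence fails a comparison above, so it is rejected.
    ordered && inside && confident
}
```

Filtering results with a check like this (for example, `results.retain(|r| is_well_formed(r, width, height))`) is one option when a backend occasionally returns inverted or out-of-range boxes.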

## Wiring Your Engine into LiteParse

The `LiteParse` struct stores the custom engine in the `ocr_engine_override` field and selects it inside `parse_input` (see **[`crates/liteparse/src/parser.rs`](https://github.com/run-llama/liteparse/blob/main/crates/liteparse/src/parser.rs)**, lines 43-57 and 71-78). Use the `with_ocr_engine` method to inject your implementation:

```rust
use liteparse::parser::LiteParse;
use liteparse::ocr::my_ocr::EchoEngine;
use std::sync::Arc;

// Build the standard configuration
let cfg = liteparse::config::LiteParseConfig::default();

// Create the parser and inject the custom engine
let parser = LiteParse::new(cfg)
    .with_ocr_engine(Arc::new(EchoEngine::new()));

// Now `parser.parse_input(...)` will use `EchoEngine` instead of HTTP/Tesseract.
```

When `ocr_engine_override` is `Some(Arc<dyn OcrEngine>)`, LiteParse bypasses its default selection logic, which normally chooses between HTTP OCR servers and the built-in Tesseract engine, and delegates all OCR operations to your implementation.
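The override pattern itself is small enough to sketch in isolation. The code below is a simplified model of the behavior described here, not the contents of `parser.rs`: `Engine`, `DefaultEngine`, `Parser`, and `select_engine` are hypothetical names, the trait is reduced to `name()`, and the real default path chooses between two backends rather than returning a single fallback.

```rust
use std::sync::Arc;

// Reduced stand-in for the OCR trait: only the identifier is modeled.
pub trait Engine: Send + Sync {
    fn name(&self) -> &str;
}

// Hypothetical placeholder for whatever the default selection would pick.
pub struct DefaultEngine;
impl Engine for DefaultEngine {
    fn name(&self) -> &str {
        "default"
    }
}

pub struct EchoEngine;
impl Engine for EchoEngine {
    fn name(&self) -> &str {
        "echo"
    }
}

pub struct Parser {
    ocr_engine_override: Option<Arc<dyn Engine>>,
}

impl Parser {
    pub fn new() -> Self {
        Self { ocr_engine_override: None }
    }

    /// Builder-style setter: consumes the parser and returns it with the override stored.
    pub fn with_ocr_engine(mut self, engine: Arc<dyn Engine>) -> Self {
        self.ocr_engine_override = Some(engine);
        self
    }

    /// Uses the override when present, otherwise falls back to the default.
    pub fn select_engine(&self) -> Arc<dyn Engine> {
        match &self.ocr_engine_override {
            Some(engine) => Arc::clone(engine),
            None => Arc::new(DefaultEngine),
        }
    }
}
```

Storing the engine as `Arc<dyn Engine>` keeps the parser cheap to share: selecting the engine clones a reference count, not the engine.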

## Platform-Specific Considerations

When implementing `OcrEngine`, account for these architectural requirements:

- **Thread Safety**: The engine itself must be `Send + Sync`, and on native platforms the future it returns must be `Send`. Use thread-safe HTTP clients like `reqwest::Client` or protect mutable state with `Mutex`/`RwLock`.
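- **Future Bounds**: The returned future must include `+ Send` for native builds. The `#[cfg(target_arch = "wasm32")]` variant of the trait in LiteParse drops this bound. Because an implementation's return type has to match the trait's exactly, an engine that must build for both targets needs its own signature gated the same way.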
- **Bounding Box Format**: The `bbox` field expects `[x1, y1, x2, y2]` in pixel coordinates matching the rendered page image dimensions passed to `recognize`.
- **Error Handling**: Propagate failures as `Box<dyn std::error::Error + Send + Sync>`, which LiteParse surfaces as a `LiteParseError`.
- **Language Option**: The `OcrOptions` struct exposes the `language` string from `LiteParseConfig`. Respect this parameter when constructing requests to multilingual OCR backends.
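For the error-handling point, a typed error that is boxed at the trait boundary keeps failures both readable and inspectable. The sketch below assumes nothing about LiteParse's internals: `BackendError`, `check_request`, `recognize_checked`, and the list of accepted language codes are all hypothetical.

```rust
use std::error::Error;
use std::fmt;

pub type BoxError = Box<dyn Error + Send + Sync>;

// Hypothetical error type for a custom backend.
#[derive(Debug, PartialEq)]
pub enum BackendError {
    EmptyImage,
    UnsupportedLanguage(String),
}

impl fmt::Display for BackendError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            BackendError::EmptyImage => write!(f, "image data is empty"),
            BackendError::UnsupportedLanguage(lang) => {
                write!(f, "unsupported OCR language: {lang}")
            }
        }
    }
}

impl Error for BackendError {}

/// Validates a request before any work is done.
/// The accepted language codes are an illustrative assumption.
pub fn check_request(image_data: &[u8], language: &str) -> Result<(), BackendError> {
    if image_data.is_empty() {
        return Err(BackendError::EmptyImage);
    }
    if !["eng", "deu", "fra"].contains(&language) {
        return Err(BackendError::UnsupportedLanguage(language.to_string()));
    }
    Ok(())
}

/// `?` converts `BackendError` into `BoxError` through the standard `From` impl,
/// which is the error type the trait's future resolves to.
pub fn recognize_checked(image_data: &[u8], language: &str) -> Result<usize, BoxError> {
    check_request(image_data, language)?;
    Ok(image_data.len()) // placeholder for real OCR work
}
```

Callers that receive the boxed error can still recover the concrete type with `err.downcast_ref::<BackendError>()`, while everything else only needs the `Display` message.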

## Key Reference Files

Study these existing engines in the run-llama/liteparse repository for production patterns:

- **[`crates/liteparse/src/ocr/tesseract.rs`](https://github.com/run-llama/liteparse/blob/main/crates/liteparse/src/ocr/tesseract.rs)**: Built-in Tesseract wrapper demonstrating local model execution.
- **[`crates/liteparse/src/ocr/http_simple.rs`](https://github.com/run-llama/liteparse/blob/main/crates/liteparse/src/ocr/http_simple.rs)**: HTTP client implementation showing async request handling and error mapping.
- **[`crates/liteparse/src/ocr/mod.rs`](https://github.com/run-llama/liteparse/blob/main/crates/liteparse/src/ocr/mod.rs)**: Trait definition and `OcrResult`/`OcrOptions` struct specifications.
- **[`crates/liteparse/src/parser.rs`](https://github.com/run-llama/liteparse/blob/main/crates/liteparse/src/parser.rs)**: Integration point showing how `with_ocr_engine` stores the override at lines 43-57 and how `parse_input` selects the engine at lines 71-78.

## Summary

- **Implement** the `OcrEngine` trait from [`crates/liteparse/src/ocr/mod.rs`](https://github.com/run-llama/liteparse/blob/main/crates/liteparse/src/ocr/mod.rs) with `name()` and `recognize()` methods.
- **Return** a `Pin<Box<dyn Future<...>>>` with `Send` bounds for native targets, satisfying the exact lifetime constraints `'b: 'a` and `'c: 'a`.
- **Register** your engine via `LiteParse::with_ocr_engine(Arc::new(your_engine))` to override default backends.
- **Ensure** thread safety using `Send + Sync` bounds and thread-safe containers for stateful clients.
- **Format** bounding boxes as `[x1, y1, x2, y2]` pixel coordinates and respect the `language` field in `OcrOptions`.

## Frequently Asked Questions

### What is the exact method signature required for the recognize method?

The `recognize` method must match the signature in [`crates/liteparse/src/ocr/mod.rs`](https://github.com/run-llama/liteparse/blob/main/crates/liteparse/src/ocr/mod.rs), including the lifetime bounds `'b: 'a` and `'c: 'a` to ensure the returned future does not outlive the borrowed `image_data` and `options` references. It returns `Pin<Box<dyn Future<Output = Result<Vec<OcrResult>, Box<dyn std::error::Error + Send + Sync>>> + Send + '_>>` on native platforms, with the `Send` bound on the future dropped for WebAssembly builds.

### How does LiteParse select between the default engine and my custom implementation?

LiteParse checks the `ocr_engine_override` field (defined in [`crates/liteparse/src/parser.rs`](https://github.com/run-llama/liteparse/blob/main/crates/liteparse/src/parser.rs) at lines 43-57) inside `parse_input`. If `Some(Arc<dyn OcrEngine>)` is present, it uses that engine exclusively. If `None`, it falls back to built-in logic selecting between HTTP OCR servers and the Tesseract engine based on configuration flags.

### Can I use async HTTP clients like reqwest in my custom OCR engine?

Yes. Store a `reqwest::Client` as a field of your engine struct; the client is `Send + Sync`, so the struct satisfies the trait's bounds. Since `recognize` returns a pinned future, you can `.await` async HTTP calls inside the async block while satisfying the trait's thread-safety requirements for native targets.

### What image format does the recognize method receive?

The `image_data` parameter contains raw bytes of the rendered page image, typically in PNG format. The `width` and `height` parameters correspond to the pixel dimensions of this image data, allowing you to pass the exact specifications to your OCR backend or perform pre-processing if necessary.
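If the bytes are PNG-encoded, an engine can cross-check the declared dimensions against the image header before sending anything to a backend. The sketch below relies only on the PNG file layout (an 8-byte signature followed by the `IHDR` chunk, whose first two fields are big-endian width and height); `png_dimensions` and `matches_declared_size` are hypothetical helpers, and whether LiteParse always supplies PNG data is not confirmed here.

```rust
/// The eight-byte signature that starts every PNG file.
const PNG_SIGNATURE: [u8; 8] = [0x89, b'P', b'N', b'G', 0x0D, 0x0A, 0x1A, 0x0A];

/// Reads (width, height) from the IHDR chunk, or returns `None` when the
/// bytes do not look like a PNG.
pub fn png_dimensions(image_data: &[u8]) -> Option<(u32, u32)> {
    // Layout: signature (8) + chunk length (4) + chunk type (4) + width (4) + height (4).
    if image_data.len() < 24 || image_data[..8] != PNG_SIGNATURE {
        return None;
    }
    if image_data[12..16] != *b"IHDR" {
        return None;
    }
    let be = |i: usize| {
        u32::from_be_bytes([
            image_data[i],
            image_data[i + 1],
            image_data[i + 2],
            image_data[i + 3],
        ])
    };
    Some((be(16), be(20)))
}

/// Returns `true` when the header dimensions equal the values passed to `recognize`.
pub fn matches_declared_size(image_data: &[u8], width: u32, height: u32) -> bool {
    png_dimensions(image_data) == Some((width, height))
}
```

A mismatch usually indicates that the data is in another format or was resized upstream, which is worth surfacing as an error instead of producing boxes in the wrong coordinate space.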