How to Parse Password-Protected PDFs with LiteParse: A Complete Guide
Supply the password via the password field in LiteParseConfig when initializing the parser, and LiteParse automatically decrypts the PDF using PDFium before text extraction.
Parsing encrypted PDFs in the run-llama/liteparse repository requires no special handling beyond providing the decryption password during initialization. Whether you use the Rust CLI, Node.js SDK, Python bindings, or WebAssembly module, the password configuration field triggers an internal flow that hands the credential to PDFium, which unlocks the document.
Configuration-Based Password Supply
The LiteParseConfig Structure
At the core of password handling is the configuration struct defined in crates/liteparse/src/config.rs (lines 24-26):
pub struct LiteParseConfig {
    pub password: Option<String>,
    // ... other fields
}
This Option<String> field acts as the single source of truth for decryption credentials across every language binding and interface.
CLI and Binding Interfaces
All front-ends expose this configuration consistently. The Rust binary implements a --password flag in crates/liteparse/src/main.rs (lines 81-84). High-level bindings map this to idiomatic constructors in their respective entry points:
- Node.js: The password field in the options object (packages/node/src/lib.ts, lines 27-33)
- Python: A password constructor argument (packages/python/liteparse/parser.py, lines 76-94)
- WebAssembly: Mirrored config struct with password support (crates/liteparse-wasm/src/lib.rs, lines 49-53)
Language-Specific Implementation Examples
Rust CLI
Invoke the binary with the --password flag:
liteparse parse my-protected.pdf --password secret123
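When the CLI is driven from a script, the same invocation can be assembled programmatically. The sketch below is only an illustration and not part of LiteParse: build_command and run_liteparse are hypothetical helpers, and the code assumes the binary is installed on PATH under the name liteparse. Only the parse subcommand and the --password flag come from the example above.

```python
import subprocess
from typing import List, Optional


def build_command(pdf_path: str, password: Optional[str] = None) -> List[str]:
    """Assemble the argv list for `liteparse parse` (hypothetical helper)."""
    command = ["liteparse", "parse", pdf_path]
    if password is not None:
        # Passed as separate argv entries, so no shell quoting is involved.
        command += ["--password", password]
    return command


def run_liteparse(pdf_path: str, password: Optional[str] = None) -> str:
    """Run the CLI and return whatever it wrote to stdout.

    Assumes a `liteparse` executable is available on PATH.
    """
    completed = subprocess.run(
        build_command(pdf_path, password),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return completed.stdout
```

Command-line arguments are typically visible to other processes on the same machine, so a flag-based password is best suited to local or single-user environments.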
Node.js
Pass the password in the configuration object during instantiation:
import { LiteParse } from "liteparse";
const parser = new LiteParse({
  password: "secret123", // Unlocks encrypted PDFs
  ocrEnabled: false,
});
const result = await parser.parse("my-protected.pdf");
console.log(result.text);
Python
Supply the password as a constructor argument:
from liteparse import LiteParse
parser = LiteParse(password="secret123")
result = parser.parse("my-protected.pdf")
print(result.text)
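Hard-coding the password is fine for a demo, but in practice it usually comes from outside the source file. The following is a minimal sketch of one way to do that, assuming the password is supplied through an environment variable. The variable name LITEPARSE_PDF_PASSWORD and the helper password_from_env are made up for this example and are not defined by LiteParse.

```python
import os
from typing import Mapping, Optional

# Hypothetical variable name chosen for this example; LiteParse does not define it.
PASSWORD_ENV_VAR = "LITEPARSE_PDF_PASSWORD"


def password_from_env(
    env: Optional[Mapping[str, str]] = None,
    var_name: str = PASSWORD_ENV_VAR,
) -> Optional[str]:
    """Return the password from the environment, or None if unset or empty."""
    source = os.environ if env is None else env
    value = source.get(var_name)
    # Treat an empty string like "no password" so unprotected files still work.
    return value if value else None


# Usage with the constructor shown above (requires the liteparse package):
#   from liteparse import LiteParse
#   parser = LiteParse(password=password_from_env())
#   result = parser.parse("my-protected.pdf")
```

Returning None when the variable is absent mirrors the optional nature of the password field, so the same code path can serve both protected and unprotected files.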
WebAssembly (Browser)
The WASM binding accepts the password identically via the configuration object:
import { LiteParse } from "liteparse-wasm";
const parser = new LiteParse({ password: "secret123" });
const data = await fetch("protected.pdf").then(r => r.arrayBuffer());
const result = await parser.parse(data);
console.log(result.text);
Internal Password Propagation Architecture
Knowing how the credential flows from your code to the PDF engine helps when debugging decryption failures and when reasoning about where the password is held during a parse.
Parser Initialization and Config Reading
When LiteParse::parse_input executes, it reads self.config.password.as_deref() and passes the value to the extraction stage. This propagation occurs in crates/liteparse/src/parser.rs at lines 84-87 and 101-103, ensuring the password reaches every layer that touches the document.
PDFium Document Loading
The low-level extraction logic in crates/liteparse/src/extract.rs (lines 5-14) receives the password and forwards it to pdfium::Library via load_document or load_document_from_bytes. PDFium handles decryption internally, returning decrypted page content to LiteParse's standard text extraction or OCR pipeline.
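The hand-off described in this section can be modeled in a few lines. This is a simplified Python model of the flow, not the actual Rust implementation: Config, FakePdfLibrary, extract, and the function signatures are stand-ins invented for illustration, and only the names parse_input, load_document, and load_document_from_bytes are taken from the text above.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Config:
    """Stand-in for LiteParseConfig; only the password field is modeled."""
    password: Optional[str] = None


class FakePdfLibrary:
    """Hypothetical stand-in for the PDFium wrapper; records what it receives."""

    def __init__(self) -> None:
        self.calls = []  # list of (method name, password) tuples

    def load_document(self, path: str, password: Optional[str]) -> str:
        self.calls.append(("load_document", password))
        return f"document:{path}"

    def load_document_from_bytes(self, data: bytes, password: Optional[str]) -> str:
        self.calls.append(("load_document_from_bytes", password))
        return f"document:{len(data)} bytes"


def extract(library: FakePdfLibrary, source: Union[str, bytes],
            password: Optional[str]) -> str:
    """Extraction stage: choose the loader by input kind, forward the password."""
    if isinstance(source, bytes):
        return library.load_document_from_bytes(source, password)
    return library.load_document(source, password)


def parse_input(config: Config, library: FakePdfLibrary,
                source: Union[str, bytes]) -> str:
    """Parser stage: read the optional password from config and pass it down."""
    return extract(library, source, config.password)
```

In this model the password is never transformed along the way: whatever the config holds, including None, is exactly what the loader receives.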
Non-PDF Conversion Handling
LiteParse supports password-protected Office documents by forwarding credentials during format conversion. In crates/liteparse/src/conversion.rs (lines 96-110 and 188-190), conversion helpers pass the password to LibreOffice before rendering to PDF, ensuring the source document unlocks prior to processing.
Summary
- LiteParseConfig stores the password as Option<String> in crates/liteparse/src/config.rs
- All bindings expose the password field: Rust CLI uses --password, while Node.js, Python, and WASM use constructor configuration objects
- Internal flow: The parser passes the password to PDFium's load_document methods in crates/liteparse/src/extract.rs
- Office documents: The conversion layer in crates/liteparse/src/conversion.rs forwards passwords to LibreOffice for pre-conversion unlocking
Frequently Asked Questions
How does the password reach the PDFium engine?
The password travels from LiteParseConfig through parse_input in crates/liteparse/src/parser.rs (lines 84-87), then to the extraction layer at crates/liteparse/src/extract.rs (lines 5-14), where it is passed as a parameter to PDFium's load_document or load_document_from_bytes methods. PDFium handles the actual decryption internally using the PDF standard security handler.
Can LiteParse handle password-protected Word or Excel files?
Yes. When processing non-PDF inputs, LiteParse forwards the password to conversion helpers in crates/liteparse/src/conversion.rs (lines 96-110 and 188-190). This allows LibreOffice to unlock Office documents before converting them to PDF for subsequent text extraction.
Is the password stored after the document is parsed?
No. The password exists only as a transient field in the LiteParseConfig struct during the active parsing session. It is passed to PDFium for immediate decryption and is not persisted in output structures, cached, or logged during the extraction process.
What happens if I provide an incorrect password?
PDFium will fail to decrypt the document during the load_document call in crates/liteparse/src/extract.rs, causing the parser to return an authentication error before any text extraction begins. LiteParse does not implement custom decryption logic that could bypass PDFium's security checks.
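Because a wrong password surfaces as a failed parse, callers who hold several possible passwords can try them in turn. The sketch below is a generic pattern under stated assumptions: parse_with_candidates is a hypothetical helper, and it assumes a failed parse raises a Python exception. The exact exception type is not specified here, so the sketch catches Exception broadly; real code should narrow it to whatever the binding actually raises.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


def parse_with_candidates(
    parse_with: Callable[[Optional[str]], T],
    candidates: Iterable[Optional[str]],
) -> T:
    """Call parse_with(password) for each candidate; return the first success.

    Raises ValueError, chained to the last failure, if every candidate fails.
    """
    last_error: Optional[BaseException] = None
    for password in candidates:
        try:
            return parse_with(password)
        except Exception as exc:  # assumption: failures raise; type unknown
            last_error = exc
    raise ValueError("no candidate password opened the document") from last_error


# Usage with the Python binding shown earlier (requires the liteparse package):
#   from liteparse import LiteParse
#   result = parse_with_candidates(
#       lambda pw: LiteParse(password=pw).parse("my-protected.pdf"),
#       ["secret123", "fallback-password"],
#   )
```

A new parser is constructed per attempt in the usage example because the password is supplied at initialization rather than per call.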