How to Use the document_exporter Command in Paperless-ngx: Complete Backup and Migration Guide

The document_exporter command creates a complete, portable backup of your Paperless-ngx installation by serializing database models, copying original documents and thumbnails, and generating a JSON manifest that can be restored later.

The document_exporter management command in Paperless-ngx is a stream-oriented backup tool designed for full or incremental exports of your document archive. Located in src/documents/management/commands/document_exporter.py, it extracts data from the Django ORM using batched queries to minimize memory usage, handles file operations with optional checksum verification, and supports inline encryption of sensitive fields. Whether you are migrating to a new server or implementing automated backups, understanding how to use this command ensures your documents remain portable and secure.

Architecture and Export Workflow

The exporter follows a deliberate stream-oriented architecture to handle large archives without exhausting system memory. According to the source code in document_exporter.py, the process executes seven distinct phases:

  1. Argument Parsing – The Command.add_arguments method (lines 77-124) registers flags including -c, -d, -f, -z, and --passphrase that control export behavior.
  2. Target Preparation – The handle method (lines 31-45) creates a temporary directory when --zip is specified or validates the user-supplied target folder.
  3. Snapshot Creation – The exporter indexes existing files in the target directory to support incremental updates and deletion of stale files.
  4. Manifest Building – Database models are serialized using serialize_queryset_batched (lines 70-84) to process records in chunks, with sensitive fields encrypted via _encrypt_record_inline (lines 777-887) when a passphrase is provided.
  5. File Export – For each document, generate_base_name (lines 549-666) and generate_document_targets (lines 770-801) determine output paths, while check_and_copy (lines 655-889) performs optimized file copying.
  6. Split Manifest – When --split-manifest is used, _write_split_manifest (lines 889-915) creates individual JSON files for document notes and custom fields.
  7. Cleanup – check_and_write_json (lines 216-255) writes the metadata.json version file, and optional deletion logic removes untracked files when -d is specified (lines 531-549).

The StreamingManifestWriter writes the manifest incrementally to a temporary .tmp file, compares its hash against existing manifests when --compare-json is used, and only replaces the target file when changes occur. This prevents unnecessary timestamp updates that would break incremental backup systems.
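The write-to-temp, compare, then replace pattern described above can be illustrated with a short standalone sketch. This is not the StreamingManifestWriter code itself: the function names write_manifest_if_changed and file_digest are hypothetical, and SHA-256 is an assumed choice of hash.

```python
import hashlib
import json
import os
from pathlib import Path
from typing import Iterable


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest_if_changed(records: Iterable[dict], target: Path) -> bool:
    """Stream records into target as a JSON array.

    Returns True when target was created or replaced, and False when the
    existing file already had identical content and was left untouched.
    """
    tmp = target.with_name(target.name + ".tmp")
    with tmp.open("w", encoding="utf-8") as handle:
        handle.write("[")
        for index, record in enumerate(records):
            if index:
                handle.write(",\n")
            # Records are written one at a time, so the full list is
            # never held in memory.
            json.dump(record, handle, sort_keys=True)
        handle.write("]")

    if target.exists() and file_digest(tmp) == file_digest(target):
        tmp.unlink()  # unchanged: keep the old file and its timestamp
        return False

    os.replace(tmp, target)  # atomic when both paths share a filesystem
    return True
```

Because the old file is left alone when nothing changed, tools that decide what to back up from modification times see no difference between runs.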

Command Line Arguments and Options

The document_exporter command accepts numerous flags that control what data gets exported and how files are handled. Key parameters include:

  • -c or --compare-checksums – Computes checksums for source files and only copies when the checksum differs from the existing target.
  • -d or --delete – Removes files in the target directory that were not produced by the current export, cleaning up documents deleted from the database.
  • -f or --use-filename-format – Applies the PAPERLESS_FILENAME_FORMAT setting to organize exported files.
  • -na or --no-archive – Excludes archive PDFs from the export to reduce size.
  • -nt or --no-thumbnail – Excludes thumbnail images from the export.
  • -z or --zip – Creates a ZIP archive instead of a folder structure.
  • --passphrase – Encrypts sensitive fields (mail passwords, OAuth tokens) defined in CryptMixin.CRYPT_FIELDS_BY_MODEL.
  • --split-manifest – Generates separate JSON manifests for each document's notes and custom fields.

Practical Usage Examples

Basic Export to a Local Folder

Run a complete export to a mounted directory accessible from the host:

docker compose exec -T webserver document_exporter ../export

The ../export path resolves to a directory mounted from the host, making the backup available outside the container.

Incremental Backup with Checksum Verification

Avoid copying unchanged files by enabling checksum comparison:

docker compose exec -T webserver document_exporter ../export -c

With -c, the check_and_copy method computes file checksums and skips files whose content already matches the copy in the target directory, which avoids redundant I/O on repeated exports.
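The idea behind checksum-based copying can be shown in a few lines. The sketch below is an illustration only: copy_if_changed and sha256_of are hypothetical names, SHA-256 is an assumed hash, and the real check_and_copy may compare against checksums stored in the database rather than rehashing both files.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large documents are not loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_if_changed(source: Path, target: Path) -> bool:
    """Copy source to target unless target already holds the same bytes.

    Returns True when a copy was made and False when it was skipped.
    """
    if target.exists() and sha256_of(source) == sha256_of(target):
        return False
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, target)  # copy2 also carries over the timestamps
    return True
```

Hashing costs a full read of each file, so this trades CPU and read I/O for fewer writes. That trade tends to pay off when the target sits on slow or remote storage.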

Export to ZIP Archive

Create a compressed archive with a custom name:

docker compose exec -T webserver document_exporter ../export -z -zn backup-$(date +%Y-%m-%d)

The -z flag triggers ZIP creation, while -zn specifies the archive filename.
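For readers who want to post-process a folder export themselves, the same date-stamped archive can be produced with the Python standard library. This is a sketch under assumptions: zip_export is a hypothetical helper, not part of Paperless-ngx, and it expects the output directory to live outside the export folder.

```python
import shutil
from datetime import date
from pathlib import Path
from typing import Optional


def zip_export(export_dir: Path, output_dir: Path, name: Optional[str] = None) -> Path:
    """Pack export_dir into output_dir/<name>.zip and return the archive path.

    When no name is given, a date-stamped one such as backup-2024-01-31
    is used. output_dir should not be inside export_dir, otherwise the
    archive would try to include itself.
    """
    name = name or "backup-{:%Y-%m-%d}".format(date.today())
    # make_archive appends the .zip suffix and returns the final path.
    archive = shutil.make_archive(str(output_dir / name), "zip", root_dir=str(export_dir))
    return Path(archive)
```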

Exclude Thumbnails and Archives

Generate a minimal backup containing only original documents and metadata:

docker compose exec -T webserver document_exporter ../export -nt -na

The -nt flag skips thumbnails and -na skips archive PDFs, significantly reducing export size.

Apply Filename Format Settings

Organize exported files using your configured naming scheme:

docker compose exec -T webserver document_exporter ../export -f

When -f is specified, the exporter calls generate_base_name to respect the PAPERLESS_FILENAME_FORMAT configuration variable.

Encrypted Export with Passphrase

Protect sensitive configuration data during export:

docker compose exec -T webserver document_exporter ../export --passphrase "SecureBackupPassphrase2024"

This encrypts the fields listed in CryptMixin.CRYPT_FIELDS_BY_MODEL within the JSON manifest. Store the passphrase securely, because the document importer needs the same passphrase to restore the export.

Delete Stale Files

Synchronize the export directory with the current database state:

docker compose exec -T webserver document_exporter ../export -d

The -d flag traverses the target directory after export and removes any files not referenced by the current manifest, effectively mirroring deletions made in the Paperless interface.
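The snapshot-and-prune behavior can be sketched as follows. This is a simplified illustration, not the exporter's own deletion logic: remove_stale_files is a hypothetical name, and the caller is assumed to pass the list of paths written during the current run.

```python
from pathlib import Path
from typing import Iterable, List


def remove_stale_files(target_dir: Path, exported: Iterable[Path]) -> List[Path]:
    """Delete files under target_dir that the current export did not produce.

    Returns the list of removed files. Directories left empty by the
    deletions are removed as well.
    """
    keep = {path.resolve() for path in exported}
    removed = []
    for path in sorted(target_dir.rglob("*")):
        if path.is_file() and path.resolve() not in keep:
            path.unlink()
            removed.append(path)

    # Reverse order visits children before their parents, so nested
    # empty directories are cleaned up from the bottom.
    directories = sorted((p for p in target_dir.rglob("*") if p.is_dir()), reverse=True)
    for directory in directories:
        if not any(directory.iterdir()):
            directory.rmdir()
    return removed
```

Because this deletes anything it does not recognize, the target directory should be dedicated to the export and hold no unrelated files.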

Implementation Details and Source Files

Several modules contribute to the exporter's functionality. Within the command itself, batched serialization via serialize_queryset_batched keeps the memory footprint low when handling archives containing millions of documents.
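Batched serialization can be illustrated without Django. In the sketch below, serialize_in_batches and fetch_page are hypothetical names: fetch_page(offset, limit) stands in for a sliced database query, and the yielded dictionaries follow the model, pk, fields layout of Django's JSON serialization format. The real serialize_queryset_batched may be structured differently.

```python
from typing import Callable, Dict, Iterator, Sequence


def serialize_in_batches(
    fetch_page: Callable[[int, int], Sequence[Dict]],
    batch_size: int = 500,
) -> Iterator[Dict]:
    """Yield serialized records while holding only one page in memory.

    fetch_page(offset, limit) must return up to limit rows starting at
    offset, and an empty sequence once the data is exhausted.
    """
    offset = 0
    while True:
        page = fetch_page(offset, batch_size)
        if not page:
            return
        for row in page:
            yield {"model": row["model"], "pk": row["pk"], "fields": row["fields"]}
        offset += len(page)
```

Since the function is a generator, a consumer such as a streaming JSON writer can pull records one at a time, and peak memory stays proportional to batch_size rather than to the size of the archive.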

Summary

  • The document_exporter command creates portable, complete backups by serializing database records and copying files from PAPERLESS_MEDIA_ROOT.
  • StreamingManifestWriter enables efficient incremental exports by writing to temporary files and comparing hashes before replacement.
  • Use -c for checksum-based incremental backups and -d to remove stale files from the target directory.
  • Encryption via --passphrase protects sensitive credentials stored in the database but requires the same passphrase for restoration.
  • The command is exposed as a standalone executable in Docker containers at /usr/local/bin/document_exporter, wrapping the Django management command.

Frequently Asked Questions

What file formats does the document_exporter create?

The exporter generates JSON manifest files describing database models, copies original documents in their native formats (PDF, images, etc.), and includes thumbnail images and archive PDFs unless excluded with -nt or -na. When --zip is used, these files are packaged into a single ZIP archive; otherwise, they maintain a folder structure mirroring the database organization.
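A quick way to sanity-check an export is to count the records in the manifest by model. The sketch below assumes the manifest is a single JSON array of objects that each carry a "model" key, as Django's serializer produces. The helper name summarize_manifest and the file name manifest.json in the usage note are assumptions, not taken from the exporter source.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_manifest(manifest_path: Path) -> Counter:
    """Count records per model label in a Django-style JSON manifest.

    The whole file is loaded at once, which is fine for a spot check
    but not for very large manifests.
    """
    with manifest_path.open(encoding="utf-8") as handle:
        records = json.load(handle)
    return Counter(record["model"] for record in records)
```

For example, summarize_manifest(Path("export/manifest.json")).most_common() would list the models in the export from most to fewest records, which makes an unexpectedly empty export easy to spot.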

Can I use document_exporter for incremental backups?

Yes. Use the -c (--compare-checksums) flag to copy only files that have changed based on checksum comparison, and --compare-json to skip rewriting manifest files that haven't changed. For automated incremental backups, combine these with -d (--delete) to remove files belonging to documents deleted from the database, ensuring the export mirrors the current state exactly.
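For scheduled runs, the flag combination above can be wrapped in a small script. This is a sketch under assumptions: build_export_command and run_export are hypothetical helpers, and the service name webserver and the path ../export are carried over from the examples earlier in this guide and may differ in your setup.

```python
import subprocess
from typing import List


def build_export_command(
    target: str = "../export",
    compare_checksums: bool = True,
    compare_json: bool = True,
    delete: bool = True,
) -> List[str]:
    """Assemble the docker compose call for an incremental export."""
    command = ["docker", "compose", "exec", "-T", "webserver", "document_exporter", target]
    if compare_checksums:
        command.append("--compare-checksums")
    if compare_json:
        command.append("--compare-json")
    if delete:
        command.append("--delete")
    return command


def run_export(command: List[str]) -> None:
    """Run the export and raise CalledProcessError on a non-zero exit code."""
    subprocess.run(command, check=True)
```

Calling run_export(build_export_command()) from a cron job or systemd timer makes a failed export surface as an exception instead of passing silently.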

How do I restore from a document_exporter backup?

Restore using the complementary document_importer command, which reads the JSON manifest files created by the exporter. If you used encryption (--passphrase), you must provide the identical passphrase during import. The importer recreates database records and copies files back into the Paperless-ngx media directory while preserving metadata, tags, and custom fields.

Why are some files missing from my export?

Files may be excluded intentionally via -nt (no thumbnails) or -na (no archives). With -c, the exporter does not copy a file again when an identical copy already exists in the target, so an unchanged timestamp does not mean the file is missing. With --data-only, no document files are exported at all; only the database JSON manifests are written. Check your command flags and verify the export directory permissions if files appear missing unexpectedly.
