How to Use the document_exporter Command in Paperless-ngx: Complete Backup and Migration Guide
The document_exporter command creates a complete, portable backup of your Paperless-ngx installation by serializing database models, copying original documents and thumbnails, and generating a JSON manifest that can be restored later.
The document_exporter management command in Paperless-ngx is a stream-oriented backup tool designed for full or incremental exports of your document archive. Located in src/documents/management/commands/document_exporter.py, it extracts data from the Django ORM using batched queries to minimize memory usage, handles file operations with optional checksum verification, and supports inline encryption of sensitive fields. Whether you are migrating to a new server or implementing automated backups, understanding how to use this command ensures your documents remain portable and secure.
Architecture and Export Workflow
The exporter follows a deliberate stream-oriented architecture to handle large archives without exhausting system memory. According to the source code in document_exporter.py, the process executes seven distinct phases:
- Argument Parsing – The Command.add_arguments method (lines 77-124) registers flags including -c, -d, -f, -z, and --passphrase that control export behavior.
- Target Preparation – The handle method (lines 31-45) creates a temporary directory when --zip is specified or validates the user-supplied target folder.
- Snapshot Creation – The exporter indexes existing files in the target directory to support incremental updates and deletion of stale files.
- Manifest Building – Database models are serialized using serialize_queryset_batched (lines 70-84) to process records in chunks, with sensitive fields encrypted via _encrypt_record_inline (lines 777-887) when a passphrase is provided.
- File Export – For each document, generate_base_name (lines 549-666) and generate_document_targets (lines 770-801) determine output paths, while check_and_copy (lines 655-889) performs optimized file copying.
- Split Manifest – When --split-manifest is used, _write_split_manifest (lines 889-915) creates individual JSON files for document notes and custom fields.
- Cleanup – check_and_write_json (lines 216-255) writes the metadata.json version file, and optional deletion logic removes untracked files when -d is specified (lines 531-549).
The StreamingManifestWriter writes the manifest incrementally to a temporary .tmp file, compares its hash against existing manifests when --compare-json is used, and only replaces the target file when changes occur. This prevents unnecessary timestamp updates that would break incremental backup systems.
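The write-to-temp, compare, then replace pattern described above can be sketched in a few lines of standalone Python. This is a minimal illustration and not the StreamingManifestWriter implementation: the function names, the choice of BLAKE2b as the hash, and the one-record-per-line layout are all assumptions made for the example.

```python
import hashlib
import json
import os
from pathlib import Path


def _iter_json_array(records):
    """Yield the text of a JSON array piece by piece (hypothetical helper)."""
    yield "[\n"
    first = True
    for record in records:
        if not first:
            yield ",\n"
        yield json.dumps(record, ensure_ascii=False, sort_keys=True)
        first = False
    yield "\n]\n"


def write_json_if_changed(records, target: Path) -> bool:
    """Stream records into a .tmp file; replace target only if content differs.

    Returns True when the target was written, False when it was left untouched.
    """
    tmp_path = target.with_suffix(target.suffix + ".tmp")
    digest = hashlib.blake2b()  # assumption: any stable hash works for the comparison
    # newline="\n" keeps the bytes on disk identical to the bytes being hashed.
    with tmp_path.open("w", encoding="utf-8", newline="\n") as handle:
        for chunk in _iter_json_array(records):
            handle.write(chunk)
            digest.update(chunk.encode("utf-8"))

    if target.exists():
        existing = hashlib.blake2b(target.read_bytes()).hexdigest()
        if existing == digest.hexdigest():
            tmp_path.unlink()  # identical content: keep the old file and its mtime
            return False
    os.replace(tmp_path, target)  # atomic rename on the same filesystem
    return True
```

Because the unchanged target is never rewritten, its modification time stays stable, which is the property that timestamp-based backup tools rely on.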
Command Line Arguments and Options
The document_exporter command accepts numerous flags that control what data gets exported and how files are handled. Key parameters include:
- -c or --compare-checksums – Computes checksums for source files and only copies when the checksum differs from the existing target.
- -d or --delete – Removes files in the target directory that were not produced by the current export, cleaning up documents deleted from the database.
- -f or --use-filename-format – Applies the PAPERLESS_FILENAME_FORMAT setting to organize exported files.
- -na or --no-archive – Excludes archive PDFs from the export to reduce size.
- -nt or --no-thumbnail – Excludes thumbnail images from the export.
- -z or --zip – Creates a ZIP archive instead of a folder structure.
- --passphrase – Encrypts sensitive fields (mail passwords, OAuth tokens) defined in CryptMixin.CRYPT_FIELDS_BY_MODEL.
- --split-manifest – Generates separate JSON manifests for each document's notes and custom fields.
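To make the flag list concrete, here is a rough sketch of how these options could be registered with Python's standard argparse module, which Django management commands build on. It is an approximation for illustration only: the help text, defaults, and the build_parser name are assumptions, and the real add_arguments method may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical stand-in for the flags listed above."""
    parser = argparse.ArgumentParser(prog="document_exporter")
    parser.add_argument("target", help="Directory to write the export into")
    parser.add_argument("-c", "--compare-checksums", action="store_true")
    parser.add_argument("-d", "--delete", action="store_true")
    parser.add_argument("-f", "--use-filename-format", action="store_true")
    parser.add_argument("-na", "--no-archive", action="store_true")
    parser.add_argument("-nt", "--no-thumbnail", action="store_true")
    parser.add_argument("-z", "--zip", action="store_true")
    parser.add_argument("--passphrase", default=None)
    parser.add_argument("--split-manifest", action="store_true")
    return parser


# Flags become attributes with dashes converted to underscores.
options = build_parser().parse_args(["../export", "-c", "-nt", "-na"])
print(options.target, options.compare_checksums, options.no_thumbnail, options.delete)
```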
Practical Usage Examples
Basic Export to a Local Folder
Run a complete export to a mounted directory accessible from the host:
docker compose exec -T webserver document_exporter ../export
The ../export path resolves to a directory mounted from the host, making the backup available outside the container.
Incremental Backup with Checksum Verification
Avoid copying unchanged files by enabling checksum comparison:
docker compose exec -T webserver document_exporter ../export -c
The -c flag forces the check_and_copy method to compute file checksums and skip identical files, optimizing network and storage usage.
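The idea behind checksum-based skipping can be shown with a small standalone sketch. The function names and the use of SHA-256 are assumptions for this example; the actual check_and_copy method may use a different algorithm or rely on checksums already stored in the database.

```python
import hashlib
import shutil
from pathlib import Path


def file_checksum(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in fixed-size blocks so large documents are never fully in memory."""
    digest = hashlib.sha256()  # assumption: algorithm chosen for illustration
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def copy_if_changed(source: Path, target: Path) -> bool:
    """Copy source to target unless an identical file is already there.

    Returns True when a copy happened, False when the copy was skipped.
    """
    if target.exists() and file_checksum(source) == file_checksum(target):
        return False
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, target)  # copy2 also preserves timestamps
    return True
```

Hashing costs a full read of both files, so the saving comes from avoiding writes, which matters most when the target is a network share or a snapshotting filesystem.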
Export to ZIP Archive
Create a compressed archive with a custom name:
docker compose exec -T webserver document_exporter ../export -z -zn backup-$(date +%Y-%m-%d)
The -z flag triggers ZIP creation, while -zn specifies the archive filename.
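Conceptually, a ZIP export stages the files in a temporary directory and then packs that directory into a single archive. The sketch below illustrates that flow with the standard library; export_as_zip and its parameters are hypothetical and are not taken from the exporter's source.

```python
import shutil
import tempfile
from pathlib import Path


def export_as_zip(files: dict[str, bytes], destination: Path, zip_name: str) -> Path:
    """Write files into a staging directory, then pack it as <zip_name>.zip."""
    destination.mkdir(parents=True, exist_ok=True)
    with tempfile.TemporaryDirectory() as staging:
        staging_dir = Path(staging)
        for relative_name, payload in files.items():
            out = staging_dir / relative_name
            out.parent.mkdir(parents=True, exist_ok=True)
            out.write_bytes(payload)
        # make_archive appends the ".zip" extension to the base name itself.
        archive = shutil.make_archive(
            str(destination / zip_name), "zip", root_dir=str(staging_dir)
        )
    return Path(archive)
```

Staging first means a failed export never leaves a half-written archive in the destination folder.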
Exclude Thumbnails and Archives
Generate a minimal backup containing only original documents and metadata:
docker compose exec -T webserver document_exporter ../export -nt -na
The -nt flag skips thumbnails and -na skips archive PDFs, significantly reducing export size.
Apply Filename Format Settings
Organize exported files using your configured naming scheme:
docker compose exec -T webserver document_exporter ../export -f
When -f is specified, the exporter calls generate_base_name to respect the PAPERLESS_FILENAME_FORMAT configuration variable.
Encrypted Export with Passphrase
Protect sensitive configuration data during export:
docker compose exec -T webserver document_exporter ../export --passphrase "SecureBackupPassphrase2024"
This encrypts fields identified in CryptMixin.CRYPT_FIELDS_BY_MODEL within the JSON manifest. Store this passphrase securely—it is required for successful import via the document importer.
Delete Stale Files
Synchronize the export directory with the current database state:
docker compose exec -T webserver document_exporter ../export -d
The -d flag traverses the target directory after export and removes any files not referenced by the current manifest, effectively mirroring deletions made in the Paperless interface.
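A simplified sketch of this mirroring step: compare the files present in the target directory with the set written by the current export, delete the difference, and prune directories left empty. The remove_stale_files name and signature are assumptions for illustration, and the real exporter tracks its file set differently.

```python
from pathlib import Path


def remove_stale_files(target: Path, exported: set[Path]) -> list[Path]:
    """Delete files under target that are not in exported; return what was removed."""
    existing = {p.resolve() for p in target.rglob("*") if p.is_file()}
    keep = {p.resolve() for p in exported}
    removed = sorted(existing - keep)
    for path in removed:
        path.unlink()

    # Prune empty directories, deepest first, so parents empty out in turn.
    directories = sorted(
        (d for d in target.rglob("*") if d.is_dir()),
        key=lambda d: len(d.parts),
        reverse=True,
    )
    for directory in directories:
        if not any(directory.iterdir()):
            directory.rmdir()
    return removed
```

Because this deletes anything the export did not produce, the target directory should be dedicated to exports and not shared with other data.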
Implementation Details and Source Files
Several modules contribute to the exporter's functionality:
- src/documents/management/commands/document_exporter.py – Contains the main Command class implementing argument parsing, manifest generation, and file operations.
- src/documents/management/commands/mixins.py – Provides CryptMixin with _encrypt_record_inline and decryption helpers used for passphrase protection.
- src/documents/settings.py – Defines export constants including EXPORTER_FILE_NAME, EXPORTER_THUMBNAIL_NAME, and EXPORTER_ARCHIVE_NAME.
- src/documents/file_handling.py – Implements the delete_empty_directories utility called by the exporter.
- src/documents/utils.py – Implements the copy_file_with_basic_stats utility called by the exporter.
- docker/rootfs/usr/local/bin/document_exporter – Shell wrapper script that invokes python3 manage.py document_exporter in Docker deployments.
The exporter deliberately uses batched serialization via serialize_queryset_batched to maintain low memory footprints when handling archives containing millions of documents.
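The batching idea can be illustrated without Django. The sketch below pulls rows from any iterable in fixed-size chunks and emits records shaped like Django's serialization format (model, pk, fields). The function names and the default batch size are assumptions, and this is not the code of serialize_queryset_batched.

```python
from itertools import islice
from typing import Iterable, Iterator


def iter_batches(rows: Iterable[dict], batch_size: int) -> Iterator[list[dict]]:
    """Yield lists of at most batch_size rows, consuming the source lazily."""
    iterator = iter(rows)
    while batch := list(islice(iterator, batch_size)):
        yield batch


def serialize_batched(
    rows: Iterable[dict], model_label: str, batch_size: int = 500
) -> Iterator[dict]:
    """Yield one serialized record at a time; only one batch is held in memory."""
    for batch in iter_batches(rows, batch_size):
        for row in batch:
            fields = {key: value for key, value in row.items() if key != "pk"}
            yield {"model": model_label, "pk": row["pk"], "fields": fields}
```

Since both functions are generators, peak memory is bounded by the batch size and not by the total number of documents.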
Summary
- The document_exporter command creates portable, complete backups by serializing database records and copying files from PAPERLESS_MEDIA_ROOT.
- StreamingManifestWriter enables efficient incremental exports by writing to temporary files and comparing hashes before replacement.
- Use -c for checksum-based incremental backups and -d to remove stale files from the target directory.
- Encryption via --passphrase protects sensitive credentials stored in the database but requires the same passphrase for restoration.
- The command is exposed as a standalone executable in Docker containers at /usr/local/bin/document_exporter, wrapping the Django management command.
Frequently Asked Questions
What file formats does the document_exporter create?
The exporter generates JSON manifest files describing database models, copies original documents in their native formats (PDF, images, etc.), and includes thumbnail images and archive PDFs unless excluded with -nt or -na. When --zip is used, these files are packaged into a single ZIP archive; otherwise, they maintain a folder structure mirroring the database organization.
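One quick way to see what a manifest contains is to tally its records by model. The sketch below assumes the manifest is a single JSON array in Django's serialization format, where each record carries a "model" key. The count_models name is hypothetical, and the function loads the whole file at once, so it suits a spot check better than a very large export.

```python
import json
from collections import Counter
from pathlib import Path


def count_models(manifest_path: Path) -> Counter:
    """Return how many records of each model a manifest file contains."""
    records = json.loads(manifest_path.read_text(encoding="utf-8"))
    return Counter(record["model"] for record in records)
```

Running it against an export and comparing the counts with what the web interface shows is a cheap sanity check before relying on a backup.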
Can I use document_exporter for incremental backups?
Yes. Use the -c (--compare-checksums) flag to copy only files that have changed based on checksum comparison, and --compare-json to skip rewriting manifest files that haven't changed. For automated incremental backups, combine these with -d (--delete) to remove files belonging to documents deleted from the database, ensuring the export mirrors the current state exactly.
How do I restore from a document_exporter backup?
Restore using the complementary document_importer command, which reads the JSON manifest files created by the exporter. If you used encryption (--passphrase), you must provide the identical passphrase during import. The importer recreates database records and copies files back into the Paperless-ngx media directory while preserving metadata, tags, and custom fields.
Why are some files missing from my export?
Files may be excluded intentionally via -nt (no thumbnails) or -na (no archives). With -c, the exporter also skips copying any file whose identical copy already exists in the target, so such files are not rewritten but are still present there. If you use --data-only, no document files are exported at all, only the database JSON manifests. Check your command flags and verify the export directory permissions if files appear missing unexpectedly.