How wacli Downloads and Stores Media Files Locally: A Complete Technical Guide
wacli downloads and stores media files locally through a four-stage pipeline that queries SQLite for encrypted metadata, resolves sanitized filesystem paths, streams WhatsApp media to temporary files, and atomically moves them to final destinations while updating the database state.
wacli is an open-source command-line interface for WhatsApp that automates media extraction from encrypted chat backups. Its download path combines atomic file operations, deterministic path sanitization, and SQLite-backed state tracking, which together keep partially written files out of the store during concurrent sync operations.
The Four-Stage Media Download Pipeline
wacli runs the same deterministic workflow for every media file, from metadata retrieval through final persistence. A file reaches its final path only after the download completes, and the database row is updated only after that, so a failed run can be retried without leaving corrupted files on disk.
Stage 1: Metadata Retrieval from SQLite
The process begins with GetMediaDownloadInfo in internal/store/media.go (lines 8-33). This function queries the messages table to extract the cryptographic parameters required for decryption: DirectPath, MediaKey, FileEncSHA256, FileSHA256, FileLength, MediaType, and the original filename.
info, err := a.DB().GetMediaDownloadInfo(chat, id)
If critical fields such as MediaType, DirectPath, or MediaKey are empty, the operation aborts immediately because the encrypted media cannot be reconstructed without the complete metadata.
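The early-abort check can be pictured with a small sketch. The struct below is a hypothetical stand-in for the record returned by GetMediaDownloadInfo: the field names follow the list above, but the Go types and the validateMediaInfo helper are assumptions made for illustration, not the project's actual code.

```go
package main

import (
	"errors"
	"fmt"
)

// MediaDownloadInfo is a hypothetical stand-in for the stored metadata.
// Field names follow the article; the types are assumed.
type MediaDownloadInfo struct {
	DirectPath    string
	MediaKey      []byte
	FileEncSHA256 []byte
	FileSHA256    []byte
	FileLength    uint64
	MediaType     string
	Filename      string
}

// validateMediaInfo (hypothetical helper) rejects records that lack a
// field required to fetch and decrypt the media.
func validateMediaInfo(info MediaDownloadInfo) error {
	switch {
	case info.MediaType == "":
		return errors.New("media type is empty")
	case info.DirectPath == "":
		return errors.New("direct path is empty")
	case len(info.MediaKey) == 0:
		return errors.New("media key is empty")
	}
	return nil
}

func main() {
	// MediaKey is deliberately left out, so validation fails.
	info := MediaDownloadInfo{MediaType: "image", DirectPath: "/example/path"}
	if err := validateMediaInfo(info); err != nil {
		fmt.Println("abort:", err)
	}
}
```

Checking before any network call keeps the failure cheap: nothing is written to disk and no connection is used for a message that could never be decrypted.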
Stage 2: Safe Path Resolution and Sanitization
Next, ResolveMediaOutputPath in internal/app/media.go (lines 23-49) constructs a deterministic, sanitized directory hierarchy. The function generates a path following this strict structure:
<store>/media/<sanitized-chat-jid>/<sanitized-msg-id>/<media-type>/<sanitized-filename>
The filename derivation logic (lines 52-74) prioritizes the original message filename, falling back to a MIME-type-based name if unavailable. If the user provides a --output flag, the function respects absolute paths or directories, creating the necessary parent directories with os.MkdirAll before returning the final target.
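To make the sanitization idea concrete, here is a minimal sketch of how each path component could be made safe before joining. The sanitizeSegment and mediaPath helpers and the allowlist of characters are assumptions chosen for illustration; the real rules in ResolveMediaOutputPath may differ.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// sanitizeSegment (hypothetical) maps every character outside a small
// allowlist to '_' so the result is safe as a single path component.
func sanitizeSegment(s string) string {
	mapped := strings.Map(func(r rune) rune {
		switch {
		case r >= 'a' && r <= 'z', r >= 'A' && r <= 'Z', r >= '0' && r <= '9':
			return r
		case r == '.', r == '-', r == '_', r == '@':
			return r
		}
		return '_'
	}, s)
	// Names that are empty or made only of dots ("." and "..") are replaced.
	if strings.Trim(mapped, ".") == "" {
		return "_"
	}
	return mapped
}

// mediaPath (hypothetical) assembles the layout described above.
func mediaPath(store, chatJID, msgID, mediaType, filename string) string {
	return filepath.Join(
		store, "media",
		sanitizeSegment(chatJID),
		sanitizeSegment(msgID),
		sanitizeSegment(mediaType),
		sanitizeSegment(filename),
	)
}

func main() {
	// A hostile filename cannot escape the message directory.
	fmt.Println(mediaPath("store", "chat-id", "MSG1", "image", "../../etc/passwd"))
}
```

Because path separators are replaced before the join, a filename supplied by a remote sender cannot introduce extra directory levels or climb out of the store.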
Stage 3: Atomic Streaming Download
The actual download occurs in DownloadMediaToFile within internal/wa/media.go (lines 31-80). This method ensures atomic file operations to prevent partial writes from appearing in the final location:
- Connection validation: Verifies the WhatsApp client is connected.
- Directory creation: Ensures the target directory exists.
- Temporary file creation: Opens a temporary file with os.CreateTemp inside the target directory.
- Encrypted streaming: Calls cli.DownloadMediaWithPathToFile to stream the encrypted media directly to the temporary file.
- Atomic move: Closes the file and renames it to the final destination using os.Rename, ensuring that incomplete downloads never appear at the target path.
bytes, err := a.WA().DownloadMediaToFile(
ctx,
info.DirectPath,
info.FileEncSHA256,
info.FileSHA256,
info.MediaKey,
info.FileLength,
info.MediaType,
"", // mms type (unused)
target,
)
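The temp-file-then-rename pattern behind this stage can be sketched with the standard library alone. The writeAtomically and run functions below are hypothetical, and a plain io.Reader stands in for the WhatsApp download stream; this illustrates the technique rather than reproducing DownloadMediaToFile.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// writeAtomically (hypothetical) copies src into a temp file beside target
// and renames it into place only after the copy and close succeed.
func writeAtomically(target string, src io.Reader) (n int64, err error) {
	dir := filepath.Dir(target)
	if err = os.MkdirAll(dir, 0o755); err != nil {
		return 0, err
	}
	// Creating the temp file in the target directory keeps the later
	// rename on a single filesystem.
	tmp, err := os.CreateTemp(dir, ".download-*")
	if err != nil {
		return 0, err
	}
	// On any failure, remove the temp file so no partial data lingers.
	defer func() {
		if err != nil {
			os.Remove(tmp.Name())
		}
	}()
	if n, err = io.Copy(tmp, src); err != nil {
		tmp.Close()
		return 0, err
	}
	if err = tmp.Close(); err != nil {
		return 0, err
	}
	if err = os.Rename(tmp.Name(), target); err != nil {
		return 0, err
	}
	return n, nil
}

// run exercises writeAtomically against a throwaway directory.
func run() (int64, error) {
	dir, err := os.MkdirTemp("", "wacli-sketch-*")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)
	target := filepath.Join(dir, "image", "photo.jpg")
	return writeAtomically(target, strings.NewReader("fake media bytes"))
}

func main() {
	n, err := run()
	if err != nil {
		fmt.Println("download failed:", err)
		return
	}
	fmt.Printf("wrote %d bytes\n", n)
}
```

A reader that checks the target path sees either no file or the complete file, never a half-written one, which is what allows an interrupted download to be retried safely.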
Stage 4: State Persistence in SQLite
Finally, MarkMediaDownloaded in internal/store/media.go (lines 55-61) updates the messages table with the local_path and downloaded_at timestamp. This atomic update ensures that subsequent sync operations or CLI queries recognize the file as present, preventing redundant network requests and enabling idempotent operations.
now := time.Now().UTC()
_ = a.DB().MarkMediaDownloaded(info.ChatJID, info.MsgID, target, now)
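As a rough sketch of what such an update could look like, the example below issues a single parameterized UPDATE through a narrow interface. Only the messages table and the local_path and downloaded_at columns come from the description above; the chat_jid and msg_id column names, the Unix-seconds timestamp encoding, and the markDownloaded, execer, and recorder names are assumptions. A fake recorder replaces a real SQLite connection so the example runs without a driver.

```go
package main

import (
	"database/sql"
	"fmt"
	"time"
)

// execer is the subset of *sql.DB (and *sql.Tx) that this sketch needs.
type execer interface {
	Exec(query string, args ...any) (sql.Result, error)
}

// markDownloaded (hypothetical) records where a file landed and when.
// The chat_jid and msg_id column names are assumed for illustration.
func markDownloaded(db execer, chatJID, msgID, localPath string, at time.Time) error {
	_, err := db.Exec(
		`UPDATE messages SET local_path = ?, downloaded_at = ? WHERE chat_jid = ? AND msg_id = ?`,
		localPath, at.UTC().Unix(), chatJID, msgID,
	)
	return err
}

// recorder is a fake execer that captures the call instead of using a database.
type recorder struct {
	query string
	args  []any
}

func (r *recorder) Exec(query string, args ...any) (sql.Result, error) {
	r.query, r.args = query, args
	return nil, nil
}

// demo runs markDownloaded against the fake and returns what was captured.
func demo() *recorder {
	rec := &recorder{}
	if err := markDownloaded(rec, "chat", "msg", "/tmp/photo.jpg", time.Now()); err != nil {
		fmt.Println("warning: could not record download:", err)
	}
	return rec
}

func main() {
	rec := demo()
	fmt.Println(len(rec.args), "parameters bound")
}
```

The sketch reports the error rather than discarding it. If the row were not updated, a later sync would have no record of the file and would likely fetch it again, so a warning is a reasonable minimum.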
Background Sync and Concurrent Downloads
Beyond individual CLI commands, wacli supports bulk media synchronization through runMediaWorkers in internal/app/media.go (lines 76-112). When executing wacli sync --download-media, the application spawns a pool of workers that consume mediaJob structs from a channel.
Each worker invokes downloadMediaJob, which executes the same four-stage pipeline described above. This concurrent architecture allows wacli to saturate network bandwidth while maintaining SQLite transaction safety and atomic file operations for each individual download.
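A worker pool of this shape is a standard Go pattern, sketched below. The mediaJob name comes from the text, but its fields, the runWorkers function, and the handle callback are assumptions for illustration; in wacli the per-job work would be the four-stage pipeline.

```go
package main

import (
	"fmt"
	"sync"
)

// mediaJob mirrors the name used in the article; its fields are assumed.
type mediaJob struct {
	ChatJID string
	MsgID   string
}

// runWorkers (hypothetical) starts n goroutines that drain jobs until the
// channel is closed, then returns how many jobs succeeded.
func runWorkers(n int, jobs <-chan mediaJob, handle func(mediaJob) error) int {
	var (
		wg sync.WaitGroup
		mu sync.Mutex
		ok int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for job := range jobs {
				if err := handle(job); err != nil {
					continue // a failed job does not stop the worker
				}
				mu.Lock()
				ok++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return ok
}

func main() {
	jobs := make(chan mediaJob, 8)
	for i := 0; i < 5; i++ {
		jobs <- mediaJob{ChatJID: "chat", MsgID: fmt.Sprintf("msg-%d", i)}
	}
	close(jobs)

	done := runWorkers(3, jobs, func(j mediaJob) error { return nil })
	fmt.Println("completed:", done) // prints "completed: 5"
}
```

Each job touches its own target path and its own database row, so workers need no coordination beyond the shared channel and the counter.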
CLI Usage and Programmatic Examples
Command-Line Download
To download a specific message’s media file, use the wacli media download command defined in cmd/wacli/media.go:
wacli media download \
--chat [email protected] \
--id 3EB0F1234ABCD5678 \
--output ~/Downloads/whatsapp-media
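The command prints the final file path and size. The sample below shows a path in the default store layout, which is where the file lands when --output is omitted: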
/home/me/.wacli/store/media/[email protected]/3EB0F1234ABCD5678/image/photo-3EB0F1234ABCD5678.jpg (1.2 MB)
Programmatic Integration
You can invoke the download pipeline directly from Go code:
ctx, cancel := context.WithTimeout(context.Background(), 2*time.Minute)
defer cancel()
app, _, _ := newApp(ctx, flags, true, false) // initialization omitted for brevity
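info, err := app.DB().GetMediaDownloadInfo(chatJID, msgID)
if err != nil {
	log.Fatal(err) // without metadata the media cannot be fetched or decrypted
}
target, err := app.ResolveMediaOutputPath(info, "")
if err != nil {
	log.Fatal(err)
}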
bytes, err := app.WA().DownloadMediaToFile(
ctx,
info.DirectPath,
info.FileEncSHA256,
info.FileSHA256,
info.MediaKey,
info.FileLength,
info.MediaType,
"", // mms type (unused)
target,
)
if err != nil {
	log.Fatal(err) // do not report success for a failed download
}
fmt.Printf("downloaded %d bytes to %s\n", bytes, target)
This mirrors the internal implementation used by the CLI and background sync workers.
Key Implementation Files
- internal/wa/media.go (lines 31-80): Low-level download implementation that streams encrypted media from WhatsApp and writes files atomically.
- internal/app/media.go (lines 23-112): Path resolution logic, filename sanitization, and background worker orchestration for concurrent downloads.
- internal/store/media.go (lines 8-61): SQLite queries for metadata retrieval (GetMediaDownloadInfo) and persistence (MarkMediaDownloaded).
- cmd/wacli/media.go (lines 28-71): User-facing CLI command that coordinates the download pipeline.
Summary
- wacli downloads and stores media files locally through a rigorous four-stage pipeline ensuring atomicity and data integrity.
- Metadata retrieval via GetMediaDownloadInfo in internal/store/media.go fetches cryptographic parameters from the messages table.
- Path resolution through ResolveMediaOutputPath creates sanitized, deterministic hierarchies under <store>/media/.
- Atomic downloads using DownloadMediaToFile in internal/wa/media.go write to temporary files before renaming them to final destinations.
- State tracking via MarkMediaDownloaded updates SQLite with local_path and downloaded_at to enable idempotent operations.
- Concurrent processing through runMediaWorkers supports bulk synchronization with wacli sync --download-media.
Frequently Asked Questions
How does wacli prevent corrupted files during download?
wacli implements atomic file operations in DownloadMediaToFile (internal/wa/media.go). The function streams encrypted media to a temporary file created with os.CreateTemp inside the target directory. Only after the download completes successfully and the file is closed does it call os.Rename to move the temporary file to its final destination. Because the temporary file lives in the same directory as the target, the rename stays on one filesystem, where it is atomic on POSIX systems. Incomplete or corrupted downloads therefore never appear at the target path.
What database fields does wacli use to track media downloads?
The messages table in SQLite tracks media through specific fields queried by GetMediaDownloadInfo (internal/store/media.go): DirectPath, MediaKey, FileEncSHA256, FileSHA256, FileLength, MediaType, and the original filename. After a successful download, MarkMediaDownloaded updates the same row with local_path (the absolute filesystem path) and downloaded_at (UTC timestamp), enabling idempotent operations and preventing redundant network requests.
Can wacli download media from multiple chats simultaneously?
Yes, wacli supports concurrent media downloads through the runMediaWorkers function in internal/app/media.go. When invoked via wacli sync --download-media, the application spawns a pool of workers that consume mediaJob structs from a buffered channel. Each worker independently executes the four-stage download pipeline for individual messages, allowing the tool to saturate network bandwidth while maintaining SQLite transaction safety and atomic file operations for each file.
How does wacli determine the local filename and directory structure?
The ResolveMediaOutputPath function in internal/app/media.go constructs a deterministic, sanitized hierarchy following the pattern: <store>/media/<sanitized-chat-jid>/<sanitized-msg-id>/<media-type>/<sanitized-filename>. The function sanitizes the chat JID and message ID to create safe directory names, derives the filename from original message metadata or falls back to a MIME-type-based name, and respects user-specified --output paths. It creates all necessary parent directories using os.MkdirAll before returning the absolute target path.