How wacli's Message Upsert Logic Prevents Duplicate Entries During History Sync Replays
wacli prevents duplicate message entries during history sync replays by using SQLite's INSERT … ON CONFLICT clause with a unique composite constraint on (chat_jid, msg_id), ensuring that existing messages are updated rather than duplicated when the same history payload is processed multiple times.
When syncing historical WhatsApp data, the steipete/wacli tool must handle redundant HistorySync events without creating duplicate database entries. The wacli message upsert logic leverages database-level constraints to guarantee idempotent storage, ensuring that replaying the same sync operation multiple times produces consistent results.
Understanding the History Sync Challenge
During initial bootstrap or manual synchronization, wacli receives HistorySync events containing batches of historical messages. These events may be replayed because of network retries or partial sync operations. Without proper duplicate detection, the same message would be inserted multiple times, corrupting the chat history and inflating storage usage.
Database-Level Protection via Unique Constraints
The foundation of wacli's duplicate prevention lies in its SQLite schema design and conflict resolution strategy.
Composite Key Design in migrations.go
The database schema defined in internal/store/migrations.go establishes a unique constraint on the combination of chat_jid and msg_id. This composite key ensures that a single message can be identified uniquely within its specific chat context, preventing collisions between messages with identical IDs from different conversations.
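To illustrate why the constraint covers both columns rather than msg_id alone, the following Go sketch models the key as a struct used as a map key. The messageKey type and both helper functions are hypothetical names introduced for this illustration; they are not taken from wacli, and the map only stands in for the uniqueness check that SQLite performs.

```go
package main

// messageKey mirrors the (chat_jid, msg_id) composite key.
// Hypothetical type for this sketch; wacli keeps the key in SQLite.
type messageKey struct {
	ChatJID string
	MsgID   string
}

// countDistinct returns how many rows a list of keys would occupy
// when uniqueness is enforced on the full composite key.
func countDistinct(keys []messageKey) int {
	seen := make(map[messageKey]struct{})
	for _, k := range keys {
		seen[k] = struct{}{}
	}
	return len(seen)
}

// countDistinctByMsgID shows the alternative of keying on msg_id alone:
// messages from different chats that share an ID collapse into one entry.
func countDistinctByMsgID(keys []messageKey) int {
	seen := make(map[string]struct{})
	for _, k := range keys {
		seen[k.MsgID] = struct{}{}
	}
	return len(seen)
}
```

Given the keys (chatA, ABC), (chatB, ABC), and a repeated (chatA, ABC), countDistinct reports two rows while countDistinctByMsgID reports one, which is the cross-chat collision the composite key avoids.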
SQLite ON CONFLICT Resolution Strategy
When store.UpsertMessage executes in internal/store/messages.go, it uses the INSERT … ON CONFLICT SQL construct:
INSERT INTO messages( … )
VALUES ( … )
ON CONFLICT(chat_jid, msg_id) DO UPDATE SET
    chat_name = COALESCE(NULLIF(excluded.chat_name,''), messages.chat_name),
    sender_jid = excluded.sender_jid,
    …
    file_length = CASE WHEN excluded.file_length>0 THEN excluded.file_length ELSE messages.file_length END
This query operates in two phases:
- Attempted Insert: SQLite tries to insert the new message row.
- Conflict Resolution: If the (chat_jid, msg_id) combination already exists, the DO UPDATE clause activates, merging the new data with existing fields using COALESCE logic to preserve non-empty values.
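The two merge rules visible in the DO UPDATE clause can be restated as plain Go functions. This is a minimal sketch of the semantics only, assuming Go strings and integers in place of SQL values (so SQL NULL is not modelled); mergeChatName and mergeFileLength are hypothetical helper names, not functions from wacli.

```go
package main

// mergeChatName models
//   COALESCE(NULLIF(excluded.chat_name,''), messages.chat_name)
// NULLIF turns an empty incoming name into NULL, and COALESCE then
// falls back to the stored name.
func mergeChatName(existing, incoming string) string {
	if incoming == "" {
		return existing
	}
	return incoming
}

// mergeFileLength models
//   CASE WHEN excluded.file_length>0 THEN excluded.file_length
//        ELSE messages.file_length END
// Only a positive incoming length replaces the stored one.
func mergeFileLength(existing, incoming int64) int64 {
	if incoming > 0 {
		return incoming
	}
	return existing
}
```

Columns assigned directly from excluded in the excerpt, such as sender_jid, follow neither rule: the incoming value replaces the stored one unconditionally.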
The Message Upsert Implementation
The UpsertMessage function in internal/store/messages.go (lines 30-38) encapsulates this logic in Go:
func (d *Store) UpsertMessage(params UpsertMessageParams) error {
    _, err := d.sql.Exec(`
        INSERT INTO messages(chat_jid, msg_id, sender_jid, timestamp, from_me, text, chat_name, file_length, ...)
        VALUES (?, ?, ?, ?, ?, ?, ?, ?, ...)
        ON CONFLICT(chat_jid, msg_id) DO UPDATE SET
            chat_name = COALESCE(NULLIF(excluded.chat_name,''), messages.chat_name),
            sender_jid = excluded.sender_jid,
            timestamp = excluded.timestamp,
            from_me = excluded.from_me,
            text = excluded.text,
            file_length = CASE WHEN excluded.file_length>0 THEN excluded.file_length ELSE messages.file_length END
    `, params.ChatJID, params.MsgID, params.SenderJID, params.Timestamp, params.FromMe, params.Text, params.ChatName, params.FileLength)
    return err
}
This implementation guarantees that calling UpsertMessage with identical parameters multiple times results in exactly one database row, with subsequent calls updating rather than duplicating the record.
History Sync Processing Pipeline
The orchestration layer that feeds messages into this upsert logic resides in internal/app/sync.go.
Event Handling in app/sync.go
When a HistorySync event arrives, the application iterates through all conversations and their messages (lines 15-39):
func (a *App) handleHistorySync(v *events.HistorySync) {
    for _, conv := range v.Data.Conversations {
        chatID := strings.TrimSpace(conv.GetID())
        for _, m := range conv.Messages {
            // Parse the raw WhatsApp message into a structured format
            pm := wa.ParseHistoryMessage(chatID, m.Message)
            // The upsert ensures idempotency - safe to call on replays
            if err := a.storeParsedMessage(ctx, pm); err == nil {
                messagesStored.Add(1) // counted even if it was an update
            }
        }
    }
}
The storeParsedMessage method delegates to store.UpsertMessage, creating the complete pipeline from raw WhatsApp events to deduplicated database storage.
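The effect of a replay on this pipeline can be sketched with an in-memory stand-in for the messages table. Everything here is an assumption made for illustration: parsedMessage, memStore, and replay are hypothetical names, and the map replaces SQLite. The sketch shows that a counter incremented per successful call, like messagesStored above, grows with every replay while the number of stored rows does not.

```go
package main

// parsedMessage is a hypothetical, trimmed-down stand-in for a parsed
// history message.
type parsedMessage struct {
	ChatJID, MsgID, Text string
}

// memStore replaces the SQLite table with a map keyed by
// (chat_jid, msg_id).
type memStore struct {
	rows map[[2]string]parsedMessage
}

func newMemStore() *memStore {
	return &memStore{rows: make(map[[2]string]parsedMessage)}
}

// upsert inserts the row or overwrites the existing one for the same
// key; it can never create a second row for that key.
func (s *memStore) upsert(m parsedMessage) {
	s.rows[[2]string{m.ChatJID, m.MsgID}] = m
}

// replay feeds the same batch through upsert the given number of times
// and returns the number of upsert calls and the resulting row count.
func replay(s *memStore, batch []parsedMessage, times int) (calls, rows int) {
	for i := 0; i < times; i++ {
		for _, m := range batch {
			s.upsert(m)
			calls++
		}
	}
	return calls, len(s.rows)
}
```

Replaying a two-message batch three times yields six calls but still two rows, so a per-call counter is best read as "messages processed" rather than "new rows inserted".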
Idempotency Guarantees and Conflict Resolution
The combination of schema constraints and SQL conflict resolution delivers several key guarantees:
- True Idempotency: Replaying the same HistorySync payload n times produces the same database state as processing it once.
- Data Preservation: The COALESCE and CASE expressions in the DO UPDATE clause ensure that existing non-empty values (like chat_name) are not overwritten by empty strings, while newer values (like file_length if greater than zero) take precedence.
- Atomic Operations: SQLite resolves the conflict inside the same INSERT statement, which executes atomically, so there is no window between a separate existence check and the write in which a race could occur.
Summary
- wacli prevents duplicate message entries using SQLite's INSERT … ON CONFLICT clause with a unique constraint on (chat_jid, msg_id).
- The UpsertMessage function in internal/store/messages.go executes atomic upsert operations that update existing rows rather than creating duplicates.
- During history sync replays, internal/app/sync.go processes each message through this upsert pipeline, ensuring idempotent storage even when the same payload is received multiple times.
- The conflict resolution strategy preserves existing data using COALESCE logic while allowing newer values to override older ones where appropriate.
Frequently Asked Questions
What happens when wacli encounters a message with the same ID during history sync?
When wacli encounters a message with an existing (chat_jid, msg_id) combination, SQLite's ON CONFLICT clause triggers an update rather than an insertion. The existing row is modified to merge new data with preserved values, ensuring no duplicate rows are created while keeping the database current.
Does wacli's upsert logic preserve existing message data when conflicts occur?
Yes, the upsert logic specifically preserves existing data through selective field updates. The SQL query uses COALESCE(NULLIF(excluded.chat_name,''), messages.chat_name) to retain existing chat names when incoming data is empty, and conditional logic like CASE WHEN excluded.file_length>0 ensures file metadata is only updated when valid new data exists.
Which database constraint prevents duplicate entries in wacli?
The database schema defines a unique constraint on the composite key (chat_jid, msg_id) in the messages table. This constraint, defined in internal/store/migrations.go, ensures that the combination of chat identifier and message identifier remains unique across the entire database, enabling the conflict detection mechanism that powers the upsert logic.
Is the history sync process in wacli idempotent?
Yes, with respect to the stored rows. The handleHistorySync function in internal/app/sync.go processes each message through storeParsedMessage, which calls the atomic UpsertMessage operation. Because the underlying SQL uses INSERT … ON CONFLICT, processing the same HistorySync payload multiple times produces the same database state as processing it once, with no duplicate rows created.