Implementing Database Replication: Master-Slave vs Master-Master Architectures

Master-slave replication uses a single writable primary with read-only replicas, while master-master replication allows multiple writable nodes, removing the single-writer bottleneck at the cost of conflict resolution.

Database replication is a core technique for scaling relational databases and improving availability in distributed systems. According to the donnemartin/system-design-primer repository, choosing between master-slave and master-master architectures represents a critical design decision that impacts consistency models, failover complexity, and write throughput. This guide examines both patterns as documented in the repository's core documentation and implementation examples.

Understanding Database Replication Patterns

What is Master-Slave Replication?

In the README.md of the system-design-primer, master-slave replication is defined as a topology where a single primary node (the master) handles all write operations while one or more secondary nodes (slaves) serve read requests. The master streams changes from its binary log to each slave, which replay these events asynchronously or semi-synchronously. If the master fails, a slave must be promoted to become the new writer.
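The log-shipping flow described above can be sketched as a toy in-memory model. This is an illustration only, not MySQL's actual replication protocol: the `Primary` and `Replica` classes, the list-based `binlog`, and the `max_events` parameter are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Primary:
    """Toy primary: applies writes locally and appends them to an ordered log."""
    data: dict = field(default_factory=dict)
    binlog: list = field(default_factory=list)

    def write(self, key, value):
        self.data[key] = value
        self.binlog.append((key, value))


@dataclass
class Replica:
    """Toy replica: replays log events it has not applied yet."""
    data: dict = field(default_factory=dict)
    applied: int = 0  # number of log events already replayed

    def replay(self, binlog, max_events=None):
        # Apply events in log order, optionally stopping early to simulate lag.
        end = len(binlog)
        if max_events is not None:
            end = min(end, self.applied + max_events)
        for key, value in binlog[self.applied:end]:
            self.data[key] = value
        self.applied = end

    def lag(self, binlog):
        # Lag measured in events, not seconds.
        return len(binlog) - self.applied


primary = Primary()
replica = Replica()
primary.write("user:1", "Alice")
primary.write("user:1", "Alicia")

replica.replay(primary.binlog, max_events=1)  # replica is one event behind
stale_value = replica.data["user:1"]
lag_before = replica.lag(primary.binlog)

replica.replay(primary.binlog)                # replica catches up
fresh_value = replica.data["user:1"]
```

Until the second `replay` call, a read from the replica returns the older value, which is the replication lag the rest of this guide refers to.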

What is Master-Master Replication?

The repository describes master-master replication (also called multi-master) as a configuration where two or more nodes accept write operations simultaneously. Each node records its own writes and replicates them to the other nodes. This enables continuous read/write service even if one node goes down, but introduces challenges such as conflict resolution, increased write latency, and the need for load-balancer or application-level routing logic.

Architectural Comparison

Write Path Characteristics

Master-Slave: A single writer eliminates write conflicts and maintains strong consistency for write operations. However, this creates a write bottleneck and requires vertical scaling of the master node to handle write throughput.

Master-Master: Multiple writers distribute load across nodes, enabling geographic distribution of write operations. This requires conflict resolution strategies such as last-write-wins timestamps, vector clocks, or CRDTs (Conflict-free Replicated Data Types) to handle simultaneous updates to the same data.
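As a minimal sketch of the simplest of these strategies, last-write-wins, the following merge function picks one version deterministically. The `Versioned` record and the node-ID tie-breaker are assumptions for illustration; the approach also assumes node clocks are roughly synchronized, and it silently discards the losing write.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Versioned:
    value: str
    timestamp: float  # wall-clock time of the write (assumed roughly synchronized)
    node_id: str      # tie-breaker so every node picks the same winner


def lww_merge(a: Versioned, b: Versioned) -> Versioned:
    """Return the winning version; the result does not depend on argument order."""
    return max(a, b, key=lambda v: (v.timestamp, v.node_id))


from_node1 = Versioned("alice@old.example", timestamp=100.0, node_id="node1")
from_node2 = Versioned("alice@new.example", timestamp=100.5, node_id="node2")

winner = lww_merge(from_node1, from_node2)
```

Because the merge is commutative, both masters converge on the same value no matter which update they receive first.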

Read Path and Consistency

Master-Slave: Slaves handle read traffic, reducing load on the master and enabling horizontal read scaling. However, read-after-write consistency may be compromised on slaves due to replication lag between the master and replicas.
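One common way to soften this problem is to send a session's reads to the master for a short window after that session writes. The sketch below assumes a fixed pinning window and a hypothetical `ReadRouter` class; a real system might compare replica positions instead of using a timer.

```python
import time


class ReadRouter:
    """Route a session's reads to the primary shortly after it writes."""

    def __init__(self, pin_seconds=2.0, clock=time.monotonic):
        self.pin_seconds = pin_seconds
        self.clock = clock          # injectable so the example can use a fake clock
        self._last_write = {}       # session_id -> time of the most recent write

    def record_write(self, session_id):
        self._last_write[session_id] = self.clock()

    def target_for_read(self, session_id):
        last = self._last_write.get(session_id)
        if last is not None and self.clock() - last < self.pin_seconds:
            return "primary"
        return "replica"


now = [0.0]  # fake clock value, advanced manually below
router = ReadRouter(pin_seconds=2.0, clock=lambda: now[0])

router.record_write("session-a")
now[0] = 0.5
right_after_write = router.target_for_read("session-a")
now[0] = 3.0
after_window = router.target_for_read("session-a")
other_session = router.target_for_read("session-b")
```

The window trades some read load on the master for read-your-writes behavior; it only helps if the window is longer than the typical replication lag.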

Master-Master: All nodes can serve reads immediately after local writes, providing low-latency read access. However, cross-node reads may encounter eventual consistency where not all nodes have received the latest writes from other masters.

Failover and Availability

Master-Slave: When the master fails, automated or manual promotion logic must convert a slave into the new master. This process involves updating DNS entries or connection strings and ensuring the promoted slave has applied every binary log event it received from the master before accepting writes. With asynchronous replication, any events the master had not yet sent to a slave are lost.
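A promotion routine usually starts by choosing the most up-to-date slave. The following is a sketch under stated assumptions: the `ReplicaState` record and its fields are hypothetical, and the binary log file names are assumed to end in a numeric sequence such as `mysql-bin.000004`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReplicaState:
    name: str
    reachable: bool
    relay_log_drained: bool  # True once every received event has been applied
    log_file: str            # master binlog file the replica has executed up to
    log_pos: int             # position within that file


def binlog_coordinates(replica: ReplicaState):
    """Order replicas by (file sequence number, position)."""
    sequence = int(replica.log_file.rsplit(".", 1)[1])
    return (sequence, replica.log_pos)


def pick_promotion_candidate(replicas) -> Optional[ReplicaState]:
    """Return the reachable, fully applied replica that is furthest ahead."""
    eligible = [r for r in replicas if r.reachable and r.relay_log_drained]
    if not eligible:
        return None
    return max(eligible, key=binlog_coordinates)


replicas = [
    ReplicaState("replica-a", True, True, "mysql-bin.000004", 900),
    ReplicaState("replica-b", True, True, "mysql-bin.000005", 120),
    ReplicaState("replica-c", False, True, "mysql-bin.000005", 400),
]
candidate = pick_promotion_candidate(replicas)
```

Here `replica-c` is further ahead but unreachable, so `replica-b` is chosen; the writes that only `replica-c` received would need to be reconciled later.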

Master-Master: Any remaining master can continue accepting writes if one node fails, eliminating the need for promotion logic. However, this requires careful handling of split-brain scenarios where network partitions cause nodes to diverge.
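A widely used guard against split-brain is a majority (quorum) rule: a node keeps accepting writes only while it can see more than half of the cluster. The function names below are illustrative assumptions, not part of any specific database's API.

```python
def has_quorum(visible_nodes: int, cluster_size: int) -> bool:
    """True if this partition holds a strict majority of the cluster."""
    return visible_nodes > cluster_size // 2


def may_accept_writes(reachable_peers: int, cluster_size: int) -> bool:
    """A node counts itself plus the peers it can currently reach."""
    return has_quorum(reachable_peers + 1, cluster_size)


# Three-node cluster split 2 / 1: only the larger side keeps writing.
majority_side = may_accept_writes(reachable_peers=1, cluster_size=3)
minority_side = may_accept_writes(reachable_peers=0, cluster_size=3)

# Two-node cluster split 1 / 1: neither side has a majority.
two_node_split = may_accept_writes(reachable_peers=0, cluster_size=2)
```

The two-node case shows why even-sized clusters typically need a tie-breaker: after a partition neither side can prove it is the majority, so both must stop writing to stay safe.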

Implementation Examples

The donnemartin/system-design-primer repository provides concrete configuration examples for both replication patterns.

Master-Slave Setup (MySQL)

Configure the master node in my.cnf:

[mysqld]
server-id=1                # Master: 1, each slave uses a unique ID > 1
log-bin=mysql-bin
binlog-format=row

Create the replication user on the master:

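-- On master
CREATE USER 'repl'@'%' IDENTIFIED BY 'repl_password';
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%';
FLUSH PRIVILEGES;
-- Block writes so the binary log coordinates stay fixed while you record them
FLUSH TABLES WITH READ LOCK;
SHOW MASTER STATUS;   -- note File and Position
-- Take the data snapshot for the slaves from another session before unlocking
UNLOCK TABLES;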

Configure each slave to connect to the master:

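-- On each slave
-- (MySQL 8.0.23 and later prefer CHANGE REPLICATION SOURCE TO and START REPLICA)
CHANGE MASTER TO
  MASTER_HOST='master-host',
  MASTER_USER='repl',
  MASTER_PASSWORD='repl_password',
  MASTER_LOG_FILE='mysql-bin.000001',   -- File from SHOW MASTER STATUS
  MASTER_LOG_POS=12345;                 -- Position from SHOW MASTER STATUS
START SLAVE;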
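Once a slave is running, its state can be checked with `SHOW SLAVE STATUS`. The sketch below classifies one row of that output, passed in as a dictionary; how the row is fetched depends on your driver and is left out. The `replication_health` function name, the status labels, and the 30-second threshold are assumptions for illustration, while the three column names come from MySQL's status output.

```python
def replication_health(status: dict, max_lag_seconds: int = 30) -> str:
    """Classify one SHOW SLAVE STATUS row as healthy, lagging, or broken."""
    io_ok = status.get("Slave_IO_Running") == "Yes"
    sql_ok = status.get("Slave_SQL_Running") == "Yes"
    lag = status.get("Seconds_Behind_Master")  # NULL (None) when replication is stopped

    if not (io_ok and sql_ok) or lag is None:
        return "broken"
    if lag > max_lag_seconds:
        return "lagging"
    return "healthy"


healthy = replication_health(
    {"Slave_IO_Running": "Yes", "Slave_SQL_Running": "Yes", "Seconds_Behind_Master": 2}
)
lagging = replication_health(
    {"Slave_IO_Running": "Yes", "Slave_SQL_Running": "Yes", "Seconds_Behind_Master": 120}
)
broken = replication_health(
    {"Slave_IO_Running": "No", "Slave_SQL_Running": "Yes", "Seconds_Behind_Master": None}
)
```

A check like this can feed the read router, so that lagging or broken slaves are taken out of the read pool.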

Master-Master Setup (Galera Cluster)

For master-master replication, configure Galera Cluster in my.cnf:

[mysqld]
wsrep_on=ON
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_address="gcomm://node1,node2"
wsrep_node_name="node1"
wsrep_node_address="192.168.1.1"
wsrep_sst_method=rsync
binlog_format=row
default_storage_engine=InnoDB

With Galera, writes automatically replicate to all nodes:

-- Same on both nodes; they automatically replicate writes.
INSERT INTO users (id, name) VALUES (1, 'Alice');
-- Conflict example: if both nodes insert id=1, Galera aborts one transaction.
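When Galera aborts a conflicting transaction, the client sees a deadlock error (MySQL error 1213), so applications commonly retry the whole transaction. This is a minimal sketch: `DatabaseError` is a stand-in for whatever exception your driver raises, and `run_with_retry` with its parameters is a hypothetical helper.

```python
import time

ER_LOCK_DEADLOCK = 1213  # MySQL error code that Galera uses for certification conflicts


class DatabaseError(Exception):
    """Stand-in for a driver exception that carries a MySQL error number."""

    def __init__(self, errno, msg=""):
        super().__init__(msg)
        self.errno = errno


def run_with_retry(transaction, attempts=3, backoff_seconds=0.0):
    """Call transaction(); retry only on deadlock errors, up to `attempts` tries."""
    for attempt in range(1, attempts + 1):
        try:
            return transaction()
        except DatabaseError as exc:
            if exc.errno != ER_LOCK_DEADLOCK or attempt == attempts:
                raise
            time.sleep(backoff_seconds * attempt)  # simple linear backoff


calls = {"n": 0}


def flaky_insert():
    # Simulates two certification failures followed by a successful commit.
    calls["n"] += 1
    if calls["n"] < 3:
        raise DatabaseError(ER_LOCK_DEADLOCK, "certification failed")
    return "committed"


result = run_with_retry(flaky_insert)
```

The transaction callable should contain the full unit of work, since the aborted transaction has been rolled back and must be replayed from the start.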

Application-Level Routing

The repository suggests implementing connection routing in application code to handle write/read splitting:

def get_connection(write=False):
    # connect_to is a placeholder for the application's database driver call
    if write:
        # Use a load balancer that forwards to any active master
        return connect_to("db-master-lb")
    else:
        # Reads can go to any replica
        return connect_to("db-replica-pool")
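The read side of this routing could be backed by a small pool that rotates across replicas and skips unhealthy ones. The sketch below is one possible shape, assuming a hypothetical `ReplicaPool` class and made-up host names; it falls back to the primary when no replica is available.

```python
import itertools


class ReplicaPool:
    """Round-robin over healthy replicas, falling back to the primary."""

    def __init__(self, replicas, primary):
        self._replicas = list(replicas)
        self._primary = primary
        self._healthy = set(self._replicas)
        self._cycle = itertools.cycle(self._replicas)

    def mark_down(self, host):
        self._healthy.discard(host)

    def mark_up(self, host):
        if host in self._replicas:
            self._healthy.add(host)

    def next_read_host(self):
        # Try each replica at most once per call before giving up.
        for _ in range(len(self._replicas)):
            host = next(self._cycle)
            if host in self._healthy:
                return host
        return self._primary


pool = ReplicaPool(["replica-1", "replica-2"], primary="primary-1")
first_three = [pool.next_read_host() for _ in range(3)]

pool.mark_down("replica-2")
with_one_down = [pool.next_read_host() for _ in range(2)]

pool.mark_down("replica-1")
all_down = pool.next_read_host()
```

Falling back to the primary keeps reads available during a replica outage, at the cost of extra load on the writer.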

Key Repository References

The database replication patterns are documented in the following donnemartin/system-design-primer files:

  • README.md (root): Primary documentation of both replication patterns, including the Master-slave replication and Master-master replication sections that detail advantages, disadvantages, and architectural trade-offs.

  • solutions/system_design/web_crawler/README.md: Demonstrates practical application of read replicas in a web crawler system design, showing how master-slave replication supports high-throughput read operations.

Summary

  • Master-slave replication provides a single writable primary with read-only secondaries, offering strong consistency for writes and simple horizontal read scaling, but requires slave promotion during failover and cannot scale writes horizontally.

  • Master-master replication enables multiple writable nodes for geographic distribution and zero-downtime writes, but introduces conflict resolution complexity and eventual consistency challenges, and requires load-balancing or application-level routing.

  • Both patterns require monitoring replication lag, handling network partitions to prevent split-brain scenarios, and implementing automated failover or conflict resolution strategies as documented in the donnemartin/system-design-primer repository.

Frequently Asked Questions

What is the main difference between master-slave and master-master database replication?

Master-slave replication uses a single primary node for all write operations while secondary nodes serve only read requests. Master-master replication allows multiple nodes to accept writes simultaneously, requiring conflict resolution mechanisms but eliminating single points of failure for write operations.

How does failover work when the master node fails in a master-slave topology?

When the master fails, an automated or manual promotion process converts one of the slaves into the new master. This requires updating DNS entries or connection strings and ensuring the promoted slave has applied every binary log event it received from the failed master before accepting new writes.

What are the consistency implications of choosing master-master replication?

Master-master replication typically provides eventual consistency rather than strong consistency. When two masters accept concurrent writes to the same data, conflicts arise that must be resolved using strategies such as last-write-wins timestamps, vector clocks, or application-level merge logic.
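Vector clocks help here by detecting whether two versions are causally ordered or truly concurrent. The following is a minimal sketch that represents each clock as a dictionary of per-node counters; the `compare` function and its return labels are assumptions for illustration.

```python
def compare(a: dict, b: dict) -> str:
    """Compare two vector clocks: 'before', 'after', 'equal', or 'concurrent'."""
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"      # a happened before b, so b can safely replace a
    if b_le_a:
        return "after"       # b happened before a
    return "concurrent"      # neither saw the other: a real conflict


base = {"node1": 1}
update_on_node1 = {"node1": 2}             # node1 wrote again after base
update_on_node2 = {"node1": 1, "node2": 1}  # node2 wrote after seeing base

ordered = compare(base, update_on_node1)
conflict = compare(update_on_node1, update_on_node2)
```

Only the `concurrent` case needs a resolution policy such as last-write-wins or an application-level merge; ordered versions can be applied without losing data.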

When should I implement master-slave replication instead of master-master?

Choose master-slave replication for read-heavy workloads that require strong write consistency and can tolerate brief downtime during failover. Choose master-master replication for write-heavy applications requiring geographic distribution, zero-downtime writes, or active-active data center configurations where conflict resolution complexity is acceptable.
