Mental Model
A single database server is a ticking time bomb. Replication is the act of maintaining identical clones of your database so that if the primary dies, a clone can seamlessly take over.
Once your application hits production, High Availability (HA) becomes paramount. If your database goes down, your entire business stops making money.
1. The Binary Log (Binlog)
While the Redo Log (discussed in the previous module) is used internally by InnoDB for crash recovery, the Binary Log (Binlog) is used by MySQL at the server level to record all data-modifying queries.
The Binlog is the foundation of replication. There are two main formats:
- Statement-Based: Logs the exact SQL query (e.g.,
DELETE FROM users WHERE age > 99). - Row-Based: Logs the actual row data that changed. This is much safer and the modern default, as statement-based logging can cause inconsistencies with non-deterministic functions like
NOW().
2. How Master-Slave (Primary-Replica) Replication Works
- The Primary node writes changes to its local Binlog.
- The Replica node runs an I/O Thread that connects to the Primary, reads the Binlog, and copies it to a local Relay Log.
- The Replica runs a SQL Thread that constantly reads the Relay Log and applies the changes to its own database.
By directing all INSERT/UPDATE queries to the Primary and routing SELECT queries to multiple Replicas, you can horizontally scale your read capacity.
3. Asynchronous vs Semi-Synchronous
Asynchronous Replication (Default): The Primary executes the transaction, writes to the Binlog, and immediately returns "Success" to the client. It doesn't wait for the Replica. Risk: If the Primary dies a millisecond later, the Replica might not have received the event. Data is lost.
Semi-Synchronous Replication: The Primary executes the transaction, but waits to return "Success" until at least one Replica acknowledges it has received the Binlog event and written it to its Relay Log. Trade-off: Higher latency for writes, but guarantees zero data loss during a failover.
4. Failover and Orchestration
If the Primary dies, how does the system recover? You cannot manually SSH into a server at 3:00 AM. Tools like Orchestrator or MHA (Master High Availability) continuously monitor the topology. If the Primary fails, they:
- Identify the Replica with the most up-to-date Relay Log.
- Promote that Replica to be the new Primary.
- Reconfigure the remaining Replicas to point to the new Primary.
- Update ProxySQL or the application's DNS to route traffic to the new IP.
Practice Question
Scenario: You have a Primary database and a Read Replica. Users are complaining that immediately after they update their profile picture, the page refreshes and shows their old picture. Question: What is this phenomenon called and how do you fix it?
View Answer
This is called **Replication Lag**. Because replication is asynchronous, it takes a few milliseconds (or seconds under heavy load) for the update to reach the Replica. When the page refreshes, the app reads from the lagging Replica. **Fix**: Implement "Read-After-Write Consistency". If a user modifies their own data, route their specific reads to the Primary node for the next few seconds, or use a distributed cache like Redis to serve their updated state instantly.Technical Trade-offs: Architectural Decision
| Strategy | Scalability | Complexity | Operational Cost | Performance |
|---|---|---|---|---|
| Monolithic | Low | Low | Low | Fast (Local) |
| Microservices | Very High | High | High | Slower (Network) |
| Serverless | Infinite | Medium | Variable | Variable (Cold Starts) |
Production Readiness Checklist
Before deploying this architecture to a production environment, ensure the following Staff-level criteria are met:
- High Availability: Have we eliminated single points of failure across all layers?
- Observability: Are we exporting structured JSON logs, custom Prometheus metrics, and OpenTelemetry traces?
- Circuit Breaking: Do all synchronous service-to-service calls have timeouts and fallbacks (e.g., via Resilience4j)?
- Idempotency: Can our APIs handle retries safely without causing duplicate side effects?
- Backpressure: Does the system gracefully degrade or return HTTP 429 when resources are saturated?
Verbal Interview Script
Interviewer: "We want to ensure zero data loss in our payments database if the primary node catches fire. How do we architect this?"
Candidate: "To guarantee zero data loss (RPO = 0), standard asynchronous replication is insufficient. We must enable Semi-Synchronous replication. When a payment transaction commits, the primary node will block and wait until at least one replica confirms it has written the transaction to its relay log. This introduces a slight latency penalty equal to the network round-trip time between the nodes, so the replica should ideally be in the same region but a different Availability Zone (AZ). For automatic recovery, we would pair this with a tool like Orchestrator and ProxySQL to automatically promote the synchronized replica and reroute traffic within seconds."