Database schema design is architecture. Like all architectural decisions, it must be driven by the query patterns and access patterns you actually have—not the ones you imagine you might need.
In this case study, we deconstruct and redesign an e-commerce database schema for a high-growth platform transitioning from $10\text{M}$ ARR to processing 1 million orders per day. The naive weekend-build schema is crashing under production scale: checkout queries take $8$ seconds, flash sales lock inventory tables, and reporting queries crash the primary instance. We walk through a complete, production-ready MySQL 8.0 database architecture built on InnoDB, detailed composite indexing math, and zero-downtime schema migration strategies.
System Requirements and Goals
Designing a relational schema for a million daily orders requires establishing firm transactional boundaries and operational bounds.
1. Functional Requirements
- Catalog Browsing: Buyers must browse products, filter by category, and see current prices and real-time inventory levels.
- Resilient Order Checkout: Safely execute multi-item purchases. Inventory reservation must be highly concurrent, preventing overselling during high-demand flash sales.
- Audit Trails: Record complete historical pricing records and inventory transaction movements; prices paid at checkout must remain immutable.
- Soft Deletes: Ensure soft-deletion standards across critical tables (
users,products) to preserve accounting and tax integrity.
2. Non-Functional Requirements
- Transaction Latency: $99%$ of checkout transaction writes must complete in under $50\text{ ms}$ on the database layer.
- Scale Capability: Core tables must scale to sustain over $100\text{ Million}$ rows without query degradation.
- Operational Isolation: Long-running analytical queries must not affect OLTP write latencies.
- Zero Downtime: Schema migrations must execute online without locking tables or locking out buyers.
High-Level Design Architecture
To support $1,000,000$ orders per day, the database tier must scale writes independently of reads. The architecture below showcases the decoupling of transactional OLTP write paths from analytical reporting pipelines using primary-replica replication:
graph TD
%% Define Nodes
Client[Mobile/Web Buyers] -->|HTTPS| API[API Gateway]
subgraph "Application Layer"
API -->|Route Checkout Writes| OrderSvc[Order Microservice]
API -->|Route Read Queries| ReadSvc[Catalog & Search Service]
end
subgraph "Relational Database Layer"
OrderSvc -->|Primary OLTP Writes| MySQLPrimary[MySQL Primary Instance - Master]
MySQLPrimary -->|Semi-Synchronous Binlog Replication| MySQLReplica1[MySQL Read Replica 1]
MySQLPrimary -->|Semi-Synchronous Binlog Replication| MySQLReplica2[MySQL Read Replica 2]
ReadSvc -->|Load Balanced Reads| MySQLReplica1
ReadSvc -->|Load Balanced Reads| MySQLReplica2
end
subgraph "Analytics & Data Warehouse"
MySQLReplica2 -->|Debezium CDC Streams| Kafka[Apache Kafka Telemetry]
Kafka -->|Bulk Insert| ClickHouse[(ClickHouse Data Warehouse)]
ClickHouse -->|BI Tools| AnalyticsPanel[Internal Dashboards]
end
%% Styling
classDef client fill:#2c3e50,stroke:#fff,stroke-width:1px,color:#fff;
classDef app fill:#2980b9,stroke:#fff,stroke-width:1px,color:#fff;
classDef database fill:#e67e22,stroke:#fff,stroke-width:2px,color:#fff;
classDef analytics fill:#27ae60,stroke:#fff,stroke-width:1px,color:#fff;
class Client client;
class API,OrderSvc,ReadSvc app;
class MySQLPrimary,MySQLReplica1,MySQLReplica2 database;
class Kafka,ClickHouse,AnalyticsPanel analytics;
API Design and Interface Contracts
Transactional operations require robust REST endpoints that map neatly to relational write bounds.
1. Checkout Order Creation Payload (POST /api/v1/orders)
Client applications submit structured checkout payloads containing user profiles, shipping paths, and a specific array of item identifiers:
{
"userId": 9920194,
"shippingAddressId": 48201,
"currency": "USD",
"items": [
{
"productId": 2049,
"quantity": 2
},
{
"productId": 5092,
"quantity": 1
}
]
}
2. Transaction Response Contract
The application returns HTTP 201 Created upon successful transaction commit, providing structured metadata:
{
"orderId": 882019401,
"status": "pending",
"subtotal": 129.97,
"taxAmount": 10.40,
"shippingAmount": 0.00,
"totalAmount": 140.37,
"createdAt": "2026-05-23T02:30:00.042Z"
}
Low-Level Design & Component Mechanics
To achieve maximum data integrity and low latency, we construct a fully normalized relational Entity-Relationship Diagram (ERD), followed by concrete DDL declarations.
1. Database Entity-Relationship Diagram (ERD)
erDiagram
users ||--o{ addresses : "has"
users ||--o{ orders : "places"
addresses ||--o{ orders : "receives"
products ||--o{ product_prices : "has historical"
products ||--o{ order_items : "sold in"
products ||--o{ inventory_transactions : "audited by"
orders ||--|{ order_items : "contains"
orders ||--o{ inventory_transactions : "references"
users {
bigint id PK
varchar email UK
varchar name
enum status
datetime created_at
datetime updated_at
datetime deleted_at
}
addresses {
bigint id PK
bigint user_id FK
varchar line1
varchar city
varchar postal_code
char country
}
products {
bigint id PK
varchar sku UK
varchar name
enum status
datetime created_at
}
product_prices {
bigint id PK
bigint product_id FK
decimal price
char currency
datetime valid_from
datetime valid_until
}
orders {
bigint id PK
bigint user_id FK
bigint shipping_address_id FK
enum status
decimal total_amount
varchar payment_intent_id
datetime created_at
}
order_items {
bigint id PK
bigint order_id FK
bigint product_id FK
int quantity
decimal unit_price
}
inventory_transactions {
bigint id PK
bigint product_id FK
int delta
enum type
bigint reference_id FK
datetime created_at
}
2. High-Performance MySQL 8.0 DDL Implementations
Below is the complete SQL implementation schema, utilizing optimized InnoDB parameters, appropriate character sets, foreign keys, and covering indexes.
-- Create database with high-performance default collations
CREATE DATABASE IF NOT EXISTS ecommerce
CHARACTER SET utf8mb4
COLLATE utf8mb4_0900_ai_ci;
USE ecommerce;
-- 1. Users Table: standardized constraints and soft deletes
CREATE TABLE users (
id BIGINT UNSIGNED AUTO_INCREMENT,
email VARCHAR(320) NOT NULL, -- RFC 5321 max length
name VARCHAR(255) NOT NULL,
status ENUM('active', 'suspended', 'deleted') NOT NULL DEFAULT 'active',
-- Mandatory Audit Columns
created_at DATETIME(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3),
updated_at DATETIME(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3) ON UPDATE CURRENT_TIMESTAMP(3),
deleted_at DATETIME(3) NULL DEFAULT NULL,
CONSTRAINT pk_users PRIMARY KEY (id),
CONSTRAINT uq_users_email UNIQUE (email) -- Prevents duplicate registrations
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
-- 2. Addresses Table: normalized to support multiple delivery targets
CREATE TABLE addresses (
id BIGINT UNSIGNED AUTO_INCREMENT,
user_id BIGINT UNSIGNED NOT NULL,
line1 VARCHAR(255) NOT NULL,
line2 VARCHAR(255) NULL,
city VARCHAR(100) NOT NULL,
state VARCHAR(100) NOT NULL,
postal_code VARCHAR(20) NOT NULL,
country CHAR(2) NOT NULL, -- ISO 3166-1 alpha-2 code
is_default BOOLEAN NOT NULL DEFAULT FALSE,
deleted_at DATETIME(3) NULL DEFAULT NULL,
CONSTRAINT pk_addresses PRIMARY KEY (id),
CONSTRAINT fk_addresses_user FOREIGN KEY (user_id) REFERENCES users (id) ON DELETE RESTRICT,
INDEX idx_addresses_user_default (user_id, is_default)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
-- 3. Products Table: indexed catalog listings
CREATE TABLE products (
id BIGINT UNSIGNED AUTO_INCREMENT,
sku VARCHAR(64) NOT NULL, -- Unique stock keeping unit code
name VARCHAR(512) NOT NULL,
description TEXT NULL,
category_id BIGINT UNSIGNED NOT NULL,
status ENUM('active', 'draft', 'discontinued') NOT NULL DEFAULT 'draft',
created_at DATETIME(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3),
updated_at DATETIME(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3) ON UPDATE CURRENT_TIMESTAMP(3),
deleted_at DATETIME(3) NULL DEFAULT NULL,
CONSTRAINT pk_products PRIMARY KEY (id),
CONSTRAINT uq_products_sku UNIQUE (sku),
INDEX idx_products_category_status (category_id, status), -- Compound: catalog filters
FULLTEXT INDEX ft_products_name (name) -- Full-text keyword search
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
-- 4. Product Prices Table: historical pricing registry
CREATE TABLE product_prices (
id BIGINT UNSIGNED AUTO_INCREMENT,
product_id BIGINT UNSIGNED NOT NULL,
price DECIMAL(12,4) NOT NULL, -- Supports multi-currency decimal scales
currency CHAR(3) NOT NULL DEFAULT 'USD', -- ISO 4217 currency key
valid_from DATETIME(3) NOT NULL,
valid_until DATETIME(3) NULL, -- NULL means currently active
CONSTRAINT pk_prices PRIMARY KEY (id),
CONSTRAINT fk_prices_product FOREIGN KEY (product_id) REFERENCES products (id) ON DELETE RESTRICT,
CONSTRAINT chk_price_positive CHECK (price >= 0.0000),
INDEX idx_prices_product_timeline (product_id, valid_from, valid_until) -- Fast historic lookup
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
-- 5. Orders Table: core OLTP transaction registry
CREATE TABLE orders (
id BIGINT UNSIGNED AUTO_INCREMENT,
user_id BIGINT UNSIGNED NOT NULL,
shipping_address_id BIGINT UNSIGNED NOT NULL,
status ENUM('pending', 'confirmed', 'processing', 'shipped', 'delivered', 'cancelled', 'refunded')
NOT NULL DEFAULT 'pending',
currency CHAR(3) NOT NULL DEFAULT 'USD',
subtotal DECIMAL(12,4) NOT NULL,
tax_amount DECIMAL(12,4) NOT NULL DEFAULT 0.0000,
shipping_amount DECIMAL(12,4) NOT NULL DEFAULT 0.0000,
total_amount DECIMAL(12,4) NOT NULL,
-- Stripe Payment Tracking
payment_intent_id VARCHAR(255) NULL,
paid_at DATETIME(3) NULL,
-- Audit & Life-cycle
created_at DATETIME(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3),
updated_at DATETIME(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3) ON UPDATE CURRENT_TIMESTAMP(3),
cancelled_at DATETIME(3) NULL,
cancel_reason VARCHAR(512) NULL,
CONSTRAINT pk_orders PRIMARY KEY (id),
CONSTRAINT fk_orders_user FOREIGN KEY (user_id) REFERENCES users (id) ON DELETE RESTRICT,
CONSTRAINT fk_orders_address FOREIGN KEY (shipping_address_id) REFERENCES addresses (id) ON DELETE RESTRICT,
INDEX idx_orders_user_created (user_id, created_at DESC), -- "My Orders" customer portal
INDEX idx_orders_status_created (status, created_at), -- Worker queue pull
INDEX idx_orders_payment_intent (payment_intent_id) -- Webhook idempotency check
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
-- 6. Order Items Table: normalized order details (no raw text blobs)
CREATE TABLE order_items (
id BIGINT UNSIGNED AUTO_INCREMENT,
order_id BIGINT UNSIGNED NOT NULL,
product_id BIGINT UNSIGNED NOT NULL,
quantity INT UNSIGNED NOT NULL,
unit_price DECIMAL(12,4) NOT NULL, -- Frozen price snapshot at checkout
line_total DECIMAL(12,4) NOT NULL,
CONSTRAINT pk_order_items PRIMARY KEY (id),
CONSTRAINT fk_items_order FOREIGN KEY (order_id) REFERENCES orders (id) ON DELETE RESTRICT,
CONSTRAINT fk_items_product FOREIGN KEY (product_id) REFERENCES products (id) ON DELETE RESTRICT,
INDEX idx_items_order (order_id),
INDEX idx_items_product (product_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
-- 7. Inventory Transactions Table: append-only ledger for flash sales
CREATE TABLE inventory_transactions (
id BIGINT UNSIGNED AUTO_INCREMENT,
product_id BIGINT UNSIGNED NOT NULL,
delta INT NOT NULL, -- Positive: restock, Negative: sales reservation
type ENUM('purchase', 'sale', 'return', 'adjustment', 'reservation') NOT NULL,
reference_id BIGINT UNSIGNED NULL, -- Corresponds to order_id or ingestion batch
created_at DATETIME(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3),
CONSTRAINT pk_inventory PRIMARY KEY (id),
CONSTRAINT fk_inv_product FOREIGN KEY (product_id) REFERENCES products (id) ON DELETE RESTRICT,
INDEX idx_inv_product_timeline (product_id, created_at, delta) -- Fast sum aggregations
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
Scaling Challenges & Production Bottlenecks
1. The Naive Stock Update Deadlock
In standard e-commerce designs, products have a single stock column. When $10,000$ concurrent buyers attempt a checkout transaction during a flash sale, they run:
-- Anti-Pattern: Triggers table locks and deadlocks under concurrent execution
UPDATE products SET stock = stock - 1 WHERE id = 1290;
Why it crashes:
InnoDB locks rows in index order. If transaction $T_1$ updates Product A then Product B, and transaction $T_2$ updates Product B then Product A, a Deadlock occurs. Under high write concurrency, lock escalation forces checkout latency to skyrocket and crashes transaction pipelines.
The Ledger Solution:
We decouple inventory updates into a high-concurrency, append-only inventory ledger (inventory_transactions). Instead of locking the products table row for writes, checkouts append a negative delta (delta = -2). Appends execute instantly using row-level inserts, eliminating lock contention.
2. High-Cardinality Indexing Memory Limits
Adding indexes to speed up every query pattern creates a secondary bottleneck: Index Bloat. Every index consumes memory within the InnoDB buffer pool. If indexes exceed physical memory sizes, MySQL is forced to swap pages to disk, causing query speeds to plunge.
Back-of-the-Envelope Estimation:
- Table Size: $100,000,000$ order rows.
- Primary Key:
BIGINT($8$ bytes) + Secondary Key onpayment_intent_id(VARCHAR(255)averaging $32$ bytes). - Storage Footprint: $$\text{Leaf Node Size} = 100,000,000 \times (8\text{ B} + 32\text{ B} + 38\text{ B overhead}) \approx 7.8\text{ GB per index}$$
- Result: Adding five unnecessary indexes will demand over $40\text{ GB}$ of dedicated RAM just to store index maps in the buffer pool.
- Mitigation: Leverage Covering Indexes (e.g.,
idx_orders_user_created) and keep index columns narrow.
Technical Trade-offs & Strategic Compromises
When architecting a database schema at scale, we must balance normalization strictness against query performance.
| Database Design Choice | Pros | Cons | OLTP Write/Read Impact |
|---|---|---|---|
| Fully Normalized Schema (3NF) | * Guarantees absolute data consistency. * Zero duplicate data prevents anomalies. |
* Requires complex, multi-join queries to assemble orders, increasing read latency. | * OLTP Write: Extremely Fast (Single table writes) * OLTP Read: Slow (Demands Joins) |
Pessimistic Locking (SELECT FOR UPDATE) |
* Zero chance of overselling inventory during concurrent sales. | * Blocks parallel checkout threads, creating queue bottlenecks. | * Write Latency: High under load * Data Safety: 100% Guaranteed |
| Optimistic Concurrency Control (OCC) | * Non-blocking read-writes; highly scalable. | * High rate of transaction rollbacks (retry overhead) during massive concurrent flash sales. | * Scalability: Excellent * Conflict Cost: High under hotkeys |
| Append-Only Inventory Ledger | * Eliminates row lock contentions and deadlocks completely. | * Requires summing transactions to compute current stock quantities. | * Write Throughput: Ultra-High * Compute Overhead: Medium (requires cached summary tables) |
Failure Scenarios and Fault Tolerance
1. Flash Sale Hotspots and Inventory Overselling
When inventory drops to $0$, how do we prevent overselling without locking the inventory transaction table?
Fault-Tolerance Mitigation:
We execute atomic inventory allocation using a single check-and-insert query within a transaction. We calculate the sum of available stock in real-time, executing the insert only if available inventory satisfies the purchase volume:
START TRANSACTION;
-- Atomic check-and-reserve transaction
INSERT INTO inventory_transactions (product_id, delta, type, reference_id)
SELECT 2049, -2, 'reservation', 882019401
FROM (
-- Sum all historical inventory transactions for the product
SELECT IFNULL(SUM(delta), 0) AS available
FROM inventory_transactions
WHERE product_id = 2049
) AS stock
WHERE stock.available >= 2;
-- The application checks affected rows:
-- If Rows Affected = 1: Reservation succeeded. Commit transaction.
-- If Rows Affected = 0: Out of stock! Rollback transaction immediately.
2. Replica Lag Cascades
If a database replica experiences replication lag (due to long-running analytical queries), a buyer might purchase a product, receive a success page, and refresh the screen—only to see the order missing because the read replica has not synced the write.
Fault-Tolerance Mitigation:
- Write-to-Read Pinning: Pin a buyer's requests to the MySQL Primary node for $5$ seconds following any write command.
- Session Consistency Tokens: Attach the latest transaction binlog position (
GTID) to the user's session token. Replicas must evaluate whether they have reached this position before responding; if not, the query routes to the Primary.
Staff Engineer Perspective
[!TIP] Leverage
pt-online-schema-changefor Large Tables: Never run directALTER TABLEcommands on active tables containing more than $1\text{ Million}$ rows. It locks tables, starving connection pools. Use Open-Source schema change tools (like Percona'spt-oscor GitHub'sgh-ost). These tools build a shadow table, sync delta logs via triggers/binlogs asynchronously, and perform an atomic cutover in milliseconds.
Verbal Script & Mock Interview
Verbal Script: High-Concurrency Schema Design
Interviewer: "We are designing a database schema for an e-commerce platform that processes 1 million orders a day. During flash sales, thousands of users try to buy the same item concurrently. How do you design the schema to prevent race conditions like overselling, while keeping checkout transactions under 50ms?"
Candidate: "To design a database capable of handling $1\text{ Million}$ orders a day while keeping checkout times under $50\text{ ms}$, we must address two core scaling challenges: schema normalization boundaries and write lock contention.
First, I would design a highly normalized schema using InnoDB. A critical decision is separating Product Details from Product Pricing and Inventory.
To resolve the 'price change after purchase' vulnerability, I will design a product_prices table that tracks historic, interval-based pricing with valid_from and valid_until timestamps. When an order is placed, the price paid is frozen as an immutable copy directly within the order_items table.
Second, to eliminate the hot-key deadlock bottleneck during concurrent flash sales, I will completely avoid the anti-pattern of updating a single stock column in the products table. Instead, I will implement an Append-Only Inventory Ledger called inventory_transactions.
When a buyer completes a checkout, rather than acquiring a write lock on the product row, we append an atomic, row-level reservation insert. The query sums all historical transactions for that product and only succeeds if the available quantity is sufficient:
INSERT INTO inventory_transactions (product_id, delta, type, reference_id)
SELECT ?, -?, 'reservation', ?
FROM (
SELECT IFNULL(SUM(delta), 0) AS available
FROM inventory_transactions WHERE product_id = ?
) AS stock
WHERE stock.available >= ?;
If the rows affected returned is $1$, the transaction is committed; if $0$, we immediately abort and return an out-of-stock response, taking less than $10\text{ ms}$ with zero lock contention.
Third, I will configure a robust indexing strategy. For our highest-frequency queries, such as retrieving customer order history, I will implement composite indexes matching the query pattern. By creating an index on orders (user_id, created_at DESC), we cover both the filter and the sort, allowing the query to complete in single-digit milliseconds via index-only scans, bypassing expensive filesorts.
Finally, to handle analytical reporting without impacting OLTP checkout performance, I will configure semi-synchronous primary-replica replication. I will direct all transactional checkouts to the Primary node, while load-balancing catalog browsing and analytics queries to read replicas, protecting our checkout pipelines from database resource starvation."