The Schema Design Challenge
Mental Model
Database schema design is architecture. Like all architecture decisions, it must be driven by the query patterns and access patterns you actually have — not the ones you imagine you might need.
A 40-person startup processes 1 million orders per day on their e-commerce platform. The initial schema was designed in a weekend, and now it is causing production problems at scale:
- Checkout queries take 8 seconds
- Inventory updates cause deadlocks during flash sales
- The analytics team can't run reports without bringing down production
- Every new feature requires a downtime window for schema changes
This case study walks through redesigning the schema to handle current load and future growth — and the process of getting there without a maintenance window.
The Starting Point: The Naive Schema
Here is the schema that got the company to $10M ARR but can't scale further:
-- The original, problematic schema
CREATE TABLE products (
id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(255),
description TEXT,
price DECIMAL(10,2),
stock INT, -- ← Problem: single stock column, no history
category VARCHAR(100),
created_at DATETIME,
deleted TINYINT(1) -- ← Problem: non-standard soft delete
);
CREATE TABLE orders (
id INT AUTO_INCREMENT PRIMARY KEY,
user_id INT,
product_list TEXT, -- ← CRITICAL problem: JSON blob in a text column
total DECIMAL(10,2),
status VARCHAR(50),
address TEXT, -- ← Problem: denormalized, unvalidated
created_at DATETIME
);
CREATE TABLE users (
id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(255),
email VARCHAR(255), -- ← No unique constraint!
password_hash VARCHAR(255),
created_at DATETIME
);
The problems with this schema:
orders.product_list TEXT— storing structured data as text kills MySQL's ability to query it. Every order analysis requires parsing text in application code.products.stock INT— no inventory history. When stock goes negative (race condition in flash sales), there's no audit trail.- No foreign key constraints — the database allows orphaned orders with non-existent user IDs.
- No indexes beyond primary keys — the 8-second checkout query does full table scans.
The Production-Ready Schema
Core Tables
-- Users: with unique constraint, standardized columns
CREATE TABLE users (
id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
email VARCHAR(320) NOT NULL, -- RFC 5321 max length
name VARCHAR(255) NOT NULL,
status ENUM('active', 'suspended', 'deleted') NOT NULL DEFAULT 'active',
-- Audit columns (mandatory on every table)
created_at DATETIME(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3),
updated_at DATETIME(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3) ON UPDATE CURRENT_TIMESTAMP(3),
deleted_at DATETIME(3) NULL DEFAULT NULL, -- soft delete
CONSTRAINT uq_users_email UNIQUE (email) -- prevents duplicate accounts
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
-- Addresses: normalized out of orders
CREATE TABLE addresses (
id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
user_id BIGINT UNSIGNED NOT NULL,
line1 VARCHAR(255) NOT NULL,
line2 VARCHAR(255),
city VARCHAR(100) NOT NULL,
state VARCHAR(100) NOT NULL,
postal_code VARCHAR(20) NOT NULL,
country CHAR(2) NOT NULL, -- ISO 3166-1 alpha-2
is_default BOOLEAN NOT NULL DEFAULT FALSE,
created_at DATETIME(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3),
deleted_at DATETIME(3) NULL DEFAULT NULL,
CONSTRAINT fk_addresses_user FOREIGN KEY (user_id) REFERENCES users(id),
INDEX idx_addresses_user_id (user_id)
) ENGINE=InnoDB;
-- Products: with proper status enum
CREATE TABLE products (
id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
sku VARCHAR(64) NOT NULL, -- merchant product code
name VARCHAR(512) NOT NULL,
description TEXT,
category_id BIGINT UNSIGNED NOT NULL,
status ENUM('active', 'draft', 'discontinued') NOT NULL DEFAULT 'draft',
created_at DATETIME(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3),
updated_at DATETIME(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3) ON UPDATE CURRENT_TIMESTAMP(3),
deleted_at DATETIME(3) NULL DEFAULT NULL,
CONSTRAINT uq_products_sku UNIQUE (sku),
INDEX idx_products_category_status (category_id, status), -- compound: category browsing
FULLTEXT INDEX ft_products_name (name) -- full-text search
) ENGINE=InnoDB;
The Price Table (Solved the "Price Changed After Order" Problem)
-- Prices: separate from products, supports historical pricing
CREATE TABLE product_prices (
id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
product_id BIGINT UNSIGNED NOT NULL,
price DECIMAL(12,4) NOT NULL, -- 4 decimal places for currency math
currency CHAR(3) NOT NULL DEFAULT 'USD', -- ISO 4217
valid_from DATETIME(3) NOT NULL,
valid_until DATETIME(3) NULL, -- NULL = currently active
CONSTRAINT fk_prices_product FOREIGN KEY (product_id) REFERENCES products(id),
INDEX idx_prices_product_valid (product_id, valid_from, valid_until)
) ENGINE=InnoDB;
-- Current price view (for application use)
CREATE VIEW current_prices AS
SELECT p.id AS product_id, pp.price, pp.currency
FROM products p
JOIN product_prices pp ON pp.product_id = p.id
AND pp.valid_from <= NOW()
AND (pp.valid_until IS NULL OR pp.valid_until > NOW());
The Inventory System (Solved the Flash Sale Race Condition)
-- Inventory: append-only ledger, never update stock directly
CREATE TABLE inventory_transactions (
id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
product_id BIGINT UNSIGNED NOT NULL,
variant_id BIGINT UNSIGNED,
delta INT NOT NULL, -- positive = stock added, negative = reserved
type ENUM('purchase', 'sale', 'return', 'adjustment', 'reservation') NOT NULL,
reference_id BIGINT UNSIGNED, -- order_id or purchase_order_id
created_at DATETIME(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3),
created_by BIGINT UNSIGNED NOT NULL, -- user_id of who made the change
CONSTRAINT fk_inv_product FOREIGN KEY (product_id) REFERENCES products(id),
INDEX idx_inv_product_created (product_id, created_at)
) ENGINE=InnoDB;
-- Current stock is a derived view
CREATE VIEW current_inventory AS
SELECT
product_id,
SUM(delta) AS stock_quantity
FROM inventory_transactions
GROUP BY product_id;
-- Safe inventory reservation (prevents overselling)
-- Run in a transaction:
INSERT INTO inventory_transactions (product_id, delta, type, reference_id, created_by)
SELECT ?, -?, 'reservation', ?, ?
FROM (
SELECT SUM(delta) AS available FROM inventory_transactions WHERE product_id = ?
) AS stock
WHERE stock.available >= ?;
-- If 0 rows affected: insufficient stock
-- If 1 row affected: reservation successful
Orders (The Core Transactional Table)
-- Orders: normalized with proper line items
CREATE TABLE orders (
id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
user_id BIGINT UNSIGNED NOT NULL,
shipping_address_id BIGINT UNSIGNED NOT NULL,
status ENUM('pending', 'confirmed', 'processing', 'shipped', 'delivered', 'cancelled', 'refunded')
NOT NULL DEFAULT 'pending',
currency CHAR(3) NOT NULL DEFAULT 'USD',
subtotal DECIMAL(12,4) NOT NULL,
tax_amount DECIMAL(12,4) NOT NULL DEFAULT 0,
shipping_amount DECIMAL(12,4) NOT NULL DEFAULT 0,
total_amount DECIMAL(12,4) NOT NULL,
-- Payment tracking
payment_intent_id VARCHAR(255), -- Stripe PaymentIntent ID
paid_at DATETIME(3),
-- Audit
created_at DATETIME(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3),
updated_at DATETIME(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3) ON UPDATE CURRENT_TIMESTAMP(3),
cancelled_at DATETIME(3),
cancel_reason VARCHAR(500),
CONSTRAINT fk_orders_user FOREIGN KEY (user_id) REFERENCES users(id),
CONSTRAINT fk_orders_address FOREIGN KEY (shipping_address_id) REFERENCES addresses(id),
INDEX idx_orders_user_status (user_id, status), -- "my orders" query
INDEX idx_orders_status_created (status, created_at), -- order processing queue
INDEX idx_orders_payment_intent (payment_intent_id) -- Stripe webhook lookup
) ENGINE=InnoDB;
-- Order Line Items: replaces the product_list TEXT blob
CREATE TABLE order_items (
id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
order_id BIGINT UNSIGNED NOT NULL,
product_id BIGINT UNSIGNED NOT NULL,
quantity INT UNSIGNED NOT NULL,
unit_price DECIMAL(12,4) NOT NULL, -- price at time of purchase (snapshot)
line_total DECIMAL(12,4) NOT NULL,
CONSTRAINT fk_items_order FOREIGN KEY (order_id) REFERENCES orders(id),
CONSTRAINT fk_items_product FOREIGN KEY (product_id) REFERENCES products(id),
INDEX idx_items_order (order_id), -- get all items for an order
INDEX idx_items_product (product_id) -- sales analytics per product
) ENGINE=InnoDB;
The Indexing Strategy
Every index must justify its existence. Here is the reasoning for each index:
-- Query: "Show me my order history" (most frequent query, runs on every page load)
-- Pattern: WHERE user_id = ? ORDER BY created_at DESC LIMIT 20
-- Index: (user_id, created_at) -- covers both the filter and the sort
INDEX idx_orders_user_created (user_id, created_at DESC)
-- Query: "Pending orders for the processing queue" (runs every 30 seconds)
-- Pattern: WHERE status = 'pending' ORDER BY created_at ASC LIMIT 100
-- Index: (status, created_at) -- covers filter and sort
INDEX idx_orders_status_created (status, created_at)
-- Query: "Has this user bought this product?" (recommended-upsell query)
-- Pattern: WHERE user_id = ? AND product_id = ?
-- Index: (user_id, product_id) in order_items
INDEX idx_items_user_product (order_id) -- join through orders
-- ↑ Actually: use covering index by joining
The composite index column order rule: Put the equality condition columns first, then the range/sort columns. MySQL uses the index left-to-right and stops at the first range condition.
-- WRONG for this query: SELECT * FROM orders WHERE status = 'pending' AND created_at > '2026-01-01'
INDEX (created_at, status) -- MySQL uses created_at for range, can't use status after range
-- CORRECT:
INDEX (status, created_at) -- MySQL uses status for equality, created_at for range/sort
Zero-Downtime Schema Migrations
Every schema change at production scale must use the expand-contract pattern:
-- Example: Adding a 'notes' column to orders
-- STEP 1: Expand (backward compatible addition)
-- Deploy this first, with no application changes
ALTER TABLE orders
ADD COLUMN notes TEXT NULL DEFAULT NULL,
ALGORITHM=INSTANT; -- MySQL 8.0: instant for adding nullable columns
-- STEP 2: Update the application (reads and writes new column)
-- Deploy the application code that uses notes
-- Old and new code both work with the current schema
-- STEP 3: Contract (remove old pattern, if applicable)
-- No cleanup needed for adding a column, but if you're REMOVING a column:
-- 1. Remove all code references to the column
-- 2. Deploy
-- 3. Drop the column
ALTER TABLE orders DROP COLUMN old_column;
Migrations that require gh-ost or pt-online-schema-change:
- Adding an index to a large table (MySQL rebuilds the table)
- Changing a column's data type
- Adding a NOT NULL column without a DEFAULT
# Add an index to the orders table without locking (using pt-osc)
pt-online-schema-change \
--alter "ADD INDEX idx_orders_new_column (new_column)" \
--host=prod-primary.internal \
--database=ecommerce \
--table=orders \
--execute
Query Performance Before and After
| Query | Before | After | Change |
|---|---|---|---|
| User order history (20 rows) | 8,230ms | 4ms | 2,057x faster |
| Order processing queue | 1,200ms | 12ms | 100x faster |
| Product inventory check | 340ms | 8ms | 42x faster |
| Weekly revenue report | 45s | 2.1s | 21x faster |
The revenue report improvement came from moving analytics queries to a read replica and adding the appropriate covering index:
-- Analytics: weekly revenue by category
CREATE INDEX idx_orders_paid_at_status ON orders (paid_at, status);
CREATE INDEX idx_items_product_created ON order_items (product_id, order_id);
-- + denormalized category_id in order_items for reporting
The Migration Strategy (How to Get There)
Getting from the bad schema to the good one without downtime:
Week 1: Add new tables alongside old ones
- Create new orders_v2, order_items_v2 tables
- Dual-write: application writes to both old and new tables
Week 2: Migrate historical data
- Run a background job to backfill order_items_v2 from orders.product_list
- Verify row counts match
Week 3: Read from new tables
- Switch read queries to new tables
- Keep writing to both for 1 week as a safety net
Week 4: Deprecate old tables
- Stop writing to old tables
- Rename: orders → orders_deprecated_20260520
- Run for 30 days, then DROP
Key Takeaways
- Normalize first, then denormalize only where query patterns prove it is necessary.
- Composite indexes must match your query patterns exactly — the column order is not arbitrary.
- Every production schema needs soft deletes, audit columns, and a migration strategy that supports zero-downtime deploys.