Mental Model
GROUP BYsquashes your data into a summary. A Window Function gives you the summary while letting you keep all the original detail rows intact.
Introduced in MySQL 8.0, Window Functions revolutionized how we write analytical queries. Before this, calculating a "running total" or "top 3 per category" required complex self-joins or temporary tables. Now, it's elegant and native.
1. The Syntax of a Window Function
A window function requires the OVER() clause.
SELECT
employee_id,
department,
salary,
AVG(salary) OVER(PARTITION BY department) as dept_avg_salary
FROM employees;
Notice that we get to see every single employee's row, but we also get the average salary of their specific department attached to that row. GROUP BY would have destroyed the employee_id level of detail.
2. Ranking Functions (The Interview Favorites)
Ranking data within groups is the most common use case for window functions, and a guaranteed interview question.
- ROW_NUMBER(): Assigns a unique, sequential integer to each row in the partition (1, 2, 3, 4).
- RANK(): If there's a tie, they get the same rank, but the next number is skipped (1, 1, 3, 4).
- DENSE_RANK(): If there's a tie, they get the same rank, but the next number is NOT skipped (1, 1, 2, 3).
Example: Top 2 highest-paid employees per department
WITH RankedEmployees AS (
SELECT
name,
department,
salary,
DENSE_RANK() OVER(PARTITION BY department ORDER BY salary DESC) as salary_rank
FROM employees
)
SELECT * FROM RankedEmployees WHERE salary_rank <= 2;
3. Navigating Rows: LAG() and LEAD()
If you are building financial dashboards or tracking changes over time, you often need to compare a row to the row before it (e.g., Month-over-Month growth).
- LAG(): Looks at the previous row.
- LEAD(): Looks at the next row.
SELECT
month,
revenue,
LAG(revenue, 1) OVER(ORDER BY month) as prev_month_revenue,
revenue - LAG(revenue, 1) OVER(ORDER BY month) as monthly_growth
FROM monthly_sales;
Practice Question
Scenario: You have a sales table. You need to calculate a "Running Total" of revenue ordered by date.
Question: Write the window function to calculate the running total.
View Answer
SELECT
date,
revenue,
SUM(revenue) OVER(ORDER BY date) as running_total
FROM sales;
When you use an aggregate function like SUM() with OVER(ORDER BY ...), MySQL automatically applies it as a cumulative sum from the start of the window up to the current row.
Technical Trade-offs: Architectural Decision
| Strategy | Scalability | Complexity | Operational Cost | Performance |
|---|---|---|---|---|
| Monolithic | Low | Low | Low | Fast (Local) |
| Microservices | Very High | High | High | Slower (Network) |
| Serverless | Infinite | Medium | Variable | Variable (Cold Starts) |
Production Readiness Checklist
Before deploying this architecture to a production environment, ensure the following Staff-level criteria are met:
- High Availability: Have we eliminated single points of failure across all layers?
- Observability: Are we exporting structured JSON logs, custom Prometheus metrics, and OpenTelemetry traces?
- Circuit Breaking: Do all synchronous service-to-service calls have timeouts and fallbacks (e.g., via Resilience4j)?
- Idempotency: Can our APIs handle retries safely without causing duplicate side effects?
- Backpressure: Does the system gracefully degrade or return HTTP 429 when resources are saturated?
Verbal Interview Script
Interviewer: "How would you find the 3rd highest salary in the company without using LIMIT or Window Functions?"
Candidate: "While Window Functions like DENSE_RANK() are the modern and most efficient way to solve Top-N problems, in legacy systems without them, we can use a correlated subquery. We would query the distinct salaries and count how many salaries are strictly greater than the current one. If exactly 2 salaries are greater, then the current one is the 3rd highest. However, this is an $O(N^2)$ operation and scales terribly, which is exactly why Window Functions were introduced."