Lesson 69 of 105 11 minFlagship

System Design: Designing Twitter (Timeline and News Feed)

A deep dive into the architecture of Twitter (now X). Learn how to handle millions of tweets per second using Fan-out on Write (Push) vs. Fan-out on Read (Pull) models, sharded databases, and real-time celebrity feed merging.

Reading Mode

Hide the curriculum rail and keep the lesson centered for focused reading.

Key Takeaways

  • **Hybrid Push/Pull Feed Mechanics:** Regular users use fan-out on write (push) for low-latency reads, while high-follower celebrity accounts bypass push to prevent DB thrashing, merging on-the-fly via fan-out on read (pull).
  • **Redis ZSET Precomputation:** Timelines are cached as Redis Sorted Sets (`ZSET`), capped at 800 items, using Tweet ID (Snowflake) as the score for fast retrieval.
  • **Geospatial Latency Mitigation:** Multi-region active-active clusters handle local writes with asynchronous replication, utilizing conflict-free replication topologies.
Recommended Prerequisites
System Design Interview FrameworkSystem Design: Designing a Distributed Task Scheduler

Premium outcome

From vague architecture answers to staff-level trade-off thinking.

Backend engineers preparing for senior, staff, and architecture rounds.

What you unlock

  • A reusable system design answer framework for ambiguous prompts
  • Clear language for consistency, scaling, and reliability trade-offs
  • Case-study depth across feeds, payments, storage, and messaging systems

Designing a real-time microblogging platform like Twitter (now X) is a flagship system design problem. The system must support massive read throughput, real-time message broadcasting, and rapid feed assembly under the constraint of extreme write spikes (e.g., during live sports events).

The core technical challenge is not storing millions of tweets, but delivering them to hundreds of millions of active users' home timelines with sub-second latency.


1. Core Requirements & Scale Constraints

Functional Requirements

  • Post a Tweet: Users can publish new text tweets (max 280 characters) with optional images or videos.
  • Home Timeline: Users can view a chronological feed of tweets from people they follow (read path).
  • User Timeline: Users can view their own historical tweets.
  • Follow/Unfollow: Users can follow other users, immediately updating their feed sources.

Non-Functional Constraints (SLAs)

  • High Availability: The read path must achieve $99.999%$ (Five Nines) availability.
  • Ultra-Low Latency: Home timeline generation must render in $\le 200\text{ms}$ at the $p99$ percentile.
  • Eventual Consistency: Tweet delivery to followers can lag up to 5 seconds under normal load.
  • Massive Scalability:
    • Daily Active Users (DAU): 500 Million.
    • Writes (Tweet publishing): Average 5,000 tweets per second (TPS), peak 25,000 TPS.
    • Reads (Timeline fetches): Average 500,000 queries per second (QPS).

Back-of-the-Envelope Estimates

  1. Storage Capacity (Tweets):
    • 500M tweets/day $\times$ 300 bytes (text + metadata) = 150 GB of text storage per day.
    • Over 5 years: $150\text{ GB/day} \times 365 \times 5 \approx 273.7\text{ TB}$.
  2. Media Storage Capacity:
    • Assume 10% of tweets contain images (1 MB avg) and 1% contain videos (10 MB avg).
    • Images: $50\text{M} \times 1\text{ MB} = 50\text{ TB/day}$.
    • Videos: $5\text{M} \times 10\text{ MB} = 50\text{ TB/day}$.
    • Total Media: 100 TB/day. Over 5 years (excluding deduplication): $\approx 182.5\text{ PB}$.
  3. Network Bandwidth:
    • Ingress (Writes): $100\text{ TB/day} \div 86400\text{ seconds} \approx 1.15\text{ GB/s}$ (9.2 Gbps).
    • Egress (Reads): Assuming each user views 10 timelines per day (each feed fetching 20 tweets with media): $\approx 115\text{ GB/s}$ (920 Gbps) read traffic.

2. API Design & Core Contracts

We expose standard REST API endpoints for key operations. All JSON payloads are serialized over HTTPS.

Post a Tweet

POST /v1/tweets
Authorization: Bearer <JWT_TOKEN>
Content-Type: application/json

{
  "text": "Architecting a hybrid push/pull timeline at scale! #systemdesign #scalability",
  "media_ids": ["img_987213987", "vid_872364812"]
}

Response:

{
  "status": "success",
  "data": {
    "tweet_id": "1423894712093847552",
    "user_id": "89374923",
    "text": "Architecting a hybrid push/pull timeline at scale! #systemdesign #scalability",
    "media_urls": [
      "https://media.codesprintpro.com/img_987213987.png"
    ],
    "created_at": "2026-05-22T10:45:00Z"
  }
}

Fetch Home Timeline

GET /v1/timeline/home?limit=20&cursor=1423894712093847552
Authorization: Bearer <JWT_TOKEN>

Response:

{
  "data": {
    "tweets": [
      {
        "tweet_id": "1423894723908234123",
        "author": {
          "user_id": "1009283",
          "name": "Staff Engineer"
        },
        "text": "Consistency always yields to availability in high-volume social feeds.",
        "created_at": "2026-05-22T10:46:12Z"
      }
    ],
    "next_cursor": "1423894709823749812"
  }
}

3. High-Level Design (HLD)

To handle the 100:1 read-to-write ratio, the system precomputes home timelines for active users. The overall architecture relies on a Hybrid Push/Pull model to handle celebrity accounts.

flowchart TD
    Client[Mobile/Web Client] -->|HTTPS| APIGateway[API Gateway / Envoy]
    
    %% Write Path (Push)
    APIGateway -->|Write Tweet| TweetService[Tweet Ingestion Service]
    TweetService -->|Write Log| Kafka[Kafka Event Bus]
    Kafka -->|Consume Event| FanoutWorkers[Fanout Workers]
    FanoutWorkers -->|Lookup Followers| FollowService[Social Graph Service]
    FanoutWorkers -->|Push Tweet ID| RedisCluster[(Redis Timeline Cache)]
    
    %% Storage layer
    TweetService -->|Save Metadata| DocDB[(ScyllaDB Tweet Store)]
    TweetService -->|Upload Assets| S3[(Amazon S3 Media Storage)]
    
    %% Read Path (Timeline Assembly)
    APIGateway -->|Read Feed| TimelineService[Timeline Service]
    TimelineService -->|Get Normal Feed| RedisCluster
    TimelineService -->|Check Celebrities| FollowService
    TimelineService -->|Merge Active Feeds| MergeEngine[Timeline Merger Engine]
    MergeEngine -->|Fetch Details| DocDB
    MergeEngine --> Client

4. Low-Level Design (LLD) & Data Models

Database Choice Rationale

  • Social Graph (Follows): We use a highly optimized relational database (e.g., PostgreSQL sharded by follower_id) or a graph database (e.g., Neo4j/Amazon Neptune). Due to the strict primary-key structure, sharded MySQL/Postgres is preferred for high-throughput node relationships.
  • Tweet Store: A wide-column store like ScyllaDB or Cassandra is perfect. It supports ultra-high write speeds, horizontal sharding, and structured query retrieval by clustering key.
  • Timeline Cache: Redis Cluster holds active user feeds in-memory using Sorted Sets (ZSET), where the key is the user_id, the member is the tweet_id, and the score is the tweet_id (Snowflake ID encodes timestamp).

SQL Table DDL declarations

Sharded Social Graph Schema (MySQL/PostgreSQL)

CREATE TABLE follows (
    follower_id BIGINT NOT NULL,
    followee_id BIGINT NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (follower_id, followee_id),
    KEY idx_followee (followee_id)
) ENGINE=InnoDB;

Sharding Strategy: Shard the follows table by follower_id using a consistent hashing algorithm. This ensures all followee relationships for a single user reside in the same physical database node, allowing immediate retrieval of followee lists.

ScyllaDB Tweet Store Schema

CREATE KEYSPACE tweet_keyspace WITH replication = {
    'class': 'NetworkTopologyStrategy', 
    'us-east': 3, 
    'us-west': 3
};

CREATE TABLE tweet_keyspace.tweets (
    user_id bigint,
    tweet_id bigint,
    content varchar,
    media_urls list<varchar>,
    created_at timestamp,
    PRIMARY KEY (user_id, tweet_id)
) WITH CLUSTERING ORDER BY (tweet_id DESC);

Partitioning Strategy: Partitioned by user_id to ensure all tweets from a single author are stored together. The clustering key tweet_id allows immediate time-ordered range scans for a user's profile feed.


5. Scaling Challenges & System Bottlenecks

The Fan-out Celebrity Problem (Hot Spots)

If a user with 80 Million followers (like Elon Musk or Cristiano Ronaldo) tweets, a pure Fan-out on Write model forces the Fanout Workers to write that tweet ID to 80 million separate Redis timeline caches. This causes:

  • Massive Redis CPU spikes.
  • Replication queue backpressure.
  • Out-of-memory crashes due to write amplification.

To mitigate this, we classify users into two tiers:

  1. Standard Users (Follower count < 25,000): Use Fan-out on Write (Push). When they tweet, their tweet ID is pushed to all their followers' Redis caches immediately.
  2. Celebrity Users (Follower count >= 25,000): Use Fan-out on Read (Pull). When they tweet, the write path completely bypasses fanout. The tweet is only committed to the ScyllaDB Tweet Store.

When a reader fetches their feed, the Timeline Merger Engine performs a hybrid merge:

  1. Fetch the user's precomputed home timeline from their Redis ZSET.
  2. Interrogate the Social Graph Service to find what celebrities the user follows.
  3. Query the ScyllaDB Tweet Store for the latest tweets of those followed celebrities.
  4. Merge the celebrity tweets with the precomputed normal timeline in-memory based on Tweet ID, sorting them chronologically.
sequenceDiagram
    autonumber
    actor User as Active User
    participant TS as Timeline Service
    participant Cache as Redis Timeline Cache
    participant SG as Social Graph DB
    participant DB as ScyllaDB Tweet Store

    User->>TS: GET /v1/timeline/home
    TS->>Cache: Fetch precomputed timeline (ZSET)
    Cache-->>TS: Return normal users' tweet IDs
    TS->>SG: Get followed celebrities list
    SG-->>TS: Return [Celebrity_A, Celebrity_B]
    TS->>DB: Fetch recent tweets of [Celebrity_A, Celebrity_B]
    DB-->>TS: Return celebrity tweets
    TS->>TS: Merge & sort chronologically (in-memory)
    TS-->>User: Return complete unified feed

6. Operational Trade-offs & CAP Theorem Realities

Consistency vs. Latency in Social Media Feeds

In designing Twitter's timeline feed, we choose an Availability-Consistent (AP) model over strict consistency (CP). Having every follower see a tweet at the exact same millisecond is completely unnecessary.

  • Push Path Latency: Pushing a tweet to 100,000 followers asynchronously through Kafka takes about 500ms to 2 seconds under normal conditions. This represents eventual consistency.
  • In-Memory Cache Eviction: Caching home timelines for all 500M DAU in Redis requires immense memory resources. To balance cost and performance, we apply an Active User Cache Eviction Strategy:
    • We only store precomputed timelines in Redis for users who have logged in within the last 15 days.
    • For inactive users (who log back in after 15 days), their timeline cache is empty. The system intercepts this "cache miss," reads the follow graph from PostgreSQL, queries ScyllaDB for the latest tweets from all followed users, and performs a complete on-the-fly rebuild of the Redis ZSET feed.

Replication Lag Challenges

When a user unfollows another user, we must immediately invalidate the unfollowed user's tweets from the follower's timeline. In a pure Push system, this requires an asynchronous worker to iterate through the follower's Redis ZSET and purge all tweet IDs belonging to the unfollowed user.

Under severe network partitions, if the Graph Database primary node processes the unfollow, but the replica utilized by the asynchronous fanout worker lags by several seconds, the system will read stale states and attempt to re-push tweets, leading to a "ghost tweet" glitch where unfollowed content continues to show up. To mitigate this, we stamp each Redis timeline cache with a local follow_epoch_version token. If the cache epoch mismatch is detected, the home timeline is dynamically invalidated and fully rebuilt from the sharded databases on-the-fly.


7. Failure Scenarios & High-Availability Resilience

A. Redis Cache Node Crash (Thundering Herd Prevention)

If a major Redis cache shard holding 5 million precomputed timelines crashes, all timeline read requests for those 5 million users would fall back to sharded database queries. This sudden influx of high-volume disk-join queries would instantly lock up database connection pools, taking down the entire site.

Mitigation:

  1. Fallback Standby Replicas: Every Redis master node is backed by an active-passive hot standby replica. If the master fails, Sentinel coordinates a failover in < 5 seconds.
  2. Dynamic Query Rate-Limiting & Jitter: When a cache miss occurs, the system utilizes a Distributed Lock with TTL (using Redis Redlock) so that only one background thread is allowed to query ScyllaDB to rebuild the timeline cache, while other concurrent requests serve a stale cached version or a cached subset with a random backoff jitter.

B. Kafka Fanout Worker Queue Backpressure

During a massive viral event (e.g. World Cup Final), write spikes scale to 30,000 tweets per second. If the Fanout Workers get saturated, the Kafka processing queue grows exponentially. Users will post tweets, but their followers won't see them on their timeline for minutes or hours.

Mitigation:

  1. Auto-Scaling Consumer Groups: Set up Kubernetes HPA (Horizontal Pod Autoscaler) monitoring the Kafka Consumer Lag metric. If consumer lag exceeds 50,000 messages, spin up additional Fanout Worker containers dynamically.
  2. Shedding Non-Critical Work: If lag continues to grow, degrade the fanout quality by bypassing normal push completely for all users with > 5,000 followers (temporarily lowering the celebrity threshold from 25,000), switching them to the Pull path to preserve queue throughput.
flowchart TD
    QueueLag{Kafka Consumer Lag > Threshold?}
    QueueLag -->|Yes| ScaleWorkers[Autoscale Fanout Workers]
    QueueLag -->|Critical Lag| LowerThreshold[Lower Celebrity Threshold to 5K]
    ScaleWorkers --> Rebalance[Consumer Group Rebalances Partitions]
    LowerThreshold --> PushToPull[Switch Medium Users to Pull Model]
    PushToPull --> ClearQueue[Reduce Fanout Write Amplification]

8. Candidate Verbal Script (Interview Guide)

Interviewer: "We have a working hybrid push/pull timeline feed. However, what happens during huge viral events—like the World Cup final—where millions of users post tweets concurrently, and read QPS scales to 3 million requests per second? How do you prevent cache starvation?"

Candidate: "To handle a massive concurrent write and read spike of this magnitude, we must apply layered caching, rate-limiting, and partition-isolation techniques.

First, we must prevent Cache Stampedes. If our Redis caches expire or get evicted under memory pressure during the peak of the event, the timeline service will attempt to fetch tweets directly from ScyllaDB. This would cause immediate database connection pool starvation. I would implement a probabilistic early expiration algorithm (XFetch) on our cached feeds. Instead of a hard TTL, when a cached timeline is read near its expiration, the background thread fires an asynchronous query to refresh the cache before it dies, ensuring a $100%$ cache hit rate.

Second, we must isolate celebrity hot-spot partitions. When a celebrity is tweeting multiple times per minute during a viral event, we can deploy a Local In-Memory Cache (e.g., Caffeine Cache) directly inside the container memory of our Timeline Services. This shields ScyllaDB and Redis from millions of concurrent queries for that single celebrity’s tweet payload.

Finally, we apply ingress rate-limiting and backpressure at the API Gateway. We configure token bucket limits based on tier levels, dropping non-essential background payloads (like profile updates or read-receipt logs) so that the core feed delivery path maintains $100%$ processing priority."

Want to track your progress?

Sign in to save your progress, track completed lessons, and pick up where you left off.