Case Study: Designing a Content Moderation System (Meta/TikTok Scale)
Mental Model
Content moderation at exabyte scale is not a simple database lookup or a synchronous API call to a machine learning endpoint. It is an event-driven streaming pipeline that filters content in tiers: fast cryptographic and perceptual hashing eliminates known violations, unknown media is routed through asynchronous GPU inference clusters, and human review capacity is allocated according to content risk and real-time virality.
Requirements & Design Constraints
We need to build a content moderation pipeline capable of filtering user-uploaded text, images, and videos on a major global social platform.
Functional Requirements
- Multi-Media Processing: Moderate incoming text posts, static images, and video uploads.
- Tiered Filtering: Instantly block known harmful content (e.g., illegal material or violent media) before running advanced AI models.
- Asynchronous ML Analysis: Score media across multiple violation dimensions (e.g., hate speech, violence, nudity, fraud).
- Human-in-the-Loop Escalation: Route cases where AI models are highly uncertain to human review queues.
- Flexible Action Execution: Support diverse enforcement actions, including instant deletion, shadowbanning, user warning flags, and IP rate limits.
Non-Functional SLAs
- Massive Ingress Scale: Sustain a continuous load of 100,000 uploads per second globally.
- Latency Targets:
  - Synchronous blocking filter (known blacklists): Under 50ms.
  - Asynchronous ML pipeline evaluation: Under 2 seconds.
- Human Queue SLA: High-risk viral content must reach a human moderator within 5 minutes.
- Precision & Recall Targets: Achieve a false positive rate of less than 0.01% to prevent removing safe user content.
Back-of-the-Envelope Capacity Estimates
Let's estimate the volumetric data footprint and GPU computational power required to process 100,000 uploads per second.
1. Ingress Data Composition
- Total Inbound Throughput: $100,000\text{ uploads/second}$
- Traffic Composition:
  - Text Posts (80%): $80,000\text{ uploads/s}$ (Average size: $500\text{ bytes}$).
  - Static Images (19%): $19,000\text{ uploads/s}$ (Average size: $1\text{ MB}$).
  - Videos (1%): $1,000\text{ uploads/s}$ (Average size: $20\text{ MB}$, average length: 15 seconds).
- Ingress Volumetrics:
  - Text Bandwidth: $80,000 \times 500\text{ bytes} = 40\text{ MB/s}$
  - Image Bandwidth: $19,000 \times 1\text{ MB} = 19\text{ GB/s}$
  - Video Bandwidth: $1,000 \times 20\text{ MB} = 20\text{ GB/s}$
- Total Network Ingress: $\approx 39.04\text{ GB/s}$ continuous bandwidth capacity.
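The ingress arithmetic above can be reproduced with a few lines of Python. This is only a sanity-check sketch: the traffic shares and average sizes come from the estimates above, while the names `TRAFFIC_MIX` and `ingress_bytes_per_second` are made up for illustration. Integer percentages are used so the totals are exact.

```python
from typing import Dict

MB = 1_000_000  # decimal megabyte, in bytes

# media type -> (share of uploads in percent, average size in bytes)
TRAFFIC_MIX = {
    "text": (80, 500),
    "image": (19, 1 * MB),
    "video": (1, 20 * MB),
}


def ingress_bytes_per_second(uploads_per_second: int) -> Dict[str, int]:
    """Return the inbound bytes per second for each media type."""
    return {
        media: uploads_per_second * share_pct // 100 * avg_bytes
        for media, (share_pct, avg_bytes) in TRAFFIC_MIX.items()
    }


rates = ingress_bytes_per_second(100_000)
for media, rate in rates.items():
    print(f"{media:>5}: {rate / 1e9:.2f} GB/s")
print(f"total: {sum(rates.values()) / 1e9:.2f} GB/s")
```

Text is negligible next to media: images and video together account for essentially all of the roughly 39 GB/s.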
2. GPU Cluster Computation Planning (Images Only)
- Running an image classification model (e.g., a custom ResNet or CLIP model) takes $50\text{ ms}$ of GPU time per image.
- Total GPU-Seconds Required Per Second: $19,000\text{ images/s} \times 0.05\text{ s} = 950\text{ GPU-seconds/s}$.
- To prevent queue build-ups and handle peak bursts (2x standard load), we apply a headroom multiplier:
  - $\text{Total Active GPUs} = 950 \times 2 = 1,900\text{ Enterprise GPUs}$ (e.g., NVIDIA A100/H100 class) dedicated to image ML pipelines.
- Mitigation: To reduce this multi-million dollar GPU cost, we must intercept up to 70% of duplicate or slightly modified uploads using cheap CPU-bound perceptual hashing before the ML phase. At a 70% hit rate only $5,700\text{ images/s}$ reach the GPUs, which cuts the requirement to roughly $570$ GPUs.
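To see how sensitive the GPU budget is to the hash-filter hit rate, the sizing rule above can be written as a small function. This is a sketch under the same assumptions as the estimate (50 ms of GPU time per image, 2x burst headroom); `gpus_required` and its parameters are illustrative names, and integer arithmetic is used to avoid floating-point rounding in the ceiling.

```python
def gpus_required(
    images_per_second: int,
    gpu_ms_per_image: int,
    burst_multiplier: int,
    dedup_hit_pct: int = 0,
) -> int:
    """GPUs needed after the hash filter removes dedup_hit_pct percent of images."""
    if not 0 <= dedup_hit_pct <= 100:
        raise ValueError("dedup_hit_pct must be between 0 and 100")
    # GPU-milliseconds of work per second of wall clock, scaled by 100 for the percentage.
    numerator = images_per_second * (100 - dedup_hit_pct) * gpu_ms_per_image * burst_multiplier
    denominator = 100 * 1000  # percent scale * milliseconds per second
    return -(-numerator // denominator)  # ceiling division


print(gpus_required(19_000, 50, 2))                    # no hash filter
print(gpus_required(19_000, 50, 2, dedup_hit_pct=70))  # 70% intercepted by pHash
```

With no filtering the function returns the 1,900 GPUs from the estimate; a 70% hit rate brings that down to 570.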
API Design & Core Contracts
The system exposes high-performance REST and gRPC endpoints for media ingestion and human-escalation webhooks.
1. Ingest Media for Moderation
Invoked by the core media service immediately after a user uploads a file.
POST /api/v1/moderation/inspect
Request Payload:
{
"user_id": "usr_9831478",
"media_id": "med_8923147",
"media_type": "image",
"media_url": "https://storage.codesprintpro.com/media/usr_9831478/photo.webp",
"metadata": {
"ip_address": "198.51.100.42",
"timestamp": 1779435420000
}
}
Response Payload (Immediate Sync Output):
{
"status": "APPROVED_SYNC",
"inspection_ticket_id": "tkt_mod_092314897",
"action_taken": "NONE",
"async_pipeline_triggered": true
}
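A minimal sketch of how the ingest service might parse and validate this request before publishing it downstream. The field names come from the payload above; the `InspectRequest` class, the `parse_inspect_request` helper, and the set of allowed media types are assumptions made for illustration, not part of the documented contract.

```python
import ipaddress
import json
from dataclasses import dataclass

ALLOWED_MEDIA_TYPES = {"text", "image", "video"}  # assumed from the functional requirements


@dataclass(frozen=True)
class InspectRequest:
    user_id: str
    media_id: str
    media_type: str
    media_url: str
    ip_address: str
    timestamp_ms: int


def parse_inspect_request(raw_json: str) -> InspectRequest:
    """Parse the JSON body and reject malformed requests with ValueError."""
    body = json.loads(raw_json)
    try:
        meta = body["metadata"]
        request = InspectRequest(
            user_id=body["user_id"],
            media_id=body["media_id"],
            media_type=body["media_type"],
            media_url=body["media_url"],
            ip_address=meta["ip_address"],
            timestamp_ms=meta["timestamp"],
        )
    except KeyError as missing:
        raise ValueError(f"missing field: {missing}") from None
    if request.media_type not in ALLOWED_MEDIA_TYPES:
        raise ValueError(f"unsupported media_type: {request.media_type}")
    ipaddress.ip_address(request.ip_address)  # raises ValueError if malformed
    if not isinstance(request.timestamp_ms, int) or request.timestamp_ms <= 0:
        raise ValueError("timestamp must be a positive epoch value in milliseconds")
    return request


SAMPLE = """{
  "user_id": "usr_9831478", "media_id": "med_8923147", "media_type": "image",
  "media_url": "https://storage.codesprintpro.com/media/usr_9831478/photo.webp",
  "metadata": {"ip_address": "198.51.100.42", "timestamp": 1779435420000}
}"""
print(parse_inspect_request(SAMPLE))
```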
2. Human Review Outcome Webhook
Triggered by the Human Review Admin Panel to notify the core service of a moderator's decision.
POST /api/v1/moderation/action-webhook
Request Payload:
{
"inspection_ticket_id": "tkt_mod_092314897",
"reviewer_id": "rev_agent_2841",
"decision": "REMOVE_BLOCK",
"reason_code": "VIOLENCE_GRAPHIC",
"user_penalty_applied": "WARNING_STRIKE_1"
}
Response Payload:
{
"status": "success",
"action_propagated_ms": 140
}
High-Level Design (HLD)
To handle 100,000 QPS, we structure our pipeline as a decoupled, event-driven streaming system.
1. Real-time Content Moderation Streaming Pipeline HLD
This flowchart demonstrates how uploaded media flows through fast hash filters, parallel ML workers, and human-in-the-loop escalation queues.
graph TD
Client[User Client App] -->|1. Upload Post| API[API Gateway / Ingest Service]
API -->|2. Save Media| BlobStore[(Object Storage: S3)]
API -->|3. Publish Ingestion Event| Kafka[Apache Kafka: Raw Uploads Topic]
subgraph FastPath["Synchronous Fast Path (Under 50ms)"]
Kafka -->|4. Consume| FastFilter[Exact & Perceptual Hash Filter]
FastFilter -->|5. Match Blacklist| ActionEngine[Policy & Enforcement Engine]
ActionEngine -->|6. Instant Delete / Shadowban| Client
end
subgraph AsyncML["Asynchronous ML Pipeline (Under 2s)"]
FastFilter -->|7. Hash Cache Miss| MLRouter[ML Inference Router]
MLRouter -->|NLP Text Model| TextGPU[GPU Cluster A: BERT/LLM]
MLRouter -->|Image Classifier| ImageGPU[GPU Cluster B: CLIP/ResNet]
MLRouter -->|Video Keyframe Sampler| VideoGPU[GPU Cluster C: Video CNN]
end
TextGPU -->|Scores| Aggregator[Moderation Decision Aggregator]
ImageGPU -->|Scores| Aggregator
VideoGPU -->|Scores| Aggregator
Aggregator -->|Low Risk| Approve[Publish Active Post]
Aggregator -->|High Risk| ActionEngine
Aggregator -->|"Medium Risk (Unsure: 0.4 - 0.7)"| HumanQueue[Escalation Router]
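The three outgoing edges of the aggregator can be sketched as a routing function. The 0.4 to 0.7 uncertainty band comes from the diagram; taking the maximum score across violation dimensions is an assumption chosen for illustration (a real aggregator might weight dimensions differently), and the names `route`, `LOW_RISK_CEILING`, and `HIGH_RISK_FLOOR` are hypothetical.

```python
from typing import Dict

LOW_RISK_CEILING = 0.4  # below this: publish
HIGH_RISK_FLOOR = 0.7   # above this: enforce automatically


def route(scores: Dict[str, float]) -> str:
    """Map per-dimension risk scores to one of the three aggregator outcomes."""
    if not scores:
        raise ValueError("at least one model score is required")
    worst = max(scores.values())  # assumption: the riskiest dimension decides
    if worst > HIGH_RISK_FLOOR:
        return "ENFORCE"
    if worst >= LOW_RISK_CEILING:
        return "HUMAN_REVIEW"
    return "APPROVE"


print(route({"hate_speech": 0.05, "violence": 0.12}))  # low on every dimension
print(route({"nudity": 0.55, "fraud": 0.10}))          # inside the uncertain band
print(route({"violence": 0.93}))                       # confident violation
```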
2. Virality-Weighted Human Escalation Routing Flow
When AI is uncertain, we prioritize human review queues by cross-referencing risk scores with real-time post virality.
graph TD
EscalationRouter[Escalation Router] -->|1. Parse Event| ScoreEngine[Virality Priority Calculator]
ScoreEngine -->|2. Read Metrics| Analytics[(Real-time Views/Shares DB)]
ScoreEngine -->|"3. Calculate priority = Risk * log10(1 + Views)"| PriorityQueue[Elasticsearch Priority Review Index]
AdminPanel[Moderator Admin Panel] -->|4. Pull next highest priority| WorkerPool[Moderator Agent Pools]
WorkerPool -->|5. Submit Decision| ActionEngine[Policy & Enforcement Engine]
ActionEngine -->|6. Purge Blob / Ban User| BlobStore[(Object Storage: S3)]
ActionEngine -->|7. Active Learning Feedback| RetrainDB[(ML Training Pipeline DB)]
Low-Level Design (LLD) & Data Models
Let's dive into the core optimization component: Perceptual Hashing (pHash).
Perceptual Hashing vs Cryptographic Hashing
- Cryptographic Hash (MD5/SHA-256): Extremely sensitive to input changes. If a user resizes an image by 1 pixel or changes a single byte of metadata, the digest changes completely. It catches only byte-identical re-uploads and is useless for catching modified copies of illegal media.
- Perceptual Hash (pHash): Analyzes the visual structure of the image (typically using the Discrete Cosine Transform, DCT) and produces a 64-bit integer that summarizes the visual layout. Visually similar images produce hashes that differ in only a few bits.
- Hamming Distance: To compare two pHashes, we XOR them and count the set bits in the result, which gives the number of differing bit positions.
- Similarity Rule: If the Hamming distance between the pHash of an uploaded image and a blacklisted image is less than 10 (out of 64 bits), the images are very likely visually near-identical, allowing us to intercept modified copies on the CPU. The exact cutoff is a tunable trade-off between recall and false positives; the reference implementation below defaults to a stricter threshold of 8.
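To make the DCT step concrete, here is a dependency-free sketch of how a 64-bit pHash could be computed from a grayscale image that has already been resized to a small square (32x32 is a common choice). It uses an unnormalized DCT-II, keeps the 8x8 low-frequency block, and sets one bit per coefficient depending on whether it exceeds the median. The function names are illustrative, decoding and resizing are assumed to happen elsewhere, and a production system would use an optimized library rather than pure Python loops.

```python
import math
import random
import statistics
from typing import List


def dct_phash(pixels: List[List[float]], hash_size: int = 8) -> int:
    """Return a (hash_size * hash_size)-bit perceptual hash of a square grayscale image."""
    n = len(pixels)
    # Cosine basis for the low frequencies we keep: basis[u][x] = cos((2x + 1) * u * pi / 2n).
    basis = [
        [math.cos((2 * x + 1) * u * math.pi / (2 * n)) for x in range(n)]
        for u in range(hash_size)
    ]
    coeffs = []
    for u in range(hash_size):        # vertical frequency
        for v in range(hash_size):    # horizontal frequency
            total = 0.0
            for y in range(n):
                row, weight_y = pixels[y], basis[u][y]
                for x in range(n):
                    total += row[x] * weight_y * basis[v][x]
            coeffs.append(total)
    median = statistics.median(coeffs)
    bits = 0
    for coefficient in coeffs:
        bits = (bits << 1) | (1 if coefficient > median else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


rng = random.Random(42)
original = [[rng.uniform(0, 200) for _ in range(32)] for _ in range(32)]
brighter = [[p + 10 for p in row] for row in original]  # uniform brightness shift
unrelated = [[rng.uniform(0, 200) for _ in range(32)] for _ in range(32)]

print("original vs brighter :", hamming(dct_phash(original), dct_phash(brighter)))
print("original vs unrelated:", hamming(dct_phash(original), dct_phash(unrelated)))
```

A uniform brightness shift only changes the DC coefficient, so the brightened copy hashes almost identically, while the unrelated image differs in a large share of its bits.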
Python Perceptual Hash Deduplicator Implementation
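Below is a minimal, runnable Python reference implementation of a PerceptualHashDeduplicator. It combines an exact MD5 lookup with a linear scan over blacklisted pHashes using Hamming distance. The in-memory dictionary stands in for a real index, and the class is not synchronized, so a production deployment would need a sharded or indexed store and its own concurrency control.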
import hashlib
from typing import Dict, List, Optional, Set, Tuple


class PerceptualHashDeduplicator:
    def __init__(self, match_threshold: int = 8):
        """
        Initialize the Deduplicator.
        :param match_threshold: Max Hamming distance to consider two images visually identical.
        """
        self.threshold = match_threshold
        # Simulated database index: 64-bit pHash (int) -> Metadata info
        self.blacklist_index: Dict[int, str] = {}

    def hamming_distance(self, hash1: int, hash2: int) -> int:
        """
        Calculates the Hamming distance between two 64-bit integers.
        This represents the number of differing bits.
        """
        # XOR returns 1 for differing bits. bin().count("1") sums them up.
        xor_result = hash1 ^ hash2
        return bin(xor_result).count("1")

    def register_to_blacklist(self, image_phash: int, violation_reason: str):
        """
        Registers a known violating image pHash to the local database cache.
        """
        self.blacklist_index[image_phash] = violation_reason

    def check_exact_match(self, raw_bytes: bytes, exact_hash_db: Set[str]) -> bool:
        """
        Performs a rapid MD5 exact match check. Takes under 1ms.
        """
        md5_hash = hashlib.md5(raw_bytes).hexdigest()
        return md5_hash in exact_hash_db

    def check_fuzzy_match(self, query_phash: int) -> Tuple[bool, Optional[str], int]:
        """
        Scans the indexed pHashes to find visually similar violating images.
        """
        best_distance = 64
        matched_reason = None
        match_found = False
        for stored_hash, reason in self.blacklist_index.items():
            distance = self.hamming_distance(query_phash, stored_hash)
            if distance <= self.threshold:
                if distance < best_distance:
                    best_distance = distance
                    matched_reason = reason
                    match_found = True
        return match_found, matched_reason, best_distance


# Example Verification:
if __name__ == "__main__":
    deduplicator = PerceptualHashDeduplicator(match_threshold=8)

    # 1. Register a banned image (e.g. violent media)
    # Binary representation: 10101010 ... (64-bit integer representation)
    banned_phash = 0b1010101010101010101010101010101011110000111100001111000011110000
    deduplicator.register_to_blacklist(banned_phash, "GRAPHIC_VIOLENCE")

    # 2. Case A: User uploads a modified version (slightly cropped, changing a few bits)
    # We change 4 bits near the end (distance = 4)
    uploaded_phash_A = 0b1010101010101010101010101010101011110000111100001111000011111111

    # 3. Case B: User uploads a completely safe, unrelated image (completely different layout)
    uploaded_phash_B = 0b0000111100001111000011110000111100001111000011110000111100001111

    # Run fuzzy checks
    match_A, reason_A, dist_A = deduplicator.check_fuzzy_match(uploaded_phash_A)
    match_B, reason_B, dist_B = deduplicator.check_fuzzy_match(uploaded_phash_B)
    print(f"Image A Check: Match Found={match_A}, Reason={reason_A}, Hamming Distance={dist_A}")
    print(f"Image B Check: Match Found={match_B}, Reason={reason_B}, Hamming Distance={dist_B}")

    # Assertions for correctness
    assert match_A is True and reason_A == "GRAPHIC_VIOLENCE", "Deduplicator failed to catch cropped violating image!"
    assert match_B is False, "Deduplicator flagged safe image as violation!"
    print("Success: Perceptual deduplicator accurately detected visual modification while ignoring safe media.")
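The linear scan in check_fuzzy_match costs one comparison per blacklisted hash, which does not hold up once the blacklist grows to millions of entries. One common way to narrow the search, sketched here as an assumption and not as the article's chosen index, relies on the pigeonhole principle: if two 64-bit hashes differ in at most k bits, then splitting them into k + 1 bands leaves at least one band identical. Indexing each band lets a query fetch a small candidate set and verify only those. The class name `BandedPHashIndex` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Set, Tuple


class BandedPHashIndex:
    def __init__(self, threshold: int = 8, bits: int = 64):
        self.threshold = threshold
        num_bands = threshold + 1  # pigeonhole: <= threshold flips leave one band intact
        base, extra = divmod(bits, num_bands)
        self.bands: List[Tuple[int, int]] = []  # (shift, mask) per band
        shift = 0
        for i in range(num_bands):
            width = base + (1 if i < extra else 0)
            self.bands.append((shift, (1 << width) - 1))
            shift += width
        self.buckets: List[Dict[int, Set[int]]] = [defaultdict(set) for _ in self.bands]
        self.reasons: Dict[int, str] = {}

    def _band_keys(self, phash: int) -> List[int]:
        return [(phash >> shift) & mask for shift, mask in self.bands]

    def add(self, phash: int, reason: str) -> None:
        self.reasons[phash] = reason
        for bucket, key in zip(self.buckets, self._band_keys(phash)):
            bucket[key].add(phash)

    def query(self, phash: int) -> Optional[Tuple[str, int]]:
        """Return (reason, distance) of the closest blacklisted hash within threshold."""
        candidates: Set[int] = set()
        for bucket, key in zip(self.buckets, self._band_keys(phash)):
            candidates |= bucket.get(key, set())
        best: Optional[Tuple[str, int]] = None
        for candidate in candidates:  # verify only hashes sharing at least one band
            distance = bin(phash ^ candidate).count("1")
            if distance <= self.threshold and (best is None or distance < best[1]):
                best = (self.reasons[candidate], distance)
        return best


index = BandedPHashIndex(threshold=8)
index.add(0b1010101010101010101010101010101011110000111100001111000011110000, "GRAPHIC_VIOLENCE")
print(index.query(0b1010101010101010101010101010101011110000111100001111000011111111))
print(index.query(0b0000111100001111000011110000111100001111000011110000111100001111))
```

The two queries reuse the hashes from the verification example above: the lightly modified upload is found at distance 4, and the unrelated image returns no match.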
Scaling Nuances & Pipeline Optimizations
Operating a streaming moderation network at 100,000 QPS requires strategic pipeline optimizations.
1. High-Performance Video Keyframe Sampling
Running ML models on 30 frames per second of video represents a massive waste of GPU cycles.
- Our Optimization:
  - Sampling: We sample the video at 1 frame per second and discard the redundant intermediate frames.
  - Audio Track Isolation: We extract the audio track, convert it to a WAV chunk, and pass it to a Speech-to-Text (ASR) engine (e.g. Whisper) to generate a text transcript.
  - Tiered NLP routing: The text transcript is analyzed by fast CPU-bound regex patterns and light NLP models. If the transcript contains violent language, the video is flagged for review immediately, without running heavy computer vision models on the video frames.
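The saving from 1 frame per second sampling is easy to quantify with the figures already given (30 fps source video, 15-second average length, 1,000 video uploads per second). The helper names below are illustrative only.

```python
from typing import List

SOURCE_FPS = 30  # frame rate assumed in the text above
SAMPLE_FPS = 1   # keyframe sampling rate


def keyframe_timestamps(duration_s: int, sample_fps: int = SAMPLE_FPS) -> List[float]:
    """Timestamps, in seconds, of the frames that are sent to the vision model."""
    return [i / sample_fps for i in range(duration_s * sample_fps)]


def frames_per_video(duration_s: int, fps: int) -> int:
    return duration_s * fps


average_length_s = 15
videos_per_second = 1_000

full = frames_per_video(average_length_s, SOURCE_FPS)
sampled = len(keyframe_timestamps(average_length_s))
print(f"frames per video: {full} at full rate, {sampled} after sampling")
print(f"reduction factor: {full // sampled}x")
print(f"frames to classify per second: {videos_per_second * sampled}")
```

Sampling cuts each average video from 450 frames to 15, so the video cluster classifies about 15,000 frames per second instead of 450,000.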
2. Virality-Weighted Human Queues
If uncertain AI verdicts generate 10,000 escalations per hour, a flat FIFO (First-In, First-Out) queue would cause dangerous delays. A highly harmful post uploaded by a famous celebrity could get stuck behind a harmless post uploaded by a user with zero followers.
- Mitigation: We calculate a dynamic Priority Score for each item in the queue:
$$\text{Priority Score} = \text{ML Risk Confidence} \times \log_{10}(1 + \text{Real-time View Velocity})$$
  - A post with a medium risk score (0.5) that is gaining 10,000 views per minute scores about 2.0 and is boosted to the front of the queue, well inside the 5-minute SLA.
  - A post with the same risk score but 0 views scores 0 and remains in the low-priority backlog. The added 1 keeps the logarithm defined for posts that have no views yet.
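A small sketch of the priority calculation and queue ordering. It assumes the score is computed as risk times log10(1 + view velocity), where the added 1 is there so that a post with zero views gets a score of 0 instead of an undefined logarithm. The in-memory heap stands in for the Elasticsearch priority index, and the function names are hypothetical.

```python
import heapq
import math
from typing import List, Tuple


def priority_score(risk: float, views_per_minute: float) -> float:
    """Risk-weighted virality score; higher means review sooner."""
    return risk * math.log10(1.0 + max(0.0, views_per_minute))


def enqueue(queue: List[Tuple[float, str]], media_id: str, risk: float, views_per_minute: float) -> None:
    # heapq is a min-heap, so the score is negated to pop the highest priority first.
    heapq.heappush(queue, (-priority_score(risk, views_per_minute), media_id))


def next_for_review(queue: List[Tuple[float, str]]) -> str:
    return heapq.heappop(queue)[1]


review_queue: List[Tuple[float, str]] = []
enqueue(review_queue, "med_quiet", risk=0.5, views_per_minute=0)
enqueue(review_queue, "med_viral", risk=0.5, views_per_minute=10_000)
enqueue(review_queue, "med_steady", risk=0.6, views_per_minute=50)

print(f"viral score: {priority_score(0.5, 10_000):.2f}")
print("review order:", [next_for_review(review_queue) for _ in range(3)])
```

Even though the viral post arrived after the quiet one, it is popped first, followed by the steadily viewed post and then the zero-view post.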
3. Active Learning Feedback Loops
System accuracy degrades as bad actors adapt and create new types of harmful content.
- Mitigation: When a human moderator acts on an escalated item, the decision is published to a Kafka moderator-decisions topic. A background service copies each pair (raw media + label) to an offline Hadoop/Spark data lake used for active learning. Every night, the platform runs fine-tuning pipelines to retrain the classifier models, so they keep adapting to new evasion techniques.
Trade-offs & Architectural Decisions
1. Moderation Model: Synchronous Blocking vs. Asynchronous Pipeline
- Synchronous Blocking: The user's post is held back from the platform until the entire ML pipeline completes.
  - Pros: Guarantees zero exposure of bad content to other users.
  - Cons: Severe degradation of user experience. Uploading a photo takes 2 seconds, which kills app responsiveness.
- Asynchronous Pipeline (Selected): The upload is acknowledged immediately and the moderation pipeline runs in the background.
  - Pros: The upload call returns as soon as the sub-50ms fast filter passes, so the UI stays responsive.
  - Cons: Content could be exposed for up to about 2 seconds while the ML pipeline runs. We mitigate this using a Shadowban State: the post is visible only to its creator until the asynchronous pipeline clears it for public distribution.
2. Matching Engine: Exact Cryptographic vs. Fuzzy Perceptual Caching
- Exact Hash (MD5/SHA256):
  - Pros: Microsecond lookup speeds. Effectively zero false positives (MD5 collisions can be crafted deliberately, so SHA-256 is the safer choice where that matters).
  - Cons: Easily bypassed by changing a single pixel or minor cropping.
- Perceptual Caching (Selected):
  - Pros: Catches visually modified copies of violating media, saving millions of dollars in GPU machine learning costs.
  - Cons: Prone to a small false positive rate. We mitigate this by sending close pHash matches (Hamming distance between 6 and 10) to a fast human verification queue rather than deleting them instantly.
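The false-positive mitigation amounts to a three-way decision on the Hamming distance. The sketch below encodes the 6 to 10 human-verification band from the text; treating distances below 6 as safe to remove automatically and distances above 10 as hash misses that continue to the ML tier is an inference from that band, and the constant and function names are made up for illustration.

```python
AUTO_REMOVE_MAX_DISTANCE = 5     # assumption: anything closer than the 6-10 band is a confident match
HUMAN_VERIFY_MAX_DISTANCE = 10   # upper edge of the human-verification band


def action_for_phash_distance(distance: int) -> str:
    """Decide what to do with an upload given its distance to the nearest banned pHash."""
    if not 0 <= distance <= 64:
        raise ValueError("Hamming distance of 64-bit hashes must be between 0 and 64")
    if distance <= AUTO_REMOVE_MAX_DISTANCE:
        return "AUTO_REMOVE"
    if distance <= HUMAN_VERIFY_MAX_DISTANCE:
        return "HUMAN_VERIFY"
    return "SEND_TO_ML"


for d in (0, 4, 6, 10, 11, 48):
    print(d, action_for_phash_distance(d))
```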
Failure Scenarios & Mitigation Strategies
1. Breaking News Volumetric Spike
During a major geopolitical breaking news event, global media uploads spike to 3x normal levels. The GPU inference queues saturate, causing pipeline delay to grow from 2 seconds to 30 minutes.
- Mitigation: We implement Degraded Dynamic Backpressure:
  - When the GPU processing queue latency exceeds 10 seconds, we automatically bypass heavy multi-label models and fall back strictly to lightweight, fast-tier models.
  - We lower the threshold for shadowbanning: any content with even a mild risk score (above 0.3) is shadowbanned until the queue latency falls back below standard levels.
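The backpressure rule can be sketched as a small controller with hysteresis, so the pipeline does not flap between modes while latency hovers around the trigger. The 10-second trigger and the 0.3 degraded threshold come from the text; the 2-second recovery level (the normal pipeline target), the 0.7 normal threshold (the high-risk cutoff used by the aggregator), and all class and field names are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineMode:
    model_tier: str
    shadowban_threshold: float


NORMAL = PipelineMode(model_tier="full_multi_label", shadowban_threshold=0.7)  # assumed normal cutoff
DEGRADED = PipelineMode(model_tier="lightweight", shadowban_threshold=0.3)


class BackpressureController:
    def __init__(self, enter_latency_s: float = 10.0, exit_latency_s: float = 2.0):
        self.enter_latency_s = enter_latency_s
        self.exit_latency_s = exit_latency_s  # assumed "standard level" for recovery
        self.degraded = False

    def observe(self, queue_latency_s: float) -> PipelineMode:
        """Update the mode from the latest GPU queue latency and return it."""
        if queue_latency_s > self.enter_latency_s:
            self.degraded = True
        elif queue_latency_s <= self.exit_latency_s:
            self.degraded = False
        return DEGRADED if self.degraded else NORMAL


def should_shadowban(risk: float, mode: PipelineMode) -> bool:
    return risk > mode.shadowban_threshold


controller = BackpressureController()
for latency in (1.0, 12.0, 5.0, 1.5):
    mode = controller.observe(latency)
    print(f"latency={latency:>4}s tier={mode.model_tier:<16} shadowban(0.35)={should_shadowban(0.35, mode)}")
```

At 5 seconds the controller stays degraded because latency has not yet returned to the recovery level, which keeps the stricter shadowban rule in force until the backlog has actually drained.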
2. Adversarial Dataset Poisoning Attack
Coordinated bad actors target the feedback loop. They intentionally flag safe content or upload bad content and mark it as safe using compromised moderator accounts to poison the training set.
- Mitigation: We enforce Inter-Annotator Agreement (Consensus). No single moderator decision can trigger model retraining. An item is only labeled as a training asset if it receives identical classification from 3 independent, randomly allocated human reviewers.
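A minimal sketch of the consensus gate in front of the training set. It admits an item only when at least three distinct reviewers have labeled it and every label agrees; counting one vote per reviewer ID is an added safeguard assumed here so that a single compromised account cannot vote three times. The function name and the label strings are illustrative.

```python
from typing import Dict, List, Optional, Tuple

REQUIRED_REVIEWERS = 3


def consensus_label(decisions: List[Tuple[str, str]], required: int = REQUIRED_REVIEWERS) -> Optional[str]:
    """Return the agreed label, or None if the item must stay out of the training set.

    decisions is a list of (reviewer_id, label) pairs in arrival order.
    """
    latest_by_reviewer: Dict[str, str] = {}
    for reviewer_id, label in decisions:
        latest_by_reviewer[reviewer_id] = label  # one vote per reviewer, latest wins
    if len(latest_by_reviewer) < required:
        return None
    labels = set(latest_by_reviewer.values())
    return labels.pop() if len(labels) == 1 else None


unanimous = [("rev_1", "VIOLENCE_GRAPHIC"), ("rev_2", "VIOLENCE_GRAPHIC"), ("rev_3", "VIOLENCE_GRAPHIC")]
split = [("rev_1", "SAFE"), ("rev_2", "VIOLENCE_GRAPHIC"), ("rev_3", "SAFE")]
repeated = [("rev_1", "SAFE"), ("rev_1", "SAFE"), ("rev_1", "SAFE")]

print(consensus_label(unanimous))  # agreed label
print(consensus_label(split))      # None: reviewers disagree
print(consensus_label(repeated))   # None: only one distinct reviewer
```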
Staff Engineer Perspective
Operating a multi-million dollar machine learning inference farm requires deep systems engineering optimization.
Candidate Verbal Script & Mock Interview Guide
Here is a step-by-step walkthrough of how to articulate this design during an actual System Design interview.
1. Requirements & Scaling Limits (Minutes 0 - 5)
- Candidate: "To design a content moderation system at Meta/TikTok scale, I will first clarify scope. We must process text, images, and videos. Our non-functional goal is a massive scale of 100,000 uploads per second. The synchronous hash filter must respond in under 50ms and the asynchronous ML pipeline in under 2 seconds, while high-risk viral violations must reach human review in under 5 minutes. Precision must remain exceptionally high to avoid deleting safe content."
2. Tiered Pipeline HLD (Minutes 5 - 15)
- Candidate: "I will design an event-driven tiered streaming pipeline. First, I will decouple the ingestion layer from the classification layer. The API gateway will write the raw media to S3 and post a task to an Apache Kafka cluster. We will process the media in three tiers:
- A fast synchronous filter utilizing MD5 exact hashes and 64-bit perceptual hashes (pHash) to catch known violations in under 50ms.
- An asynchronous GPU-based ML inference router executing dedicated NLP and computer vision models in parallel under 2 seconds.
- A human-in-the-loop review queue for uncertain cases. I will draw a comprehensive architecture map illustrating this pipeline."
3. Perceptual Hashing & GPU Optimization (Minutes 15 - 25)
- Candidate: "Running deep learning on every upload is economically unviable. I will implement perceptual hashing. Unlike cryptographic hashes, pHash focuses on visual structure. By computing the Hamming distance via XOR operations between the uploaded image's pHash and our banned database, we can catch resized, cropped, or recolored copies on the CPU. For video, I will implement keyframe sampling (1 frame/sec) and isolate the audio track to run Whisper speech-to-text, letting us catch bad videos using text models first."
4. Virality-Weighted Human Queueing (Minutes 25 - 40)
- Candidate: "If we have millions of flagged posts, our human review queue will clog. To protect the platform, I will design a virality-weighted priority queue. I will calculate a priority score based on risk multiplied by log of real-time view velocity. I will stream real-time view statistics from our analytics engine into Elasticsearch. High-traffic posts are boosted instantly to the front of the queue, guaranteeing that a viral violation is moderated by a human within 5 minutes. The human decision triggers a webhook, enforcing the action and feeding the active learning model retraining loop."