FAANG Ready · All 23 GoF Patterns · Visual Learning
Master System Design
Visually
The most interactive LLD & HLD learning platform. Understand every concept through animations, not textbooks.

Request Journey
🧑 Client → 🌐 DNS → ⚖️ Load Balancer → 🔧 API Gateway → Service → 💾 Cache → 🗄️ Database
💡 Every web request travels this path. Understanding each hop is the foundation of system design.
AI Architecture Generator

Describe any system and get an interactive architecture diagram with analysis.

Traffic Simulator

Visualize 100 to 10M RPS flowing through your architecture in real time.

Chaos Engineering Lab

Inject failures — server crash, DB outage, network partition. Build resilience.

CAP Theorem Lab

Simulate network partitions. See how Cassandra vs MongoDB behave in real time.

Mock Interview Coach

Timed FAANG interviews with checklist evaluation and detailed feedback.

Learning Path

Structured curriculum: Beginner → Intermediate → Advanced → Expert.

LLD Fundamentals
Object-Oriented Programming — the foundation of clean low-level design
🔒 Encapsulation

Bundle data and methods that operate on that data within a single unit. Restrict direct access to internal state using access modifiers.

Think of a capsule — what's inside is hidden. Only defined openings (getters/setters) allow controlled access.

class BankAccount {
  // private — hidden from outside
  private balance: number = 0;
  private pin: string;

  constructor(pin: string) {
    this.pin = pin;
  }

  // public getter — controlled access
  getBalance(pin: string): number {
    if (pin !== this.pin) throw new Error("Invalid PIN");
    return this.balance;
  }
}
Diagram: a BankAccount object; balance 🔒 and pin 🔒 stay private, and callers reach state only through public methods such as getBalance() ✅ and deposit() ✅.
🎭 Abstraction

Hide implementation details, expose only what's necessary. Define a contract through abstract classes or interfaces.

Like a car's steering wheel — you don't need to know how the engine works to drive.

abstract class Animal {
  // abstract = must implement
  abstract sound(): string;

  // concrete method
  breathe() { return "inhale/exhale" }
}

class Dog extends Animal {
  sound() { return "Woof!" }
}
Diagram: «abstract» Animal with subclasses Dog ("Woof"), Cat ("Meow"), and Bird ("Tweet"); each subclass MUST implement sound().
🌳 Inheritance

A child class inherits properties and methods from a parent class, enabling code reuse and establishing an IS-A relationship.

A Car IS-A Vehicle. A SavingsAccount IS-A BankAccount. Inheritance models real-world hierarchies.

class Vehicle {
  start() { return "Engine on" }
  stop() { return "Engine off" }
}

class Car extends Vehicle {
  openTrunk() { return "Trunk open" } // new method
}

class Truck extends Vehicle {
  loadCargo() { return "Cargo loaded" }
}
Diagram: Vehicle at the root, with Car, Truck, and Bike below it (and Sedan, SUV below Car); each inherits start() and stop() from Vehicle.
🔄 Polymorphism

Same interface, different implementations. One method name behaves differently based on the object type at runtime.

Runtime polymorphism: Method dispatch decided at runtime via virtual table lookup.

interface Shape {
  draw(): void;
}

class Circle implements Shape {
  draw() { /* draws circle */ }
}

class Rect implements Shape {
  draw() { /* draws rectangle */ }
}

// Same call, different behavior:
shapes.forEach(s => s.draw());
Diagram: one call, shape.draw(), dispatched at runtime (vtable) to Circle, Rect, or Triangle; same method name, different execution.

UML Class Relationships

Understanding relationship notation is critical for drawing class diagrams in interviews.

Animal ◁— Dog : Inheritance (IS-A), hollow triangle
Student — Course ("enrolled") : Association, uses/knows about
Team ◇— Player : Aggregation (HAS-A), hollow diamond; parts outlive the whole
House ◆— Room : Composition, filled diamond; parts die with the whole
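The two HAS-A notations can be made concrete in code. A hypothetical sketch (class names follow the diagram above): aggregation holds objects created elsewhere, composition creates and owns its parts.

```typescript
class Player {
  constructor(public name: string) {}
}

// Aggregation (hollow diamond): Team references Players created
// elsewhere; the Players exist independently of the Team.
class Team {
  constructor(public players: Player[]) {}
}

class Room {
  constructor(public label: string) {}
}

// Composition (filled diamond): House creates and owns its Rooms;
// they cannot exist without the House and die with it.
class House {
  private rooms: Room[];
  constructor(labels: string[]) {
    this.rooms = labels.map((l) => new Room(l));
  }
  roomCount(): number {
    return this.rooms.length;
  }
}
```

The tell in code: with aggregation the part is passed in from outside; with composition the whole calls `new` on the part itself.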
SOLID Principles
Five principles for writing maintainable, scalable object-oriented code
S — Single Responsibility Principle
A class should have only one reason to change. If a class does too much, modifying one responsibility risks breaking another.
❌ Violation
class Invoice {
  calculateTotal() { ... }
  printInvoice() { ... } // ← wrong!
  saveToDatabase() { ... } // ← wrong!
}
// 3 reasons to change = 3 responsibilities
✅ Correct
class Invoice { calculateTotal() }
class InvoicePrinter { print(invoice) }
class InvoiceRepository { save(invoice) }
// Each class has 1 reason to change
Diagram: Invoice with calculateTotal(), printInvoice() ⚠️, and saveToDatabase() ⚠️ (3 responsibilities) is refactored into Invoice, InvoicePrinter, and InvoiceRepo with 1 responsibility each; changing print logic won't affect DB logic.
O — Open/Closed Principle
Open for extension, Closed for modification. Add new behavior by creating new classes, not by modifying existing ones.
❌ Violation — modifying PaymentProcessor for each payment
class PaymentProcessor {
  pay(type: string) {
    if(type === "card") { ... }
    if(type === "upi") { ... }
    // add new type = MODIFY class ❌
  }
}
✅ Extend without modifying
interface PaymentMethod { pay() }
class CardPayment implements PaymentMethod {
  pay() { /* card logic */ }
}
class UPIPayment implements PaymentMethod {
  pay() { /* upi logic */ }
}
// Adding Crypto? New class, nothing modified!
L — Liskov Substitution Principle
Subtypes must be substitutable for their base types. If S is a subtype of T, using S anywhere T is used should not break the program.
❌ Violation — Penguin IS-A Bird but can't fly
class Bird {
  fly() { /* all birds fly? */ }
}

class Penguin extends Bird {
  fly() { throw "I can't fly!" } // ❌
}
✅ Model behavior correctly
interface Bird { eat(); walk() }
interface FlyingBird extends Bird { fly() }

class Eagle implements FlyingBird { ... }
class Penguin implements Bird { ... }
// No violations — penguins don't fly
I — Interface Segregation Principle
No client should be forced to depend on methods it does not use. Split large interfaces into smaller, specific ones.
❌ Fat interface
interface Worker {
  work();
  eat(); // robots don't eat!
  sleep(); // robots don't sleep!
}
class Robot implements Worker {
  eat() { /* forced empty impl */ }
  sleep(){ /* forced empty impl */ }
}
✅ Segregated interfaces
interface Workable { work() }
interface Feedable { eat() }
interface Sleepable { sleep() }

class Human implements Workable, Feedable, Sleepable {}
class Robot implements Workable {}
D — Dependency Inversion Principle
High-level modules should not depend on low-level modules. Both should depend on abstractions. Abstractions should not depend on details.
❌ Direct dependency on low-level
class Notification {
  email = new EmailService(); // tightly coupled!
  notify() { this.email.send() }
}
✅ Depend on abstraction
interface Notifier { send() }
class Email implements Notifier { send() {} }
class SMS implements Notifier { send() {} }

class Notification {
  constructor(private n: Notifier) {}
  notify() { this.n.send() }
}
Diagram: Notification (high-level) depends on the «Notifier» abstraction (send(): void), which Email, SMS, and Push implement.
Design Patterns
All 23 Gang of Four patterns — interactive animations and code examples
LLD Advanced
Concurrency, API design, database normalization, and error handling
Concurrency & Threads

Multiple threads executing simultaneously can lead to race conditions when they share mutable state without synchronization.

Race Condition: Two threads read a value, both increment, both write — one increment is lost!
Deadlock: Thread A holds Lock 1, waits for Lock 2. Thread B holds Lock 2, waits for Lock 1. Both wait forever.
// Race condition (pseudocode)
let counter = 0;

// Thread A & B run the same three steps concurrently:
temp = counter;  // both read 0
temp = temp + 1; // both compute 1
counter = temp;  // both write 1 ← one increment lost!

// Fix: make the read-modify-write atomic
// (mutex, synchronized block, or atomic increment)
synchronized(this) { counter++; } // Java-style illustration
Diagram: counter=0; Thread A and Thread B both read 0 and both compute 1. Expected: 2. Actual: 1. That is the race condition; fix with a mutex/lock.
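JavaScript/TypeScript is single-threaded, but the same lost update appears whenever a read-modify-write spans an `await`. A sketch with a minimal promise-based mutex (the `Mutex` helper is hypothetical, not a built-in):

```typescript
class Mutex {
  private tail: Promise<void> = Promise.resolve();
  // Run fn exclusively: queue it behind all previously queued work.
  runExclusive<T>(fn: () => Promise<T>): Promise<T> {
    const result = this.tail.then(fn);
    this.tail = result.then(() => undefined, () => undefined);
    return result;
  }
}

let counter = 0;
const mutex = new Mutex();

// Unsafe: read, yield to the event loop, then write. Two interleaved
// callers both read the same value and one increment is lost.
async function unsafeIncrement() {
  const temp = counter;
  await Promise.resolve(); // simulates I/O between read and write
  counter = temp + 1;
}

// Safe: the whole read-modify-write runs under the mutex.
async function safeIncrement() {
  await mutex.runExclusive(async () => {
    const temp = counter;
    await Promise.resolve();
    counter = temp + 1;
  });
}
```

Starting from `counter = 0`, running two `unsafeIncrement()` calls concurrently leaves 1 (a lost update); two `safeIncrement()` calls leave 2.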
Database Normalization

1NF — First Normal Form

Eliminate repeating groups; all values must be atomic.

-- Violation (multi-value in cell)
Orders(id, items: "pen,book,ruler")

-- 1NF: one value per cell
Orders(id, item) ← separate rows

2NF — Second Normal Form

1NF + no partial dependencies on composite key.

-- Before: StudentCourse(sid, cid, sName)
-- sName depends only on sid, not (sid,cid)

-- 2NF: separate tables
Student(sid, sName)
Enrollment(sid, cid)

3NF — Third Normal Form

2NF + no transitive dependencies.

-- Employee(empId, deptId, deptName)
-- deptName depends on deptId, not empId

-- 3NF:
Employee(empId, deptId)
Department(deptId, deptName)
Diagram summary:
Unnormalized: id | items with "pen,book" in one cell; not atomic. 1NF: one value per cell (1 | pen; 1 | book).
2NF: split into Student(sid, sName) and Enrollment(sid, cid); sName depends only on sid, so the partial dependency is removed.
3NF: split into Emp(empId, deptId) and Dept(deptId, deptName); deptName depends on deptId, not empId, so the transitive dependency is removed.
REST API Design

REST (Representational State Transfer) uses HTTP methods to perform CRUD operations on resources identified by URIs.

Method | Action           | Idempotent
GET    | Read resource    | ✅ Yes
POST   | Create resource  | ❌ No
PUT    | Replace resource | ✅ Yes
PATCH  | Partial update   | ⚠️ Not guaranteed
DELETE | Delete resource  | ✅ Yes
# REST resource: /users/{id}/orders

GET /users/42/orders ← list
POST /users/42/orders ← create
GET /users/42/orders/7 ← get one
PUT /users/42/orders/7 ← replace
PATCH /users/42/orders/7 ← update
DELETE /users/42/orders/7 ← delete

# Status codes:
200 OK 201 Created 204 No Content
400 Bad Request 401 Unauthorized
404 Not Found 429 Rate Limited
500 Server Error 503 Unavailable
Error Handling Strategies
// Try-Catch-Finally pattern
// (JS/TS has a single catch clause; branch on the error type)
try {
  const data = await fetchData(url);
  process(data);
} catch (e) {
  if (e instanceof NetworkError) {
    await retry(url, 3); // maxRetries
  } else if (e instanceof ParseError) {
    logError(e);
    return defaultValue();
  } else {
    throw e; // unknown errors propagate
  }
} finally {
  releaseConnection(); // always runs
}

// Circuit Breaker pattern
if (circuitBreaker.isOpen()) {
  return cachedFallback; // fail fast
}
Diagram: try block → on error, catch block; either way the finally block always executes to clean up resources.
High-Level Design
System architecture, scalability, reliability, and CAP theorem
Architecture Evolution
Monolithic Architecture — all components (UI, business logic, data access) live in ONE deployable unit. The entire application ships as a single artifact: one build, one deploy, one process.
Pros: simple deployment · easy local dev · simple debugging · low latency (in-process)
Cons: hard to scale selectively · tech stack lock-in · long build/test cycles · deploy risk (all or nothing)

When to use a Monolith

Early-stage startups — move fast, validate product first
Small teams (<10 engineers) — microservices overhead not worth it
Well-understood domain where service boundaries aren't clear yet
Avoid when components need very different scaling profiles
Avoid when teams need to deploy independently at high velocity
💡 Real world: Instagram, Shopify, and Stack Overflow ran monoliths for years at massive scale. The "Majestic Monolith" is a valid, often overlooked pattern. Don't rush to microservices.
Diagram: Client → Monolith Application (Presentation Layer: React/Angular/Templates → Controller Layer: routes/REST handlers → Business Logic: services/domain objects → Data Access Layer: ORM/repository pattern) → 🗄 Database. ONE DEPLOY.
// All called in-process — no network hops
class UserController {
  async register(req, res) {
    const user = await this.db.save(req.body);
    await this.emailSvc.sendWelcome(user);
    res.json(user); // single process
  }
}
Layered (N-Tier) Architecture — strict separation into horizontal layers. Each layer ONLY communicates with the adjacent layer directly below it. Classic enterprise pattern — MVC is a variant.
Pros: clear separation of concerns · each layer testable in isolation · well-known pattern
Cons: sinkhole anti-pattern risk · rigid layer dependencies · every request traverses all layers
① Presentation Layer

HTTP request/response, input validation, auth. No business logic. Controllers, REST handlers, GraphQL resolvers.

② Business Logic Layer

Core application rules, workflows. No knowledge of HTTP or DB. Pure functions — the heart of your app. Services, Use Cases, Domain objects.

③ Data Access Layer (DAL)

Abstracts storage behind Repositories/DAOs. Business layer doesn't know if you use MySQL or MongoDB. Enables zero-change swaps of storage backends.

④ Database Layer

Actual persistence — SQL, NoSQL, Cache. Transactions, indexes, and query optimization live here.

Diagram: Client/Browser → Presentation (controllers, middleware; Express/FastAPI) → Business Logic (services, domain objects, pure functions) → Data Access (repositories, DAOs/ORM; TypeORM/SQLAlchemy) → Database (PostgreSQL/MySQL). Strict one-way dependency: each layer only talks to the layer below it.
// Each layer only knows the layer directly below
class OrderController { // Presentation
  async create(req, res) {
    return res.json(await this.orderService.create(req.body));
  } // ← no DB knowledge here
}
class OrderService { // Business
  async create(data) {
    await this.payGateway.charge(data); // rule
    return await this.orderRepo.save(data); // via DAL
  }
}
Microservices Architecture — each service is independently deployable, owns its own database, has a single business responsibility, and communicates via REST/gRPC (sync) or message queues (async). The dominant pattern for large-scale internet systems.
Pros: independent deployment · polyglot tech stacks · selective scaling · fault isolation · team autonomy
Cons: distributed complexity · network latency · data consistency challenges · high operational overhead
Diagram: Client/Browser → API Gateway (auth, rate limiting, routing, SSL termination) → User Service (Node.js: auth, profile), Order Service (Go: cart, checkout), Payment Service (Java: Stripe, billing), each owning its database (PostgreSQL, MongoDB, MySQL); a Kafka event bus feeds a Notification Service (Python). Solid lines: sync REST/gRPC; dashed lines: async events.

Communication Patterns

Synchronous — REST / gRPC

Direct request-response. Caller waits for reply. gRPC is faster (Protocol Buffers). REST is simpler (JSON). Risk: cascading failures — use circuit breakers.

Asynchronous — Events (Kafka/SQS)

Fire and forget. Best for notifications, analytics, billing. Decouples services, better fault tolerance, eventual consistency. Dead letter queues for failures.

Decomposition Strategies

By Business Capability — User, Order, Payment, Notification. Most common.
By Domain (DDD Bounded Contexts) — Aligns service boundaries with team ownership.
Strangler Fig Pattern — Incrementally extract from monolith. Lowest risk migration path.
Conway's Law: "Systems mirror the communication structure of the organization." Design your team structure first, then your service boundaries.

Non-Functional Requirements (NFRs)

NFRs define how well the system performs — not what it does. They're the quality attributes that separate a production system from a toy. In FAANG interviews, always clarify NFRs before designing.

Performance / Latency

How fast does the system respond? Always ask about percentile SLOs: p50/p95/p99. Tail latency matters — the 1% of slow requests affect real users.

p50: 12ms (median user)
p95: 45ms (most users)
p99: 180ms (1 in 100)
p999: 800ms (1 in 1000)
💡 Google found 500ms added latency reduced searches by 20%. Amazon: 100ms = 1% revenue loss.
🟢
Availability / SLA

% of time the system is operational and serving requests. Each additional "9" = 10× less downtime. Most FAANG systems target 99.99%+.

99% = 87.6 hrs/yr down
99.9% = 8.76 hrs/yr down
99.99% = 52.6 min/yr down
99.999%= 5.26 min/yr down
SLI (metric) → SLO (target) → SLA (contract with penalties)
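The downtime figures above are simple arithmetic on the availability fraction; a quick sketch (the function name is illustrative):

```typescript
// Yearly downtime allowed at a given availability level:
// unavailable fraction × hours in a year (365 × 24 = 8760).
function downtimeHoursPerYear(availability: number): number {
  return (1 - availability) * 365 * 24;
}

downtimeHoursPerYear(0.99);   // ≈ 87.6 hours
downtimeHoursPerYear(0.999);  // ≈ 8.76 hours
downtimeHoursPerYear(0.9999); // ≈ 0.876 hours ≈ 52.6 minutes
```

Each extra "9" divides the allowance by ten, which is why 99.999% leaves barely five minutes a year.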
📈
Scalability / Throughput

Ability to handle growing load without degradation. Throughput = requests per second (RPS). Design for 3× peak load.

Twitter: 350K RPS reads
Netflix: 8M concurrent users
Google: 40K searches/sec
WhatsApp: 1.15M msg/sec
🛡
Fault Tolerance

System continues operating correctly despite component failures. MTBF (Mean Time Between Failures) and MTTR (Mean Time To Recovery) are key engineering metrics.

Graceful degradation (serve stale data > error)
Circuit breakers (fail fast, don't cascade)
Bulkhead isolation (contain failures)
Retry with exponential backoff + jitter
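The last bullet can be sketched in a few lines. `retryWithBackoff` and its defaults are illustrative, not a specific library API; the jitter strategy shown is "full jitter" (random delay up to the exponential cap):

```typescript
const sleep = (ms: number) => new Promise<void>((r) => setTimeout(r, ms));

async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  maxAttempts = 5,
  baseMs = 100,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Full jitter: random delay in [0, base × 2^attempt), capped.
      const cap = Math.min(baseMs * 2 ** attempt, 10_000);
      await sleep(Math.random() * cap);
    }
  }
  throw lastError; // attempts exhausted
}
```

The jitter matters: without it, clients that failed together retry together and hammer the recovering service in synchronized waves.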
🔒
Security

CIA Triad: Confidentiality, Integrity, Availability. Defense-in-depth — multiple layers of controls.

TLS 1.3 for all data in transit
AES-256 encryption at rest
OAuth2/JWT for authentication
Zero-trust network architecture
Rate limiting at API gateway
🌍
Durability & Data Consistency

Data must survive failures. Design RPO (data loss tolerance) and RTO (recovery time). Replication factor ≥ 3 for critical data.

Write-ahead logs (WAL) for crash recovery
Synchronous replication for strong durability
Cross-region backups for disaster recovery

Scaling Strategies

Vertical Scaling (Scale Up)

Add more CPU/RAM/Disk to a single server. Zero code changes. Simple but has hard limits (largest AWS instance is 448 vCPUs, 24TB RAM) and creates a single point of failure.

Diagram: t3.large (4 vCPU, 8 GB RAM) scaled up to r6i.8xlarge (32 vCPU, 256 GB RAM, ~$2,000/mo); ⚠ ceiling around 448 cores max.
Pros: zero code change · simple ops. Cons: single point of failure · cost grows non-linearly · hard ceiling.
Horizontal Scaling (Scale Out)

Add more commodity servers behind a load balancer. Requires stateless services (store session in Redis, not in-memory). How all internet-scale systems work — cloud auto-scaling groups.

Diagram: Load Balancer (HAProxy/ALB) in front of Server 1, 2, 3, with auto-scale adding more; cloud auto-scaling triggers on CPU/RPS thresholds.
Pros: near-unlimited scale · no SPOF · roughly linear cost. Cons: requires stateless services.

Load Balancing Algorithms

Round Robin

Distributes requests sequentially: 1→2→3→1→2→3. Simple, zero overhead. Works well when all servers have equal capacity and requests have uniform cost.

Req 1 → Server A
Req 2 → Server B
Req 3 → Server C
Req 4 → Server A ↺
Pros: simple. Cons: ignores server load.
Least Connections

Route to server with fewest active connections. Dynamically adapts to variable request duration. Best for WebSocket, streaming, or long-running DB queries.

Server A: 120 conns
Server B: 45 conns ← next
Server C: 98 conns
→ Route to B (fewest)
Pros: adaptive · handles slow requests.
IP Hash / Sticky Sessions

hash(clientIP) % N → always same server for same client. Maintains session affinity. Avoid this — store sessions in Redis instead for true stateless scaling.

hash("10.0.1.4") = 4821
4821 % 3 = 2 → Server B

Same IP always → Server B
Pros: session affinity. Cons: uneven distribution.
CAP Theorem (Brewer, 2000) — In any distributed system, you can only guarantee 2 of 3 properties: Consistency, Availability, and Partition Tolerance. Since network partitions are inevitable in real distributed systems, the real choice is between CP or AP.
Diagram: CAP triangle with C (Consistency), A (Availability), P (Partition tolerance). CP side: MongoDB, HBase, ZooKeeper. AP side: Cassandra, DynamoDB, CouchDB. CA: single-node databases only.
Consistency

Every read receives the most recent write — or returns an error. All nodes see identical data at any moment. Required for banking, inventory, distributed locks.

Availability

Every request receives a non-error response — though it may not be the latest data. System stays online even during partial failures. Required for social feeds, product catalogs.

Partition Tolerance

System operates despite network splits between nodes. Not optional — network partitions are a physical reality. Every distributed system must be partition tolerant.

PACELC Extension: even without a partition (the "Else" case), you still trade Latency (L) against Consistency (C). Cassandra and DynamoDB favor availability and low latency (PA/EL); strongly consistent stores accept higher latency in exchange for consistency (PC/EC).
Interactive slider: Consistency ↔ Availability; balanced setting shown as DynamoDB's eventual consistency mode.
These patterns solve recurring architectural challenges at system scale. Understanding them is critical for FAANG system design interviews. Each solves a specific class of problem.
Event-Driven Architecture

Components communicate via events published to a message bus (Kafka, EventBridge, SNS). Publishers don't know about subscribers. Excellent for async workflows and high throughput.

OrderService.publish("order.created")
→ InventoryService subscribes
→ NotificationService subscribes
→ AnalyticsService subscribes
Pros: loose coupling · high throughput. Tradeoff: eventually consistent.
CQRS (Command Query Responsibility Segregation)

Separate write model (Commands) from read model (Queries). Optimize reads independently from writes. Write side emits events; read side builds materialized views.

Write: POST /orders → OrderDB
→ event → rebuild
Read: GET /orders/feed
→ ElasticSearch view
Pros: reads and writes optimized independently. Cons: extra complexity.
Event Sourcing

Never store current state directly — store the sequence of events that led to the current state. Rebuild any past state by replaying events. Complete audit trail, temporal queries.

Events: [AccountOpened, Deposited,
Withdrawn, TransferSent]
Balance = replay all events
→ perfect audit trail
Pros: full audit trail · time travel. Tradeoff: eventual consistency.
Saga Pattern

Manage distributed transactions across microservices using a sequence of local transactions + compensating transactions on failure. No distributed 2PC needed.

Order → reserve inventory
→ charge payment
→ confirm shipping
Failure: compensate backwards
Pros: no 2PC needed. Cons: complex rollback logic.
Backend For Frontend (BFF)

A dedicated API layer tailored to each frontend client (Web BFF, Mobile BFF, TV BFF). Aggregates backend services and shapes responses for each client's specific needs.

Mobile BFF → lighter payloads
Web BFF → richer data
TV BFF → streaming data
Each team owns their BFF
Pros: client-optimized payloads · team autonomy.
Strangler Fig Pattern

Incrementally replace a monolith by building new features as microservices and routing traffic. Named after the fig tree that grows around a host tree. Zero-downtime migration strategy.

Phase 1: Monolith handles all
Phase 2: /auth → Auth Service
Phase 3: /orders → Order Svc
Phase 4: Monolith retired
Pros: low-risk migration · zero downtime.
Infrastructure Components
The building blocks that make production systems reliable and fast
Load Balancer — L4 vs L7

Load balancers distribute traffic across multiple servers to prevent overload and ensure high availability.

Feature  | L4 (Transport)   | L7 (Application)
Layer    | TCP/UDP          | HTTP/HTTPS
Routing  | IP + port        | URL, headers, cookies
Speed    | Very fast        | Slower
SSL      | No               | Yes (SSL termination)
Examples | AWS NLB, HAProxy | AWS ALB, Nginx
Diagram: Clients 1–3 → Load Balancer ⚖️ (round robin) → Server 1, Server 2, Server 3.
Redis Caching

Cache stores frequently accessed data in memory, reducing database load and improving latency dramatically.

Cache Hit: Data found in cache → return immediately (~1ms)
Cache Miss: Not in cache → query DB → store in cache → return

// Cache-Aside (Lazy Loading) pattern
async function getUser(id) {
  const cached = await redis.get(`user:${id}`);
  if(cached) return JSON.parse(cached); // HIT

  // MISS — fetch from DB
  const user = await db.findUser(id);
  await redis.setex(`user:${id}`, 3600, JSON.stringify(user));
  return user;
}
Diagram: Service → Redis (~1 ms); a HIT returns fast, a MISS falls through to the Database (~100 ms) and stores the result in cache. Eviction: LRU, LFU, or TTL expiry.
Content Delivery Network

CDNs cache static content at edge servers geographically closer to users, reducing latency and origin server load.

Without CDN: User in India → Server in US = 200ms
With CDN: User in India → Edge server in Mumbai = 10ms
Static assets (images, CSS, JS) cached at edge
User request routed to nearest PoP (Point of Presence)
Cache miss → fetch from origin → cache at edge
Reduces DDoS impact by distributing traffic
Diagram: Origin server behind edge PoPs (US-East, EU, Asia); 🇺🇸 user → 10ms, 🇩🇪 user → 8ms, 🇮🇳 user → 12ms.
Message Queues — Kafka & RabbitMQ
Message queues decouple producers and consumers, enabling async processing, buffering traffic spikes, and reliable delivery.
Diagram: Producers 1–2 → topic/queue (Partition 0, offsets 0,1,2,3,4…) → Consumers A, B, C in a consumer group; each message is processed once per group. Benefits: decoupling, buffering, reliability, fan-out, replay.
Circuit Breaker Pattern
Prevents cascading failures. When a service fails repeatedly, the circuit "opens" and immediately returns fallback instead of waiting for timeouts.
States: CLOSED (normal operation, requests pass through) → failure threshold exceeded → OPEN (fail fast, return fallback instantly) → timeout expires → HALF-OPEN (test recovery, allow a few requests) → success closes the circuit; failure reopens it.
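A minimal sketch of that state machine in TypeScript; the class, thresholds, and timeouts are illustrative assumptions, not a specific library:

```typescript
type State = "CLOSED" | "OPEN" | "HALF_OPEN";

class CircuitBreaker {
  private state: State = "CLOSED";
  private failures = 0;
  private openedAt = 0;

  constructor(
    private failureThreshold = 3,
    private resetTimeoutMs = 30_000,
  ) {}

  async call<T>(fn: () => Promise<T>, fallback: T): Promise<T> {
    if (this.state === "OPEN") {
      if (Date.now() - this.openedAt < this.resetTimeoutMs) {
        return fallback; // fail fast: don't wait on a dying service
      }
      this.state = "HALF_OPEN"; // timeout expired: allow a probe
    }
    try {
      const result = await fn();
      this.state = "CLOSED"; // success closes the circuit
      this.failures = 0;
      return result;
    } catch (err) {
      this.failures++;
      if (this.state === "HALF_OPEN" || this.failures >= this.failureThreshold) {
        this.state = "OPEN"; // probe failed or threshold hit
        this.openedAt = Date.now();
      }
      return fallback;
    }
  }

  currentState(): State {
    return this.state;
  }
}
```

The key property: once OPEN, callers get the fallback immediately instead of stacking up timeout-length waits behind a dead dependency.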
Distributed Systems
Sharding, replication, consistency, leader election, and observability
Database Sharding

Sharding splits data across multiple databases (shards). Each shard holds a subset of the data, enabling horizontal database scaling.

Sharding Strategies

Range-based: shard by ID ranges (1-1M, 1M-2M). Simple but can be uneven.
Hash-based: hash(user_id) % N shards. Even distribution but hard to range query.
Directory-based: lookup table maps keys to shards. Flexible but single point of failure.
Hotspot problem: Celebrity users (millions of followers) on one shard overwhelm it. Solution: add secondary shard or special-case hot keys.
Diagram: Application → shard router (hash(key) % 3) → Shard 0 (users 0,3,6…; 8M records), Shard 1 (users 1,4,7…; 7.8M records), Shard 2 (users 2,5,8…; 8.2M records).
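The hash-based strategy above can be sketched in a few lines (the string hash here is illustrative, not production-grade):

```typescript
// Deterministic string hash (simple 31-based rolling hash).
function hashKey(key: string): number {
  let h = 0;
  for (const ch of key) {
    h = (h * 31 + ch.charCodeAt(0)) >>> 0; // keep as unsigned 32-bit
  }
  return h;
}

class ShardRouter {
  constructor(private shardCount: number) {}
  // hash(key) % N: same key always routes to the same shard.
  shardFor(key: string): number {
    return hashKey(key) % this.shardCount;
  }
}
```

Note the limitation this strategy carries: changing the shard count N remaps almost every key, which is exactly the problem consistent hashing was designed to avoid.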
Database Replication

Replication copies data across multiple nodes for fault tolerance and read scaling.

Leader-Follower (Primary-Replica)

All writes go to Leader
Reads can be served by any follower
Leader sends replication log to followers
If leader fails, a follower is promoted

Synchronous: Leader waits for follower ack. Slow but no data loss.
Asynchronous: Leader doesn't wait. Fast but possible data loss on failover.
Diagram: writes ✍️ go to the Leader (primary), which replicates to Follower 1 (async) and Follower 2 (sync); reads 📖 are served by followers. Failover: a follower is promoted to leader.
Eventual Consistency

In distributed systems, achieving strong consistency has a high cost. Eventual consistency allows temporary divergence — all nodes will converge to the same state eventually.

Propagation delay: Write hits Node A → A syncs to B and C after some delay. During this window, reading from B returns stale data.

Consistency Models

Strong: All reads return latest write
Eventual: Will be consistent, but timing unknown
Monotonic Read: Never see older data after newer
Read-your-writes: Always see own writes
Diagram (timeline): a write of v=1 lands on Node A ✅ while Nodes B and C still serve v=0 ⏳; after propagation, all three nodes converge to v=1 ✅.
Leader Election (Raft Simplified)
When a leader node fails, remaining nodes must elect a new leader. Raft uses randomized timeouts and majority votes.
Step 1, Normal: the Leader (Node A) sends heartbeats to B and C.
Step 2, Leader fails: A dies; B and C time out and trigger an election.
Step 3, New leader: B wins the majority vote and becomes the new leader.
Observability — Metrics, Logs, Traces
📊
Metrics

Numerical measurements over time. CPU%, request rate, error rate, p99 latency. Stored in time-series DB (Prometheus).

📋
Logs

Timestamped event records. Structured (JSON) or unstructured. ELK stack (Elasticsearch, Logstash, Kibana).

🔍
Traces

Track a request across multiple services. Jaeger, Zipkin. See full call graph and latency at each hop.

// OpenTelemetry trace example
const span = tracer.startSpan('processOrder');
span.setAttribute('order.id', orderId);
try {
  await paymentService.charge(order); // child span auto-created
  await inventoryService.reserve(order);
  span.setStatus({ code: SpanStatusCode.OK });
} catch (err) {
  span.recordException(err);
  span.setStatus({ code: SpanStatusCode.ERROR });
  throw err;
} finally {
  span.end(); // exports to Jaeger/Zipkin
}
Interview Playground
Design real systems — click a system to see full animated architecture
🔗 URL Shortener
Scaling Decisions
Back-of-envelope
3D Architecture Lab
Explore system architectures in interactive 3D space — rotate, zoom, and click components
LAYERS:
Services
Data
Network
Infra
🖱 Drag to rotate  ·  Scroll to zoom  ·  Click node for info

Click a component

Rotate the scene and click any 3D node to see what it does, when to use it, and the key tradeoffs.

Navigation
🖱 Left drag — Rotate camera
🔲 Right drag — Pan view
Scroll — Zoom in/out
👆 Click node — Show component info
Color Legend
Network layer (Client, LB, CDN)
Service layer (API, Gateway)
Application layer (Microservices)
Data layer (DB, Cache)
Infrastructure (Queue, Storage)
Animated Flows

Glowing particles travel along connection edges showing live request paths. Thicker edges = higher traffic volume. Pulsing nodes = active processing.

Distributed Systems Simulator
Live simulation of distributed system behaviors — interact in real time
Live stats: active nodes, total requests, average latency, errors.
Consistent Hashing

A hashing technique that minimizes key remapping when nodes are added or removed. Each node owns a range on the hash ring. Virtual nodes (vnodes) ensure even distribution.
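A minimal ring sketch under those assumptions (the FNV-style hash function and vnode count are illustrative):

```typescript
// 32-bit FNV-1a hash for placing nodes and keys on the ring.
function hash32(s: string): number {
  let h = 2166136261;
  for (const ch of s) {
    h ^= ch.charCodeAt(0);
    h = Math.imul(h, 16777619) >>> 0;
  }
  return h >>> 0;
}

class HashRing {
  private ring: { point: number; node: string }[] = [];

  constructor(nodes: string[], private vnodes = 100) {
    for (const node of nodes) this.addNode(node);
  }

  addNode(node: string) {
    // Each physical node gets `vnodes` points for even distribution.
    for (let i = 0; i < this.vnodes; i++) {
      this.ring.push({ point: hash32(`${node}#${i}`), node });
    }
    this.ring.sort((a, b) => a.point - b.point);
  }

  removeNode(node: string) {
    this.ring = this.ring.filter((e) => e.node !== node);
  }

  // Walk clockwise: the first vnode at or after the key's hash owns it.
  nodeFor(key: string): string {
    const h = hash32(key);
    const entry = this.ring.find((e) => e.point >= h) ?? this.ring[0];
    return entry.node;
  }
}
```

Removing a node reassigns only the keys that node owned (they move to the next vnode clockwise); every other key keeps its owner, unlike `hash(key) % N`.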

Event Log
Interview Practice Mode
Simulate real FAANG system design interviews with guided steps and AI evaluation
🐦
Design Twitter
🔗
URL Shortener
🎬
Design YouTube
🚗
Design Uber
💬
Design WhatsApp
🎞
Design Netflix
1. Requirements
2. Scale
3. Architecture
4. Components
5. Scaling
6. Failures
7. Evaluate
Architecture Builder
Drag, drop and connect components to design your own system architecture
TOOL:
Click palette to add · Connect tool to wire components

Network

🧑 Client
📱 Mobile App
🌐 CDN
⚖️ Load Balancer

Services

🔧 API Gateway
🖥 Web Server
⚡ Microservice
🔐 Auth Service

Data

🗄 Database
⚡ Cache (Redis)
🔍 Search (ES)
📦 Object Storage

Infra

📨 Message Queue
🌊 Stream Processor
📊 Monitor

Component Config

Select a component to configure its properties.
How to Use
1. Drag components from palette onto canvas
2. Select Connect tool, then drag between nodes
3. Click a node to configure it
4. Press Simulate to animate traffic flow
Tips
• Nodes snap to 40px grid automatically
• Double-click a node to rename it
• Connections show animated request flow
• Red glow = bottleneck in simulation mode
Architecture Stats
Nodes: 0  ·  Edges: 0
Add components to start building.

Architecture Advisor

Build an architecture to receive AI-powered suggestions.
Interview Question Bank
Real FAANG system design questions with animated solutions
FILTER:
Distributed Systems Game
Keep your system alive under increasing load — scale or fail!
Live HUD: score, availability, P99 latency, budget.

System Design Game

Keep your distributed system alive as traffic increases. React to failures by adding components!

Scoring
+1 per second alive
+10 per 1000 requests handled
-50 for each server crash
Bonus for high availability
Events
🔴 Server crash — reroutes traffic
⚡ Cache failure — DB load spikes
🌊 Traffic spike — 3× normal load
🔀 Network partition — split cluster
Win Condition
Survive 3 minutes with >95% availability. Latency must stay below 500ms. Achieve 10,000 score to win!
Architecture Whiteboard
Sketch system architectures freely — like Excalidraw for engineers
Keyboard Shortcuts
P Pen   L Line   A Arrow
R Rect   T Text   E Eraser
Ctrl+Z Undo   Del Clear selection
Shape Stamps

Click SVR/DB/CAC/QUE/LB/CLI toolbar buttons to stamp pre-drawn system components at cursor position. Then connect with Arrow tool.

Export

Click 💾 to export the whiteboard as PNG. Great for saving your architecture sketches to share with your team.

Performance Analyzer
Latency heatmaps, scalability analysis, and bottleneck detection for your architecture
AI Architecture Generator
Describe any system — instantly generate an interactive architecture diagram
Quick prompts: YouTube · WhatsApp · URL Shortener · Twitter · Netflix · Uber · Google Search · Distributed Cache
Architecture Analysis
Enter a system prompt and click Generate to see an interactive architecture diagram with detailed component analysis.
Real-Time Traffic Simulator
Visualize requests flowing through your architecture — watch bottlenecks emerge in real time
Traffic Rate slider: 100 RPS to 10M RPS, with live readouts for req/sec, p99 latency, error rate, and throughput.
Architecture Mode
Chaos Engineering Lab
Inject failures. Break things deliberately. Build resilient systems.
System Resilience 100%
Inject Failure
Mitigations
Event Log
CAP Theorem Playground
Simulate network partitions and see how real databases sacrifice Consistency vs Availability
Cassandra
Cassandra is an AP system. It prioritizes availability over consistency. During a network partition, all nodes remain available but may serve stale data.
Availability: High · Consistency: Eventual · Partition: Tolerant
Node Behavior During Partition
PACELC Trade-off
Slider: Consistency ↔ Availability. AP system: prefers availability over consistency.
Global Traffic Map
Visualize CDN edge nodes, data centers, and live request routing across the world
Live counters: active requests, active regions, average CDN latency.
Infrastructure Cost Estimator
Model your system, estimate monthly cloud spend across AWS, GCP, and Azure
System Parameters
Monthly Estimate
Mock Interview Coach
Timed system design sessions with real evaluation criteria
Junior

Basic system design. 30 min sessions. Graded on fundamentals.

Senior

Complex distributed systems. 45 min. Full FAANG criteria.

Staff/Principal

Org-level design, ambiguity, leadership. 60 min.

Learning Path
Structured curriculum from zero to distributed systems expert
Choose Your Track
Beginner: OOP + LLD fundamentals
Intermediate: System design fundamentals
Advanced: Distributed systems
Expert: Large-scale architecture
Curriculum — Beginner