System Design Cases · 10 lessons · 22 quiz questions

Design: Chat System (WhatsApp/Slack)

10-session plan to master chat system design covering real-time protocols, scale, and advanced features.

What You Will Learn

  • Requirements & Protocol Choice
  • Capacity Estimation
  • Single-Server Chat Architecture
  • Multi-Server Scaling
  • Message Persistence & Cassandra
  • Group Chat & Fan-out
  • Presence System
  • Media & Notifications
  • Message Delivery Guarantees
  • Mock Interview

Overview

Session 1: Requirements & Protocol Choice

Interview Opening

Interviewer: "Design a chat system like WhatsApp or Slack."
Candidate: "Let me clarify scope. Is this 1-on-1 chat, group chat, or both?"
Interviewer: "Both 1-on-1 and group chat."
Candidate: "Group size limit?"
Interviewer: "Up to 500 members per group."
Candidate: "Real-time delivery requirement? If the receiver is offline, do messages queue?"
Interviewer: "Yes, offline users receive messages when they come back online."
Candidate: "Media support: text only, or images/video/files too?"
Interviewer: "Text and images. No video calls."
Candidate: "Scale: how many DAU?"
Interviewer: "50 million DAU."

Functional Requirements

  • 1-on-1 messaging with real-time delivery
  • Group messaging (up to 500 members)
  • Message persistence: offline users receive queued messages on reconnect
  • Read receipts: sent, delivered, read indicators
  • Online presence: show when users were last online
  • Image sharing (up to 10MB per image)
  • Message history: scroll back up to 1 year

Non-Functional Requirements

  • Message delivery latency: <500ms when both parties are online
  • High availability: 99.99%, with durable storage so messages are never lost
  • Message ordering: guaranteed per conversation
  • Scalability: 50M DAU, 100M messages/day
  • Eventual consistency acceptable for presence information

Protocol Decision: WebSocket vs. Polling vs. SSE

Pros of each option:

  • Short polling: simple, works everywhere
  • Long polling: lower latency than polling
  • WebSocket: true bidirectional, real-time
  • SSE: simple, HTTP-based

Decision: WebSocket for active chat clients, with HTTP long polling as a fallback for environments that block WebSocket (some corporate firewalls).

The WebSocket handshake upgrades an HTTP request to a persistent connection over the same TCP connection. After the upgrade, the server can push messages to the client without the client polling. Each device holds its own connection, so a user with phone + desktop = 2 connections.

Connection Management Design

Each Chat Server maintains WebSocket connections for a set of users.
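To make the per-server state concrete, here is a minimal sketch of the table a Chat Server might keep in memory, with one entry per device connection. It is an illustration under assumptions, not the lesson's implementation: `LocalConnections`, `device_id`, and the plain `send` callables standing in for real WebSocket objects are all hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict

# Stand-in for a real WebSocket: anything that accepts a text payload.
Send = Callable[[str], None]


class LocalConnections:
    """Hypothetical in-memory table of the connections one Chat Server holds."""

    def __init__(self) -> None:
        # user_id -> {device_id -> send function}
        self._by_user: Dict[str, Dict[str, Send]] = defaultdict(dict)

    def add(self, user_id: str, device_id: str, send: Send) -> None:
        """Record a newly opened connection for one device."""
        self._by_user[user_id][device_id] = send

    def remove(self, user_id: str, device_id: str) -> None:
        """Forget a closed connection; drop the user once no devices remain."""
        devices = self._by_user.get(user_id, {})
        devices.pop(device_id, None)
        if not devices:
            self._by_user.pop(user_id, None)

    def connection_count(self) -> int:
        """Total device connections held by this server."""
        return sum(len(devices) for devices in self._by_user.values())

    def deliver(self, user_id: str, payload: str) -> int:
        """Push payload to every device of user_id; return devices reached."""
        devices = self._by_user.get(user_id, {})
        for send in devices.values():
            send(payload)
        return len(devices)
```

Keying by device rather than by user reflects the point above that a phone and a desktop count as two connections, and it lets one `deliver` call reach both.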
A connection registry (Redis) maps each connected user to the Chat Server holding their connection, so any server can route messages to the right Chat Server.

Interview Q&A

Q: Why not use HTTP/2 server push instead of WebSocket?
A: HTTP/2 server push is designed for pushing page assets, not arbitrary messages. WebSocket is purpose-built for persistent bidirectional communication with low overhead (frame headers as small as 2 bytes vs full HTTP headers on each message).

Q: How many WebSocket connections can one server hold?
A: A modern server (8 vCPU, 32GB RAM) can hold ~100,000 simultaneous WebSocket connections. At 50M DAU with 20% concurrently active, that is 10M connections; 10M ÷ 100,000 per server = 100 Chat Servers. This is the primary scaling challenge.
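The server-count arithmetic in the last answer can be checked in a few lines. This is a small sketch using only the figures quoted in this session; the average message rate at the end is not stated in the lesson and is derived here from the 100M messages/day requirement.

```python
import math

# Figures taken from the requirements and Q&A above.
DAU = 50_000_000
CONCURRENT_PERCENT = 20            # share of DAU online at the same time
CONNECTIONS_PER_SERVER = 100_000   # assumed capacity of one 8 vCPU / 32GB server
MESSAGES_PER_DAY = 100_000_000
SECONDS_PER_DAY = 24 * 60 * 60

# Integer arithmetic avoids floating-point rounding in the connection count.
concurrent_connections = DAU * CONCURRENT_PERCENT // 100
chat_servers = math.ceil(concurrent_connections / CONNECTIONS_PER_SERVER)

# Derived figure: average (not peak) message rate.
avg_messages_per_second = MESSAGES_PER_DAY / SECONDS_PER_DAY

print(f"concurrent connections: {concurrent_connections:,}")
print(f"chat servers needed:    {chat_servers}")
print(f"average messages/sec:   {avg_messages_per_second:,.0f}")
```

The connection count, not the message rate, drives the server count here: roughly 1,157 messages per second on average is modest, while 10M open sockets need 100 servers. Users connected from several devices would push the connection count higher.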
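The registry lookup from the Connection Management Design can be sketched roughly as follows. It is a sketch under assumptions: a plain dict stands in for Redis, each user is simplified to a single connection, and `ConnectionRegistry`, `ChatServer`, `route`, and the offline queue are hypothetical names rather than anything defined in the lesson.

```python
from typing import Dict, List, Optional, Tuple


class ConnectionRegistry:
    """Dict-backed stand-in for the Redis registry: user_id -> server_id."""

    def __init__(self) -> None:
        self._server_of: Dict[str, str] = {}

    def register(self, user_id: str, server_id: str) -> None:
        self._server_of[user_id] = server_id

    def unregister(self, user_id: str) -> None:
        self._server_of.pop(user_id, None)

    def lookup(self, user_id: str) -> Optional[str]:
        return self._server_of.get(user_id)


class ChatServer:
    """Hypothetical Chat Server that records what it pushes to its users."""

    def __init__(self, server_id: str, registry: ConnectionRegistry) -> None:
        self.server_id = server_id
        self.registry = registry
        # (recipient, text) pairs pushed over this server's local connections
        self.delivered: List[Tuple[str, str]] = []

    def connect(self, user_id: str) -> None:
        # On WebSocket open, advertise that this server holds the user.
        self.registry.register(user_id, self.server_id)

    def push(self, user_id: str, text: str) -> None:
        self.delivered.append((user_id, text))


def route(
    servers: Dict[str, ChatServer],
    registry: ConnectionRegistry,
    offline_queue: Dict[str, List[str]],
    recipient: str,
    text: str,
) -> str:
    """Forward text to the recipient's server, or queue it if they are offline."""
    server_id = registry.lookup(recipient)
    if server_id is None:
        offline_queue.setdefault(recipient, []).append(text)
        return "queued"
    servers[server_id].push(recipient, text)
    return server_id
```

With Alice registered on one server and Bob on another, `route(servers, registry, queue, "bob", "hi")` returns the ID of Bob's server after pushing the message there. A recipient with no registry entry has the message queued for delivery on reconnect, matching the persistence requirement.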

Sample Quiz Questions

1. Why is WebSocket preferred over HTTP for a chat application?

Difficulty: easy

2. Alice is connected to Chat Server 1. Bob is connected to Chat Server 2. Alice sends Bob a message. How is it delivered?

Difficulty: medium

3. Why is Cassandra chosen for message storage over PostgreSQL?

Difficulty: medium

+ 19 more questions available in the full app.
