Overview
CDN is the highest-leverage optimization for global applications. The mental model: instead of having all users travel to your data center, bring your data center to each user. Master Cache-Control headers and invalidation strategies; these are what determine CDN effectiveness.

The Problem CDNs Solve

A server in us-east-1 (Virginia) serves a user in Tokyo. That round trip is roughly 14,000 km, adding 150–200 ms of latency before a single byte renders. Multiply this across millions of global users and two problems compound: high latency for distant users and crushing load on a single origin. A Content Delivery Network solves both simultaneously by caching content at geographically distributed Points of Presence (PoPs): edge servers close to users.

The CDN Request Lifecycle

Key response headers:

X-Cache: Hit — served from edge cache
X-Cache: Miss — edge fetched from upstream
Age — seconds the object has been cached
CF-Cache-Status — Cloudflare equivalent

Cache-Control: The Core Directive

s-maxage overrides max-age for shared caches (CDNs). Set Cache-Control: max-age=3600, s-maxage=60 and browsers cache for 1 hour while the CDN refreshes every minute.

Anycast Routing

CDNs advertise the same IP from every PoP via BGP Anycast. Internet routing automatically directs each user's packets to the topologically nearest PoP. A user in Frankfurt might hit the Amsterdam PoP depending on ISP peering: topology, not geography, decides.

Cache Hit Ratio

At a 95% cache hit ratio (CHR), your origin handles only 5% of traffic. On 1M req/day that is 50K origin hits instead of 1M, a 20x reduction. CHR killers:

Short TTLs on rarely-changing content
Unbounded query strings — each unique query string is a distinct cache key (thousands of unique values = near-zero hit rate)
Cookie-based cache key variation

How PoPs Are Structured Internally

Each CDN Point of Presence is not a single server; it is a cluster of cache servers, sometimes called edge caches or cache nodes, operating together.
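The mapping from URL to cache node within such a cluster is typically done with consistent hashing, so the same URL always lands on the same node and adding or removing a node remaps only a small fraction of keys. A minimal sketch, with hypothetical node names and virtual-node count:

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Maps URLs to cache nodes; changing the node set only remaps ~1/N of keys."""

    def __init__(self, nodes, vnodes=100):
        # Each physical node gets `vnodes` positions on the ring for balance.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._keys = [h for h, _ in self._ring]

    @staticmethod
    def _hash(key: str) -> int:
        # 64-bit position on the ring derived from an MD5 digest.
        return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

    def node_for(self, url: str) -> str:
        # First ring position clockwise from the URL's hash (wrapping around).
        idx = bisect.bisect(self._keys, self._hash(url)) % len(self._keys)
        return self._ring[idx][1]

ring = ConsistentHashRing(["cache-1", "cache-2", "cache-3"])
# The same URL deterministically maps to the same node within the PoP.
assert ring.node_for("/img/logo.png") == ring.node_for("/img/logo.png")
```

Real CDNs use far more virtual nodes and weight them by node capacity, but the routing idea is the same.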
When a request arrives at a PoP:

1. Consistent hashing routes the request to a specific cache node based on the URL, ensuring the same URL always maps to the same node (maximising hit rate within the PoP).
2. If that node misses, it may check a PoP-local second tier before going upstream.
3. The response is written to the node's SSD/memory cache for future requests.

Large CDNs have a tiered architecture within each PoP as well: an L1 cache (memory, tiny/fast) and an L2 cache (SSD, larger/slightly slower). This mirrors CPU cache hierarchy principles.

CloudFront Architecture Specifics

AWS CloudFront is the CDN most commonly discussed in system design interviews because of its AWS ecosystem integration. Key CloudFront concepts:

Distribution: Your CDN configuration — maps one or more domains to one or more origins
Behaviour: A path-pattern rule (e.g., /api/*, /images/*, or the default *) that defines cache policy, TTL, allowed methods, and which origin to use
Cache Policy: Reusable TTL and cache key configuration attached to a behaviour
Origin Groups: Failover groups — if the primary origin returns 5xx, CloudFront automatically retries against a secondary origin

Lambda@Edge: Run Node.js/Python at CloudFront PoPs with 4 trigger points — viewer request, origin request, origin response, viewer response.
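As a sketch of the viewer-request trigger: a Lambda@Edge function receives the request out of the CloudFront event, may modify it, and returns it to continue processing. The URI-lowercasing logic here is purely illustrative:

```python
def handler(event, context):
    # Viewer-request trigger: runs at the PoP before CloudFront checks its cache.
    request = event["Records"][0]["cf"]["request"]

    # Illustrative cache-key normalisation: lower-case the URI so that
    # /Logo.PNG and /logo.png don't become two separate cache entries.
    request["uri"] = request["uri"].lower()

    # Returning the request lets CloudFront continue; returning a response
    # object instead would short-circuit and answer directly from the edge.
    return request

# Minimal event shaped like CloudFront's viewer-request payload.
event = {"Records": [{"cf": {"request": {"uri": "/Images/Logo.PNG", "headers": {}}}}]}
print(handler(event, None)["uri"])  # → /images/logo.png
```

Origin-request and origin-response triggers use the same event envelope but fire only on cache misses, which makes them the cheaper place for per-origin logic.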