Hard · Design: Web Crawler
Which partitioning strategy for the URL frontier prevents a single fast-to-crawl domain from monopolizing crawlers?
Answer Options
A. Partition by URL hash
B. One queue per domain with per-domain rate limiting
C. Priority queue sorted by crawl time
D. Round-robin assignment
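To make option B concrete, here is a minimal sketch of a frontier that keeps one queue per domain and enforces a per-domain delay. The class name `DomainFrontier`, the `delay_seconds` parameter, and the `fast.example` / `slow.example` hosts are illustrative assumptions, not part of the question, and time is passed in explicitly so the behaviour is easy to follow.

```python
import heapq
from collections import deque
from urllib.parse import urlsplit


class DomainFrontier:
    """Hypothetical frontier: one FIFO queue per domain plus a per-domain delay."""

    def __init__(self, delay_seconds=1.0):
        self.delay = delay_seconds
        self.queues = {}        # domain -> deque of pending URLs
        self.next_allowed = {}  # domain -> earliest time of the next fetch
        self.ready = []         # min-heap of (next_allowed_time, domain)

    def add(self, url, now=0.0):
        domain = urlsplit(url).netloc
        if domain not in self.queues:
            # First pending URL for this domain: schedule it, still
            # honouring any delay left over from an earlier fetch.
            self.queues[domain] = deque()
            start = max(now, self.next_allowed.get(domain, now))
            heapq.heappush(self.ready, (start, domain))
        self.queues[domain].append(url)

    def next_url(self, now):
        """Return a URL whose domain may be fetched at `now`, else None."""
        if not self.ready or self.ready[0][0] > now:
            return None
        _, domain = heapq.heappop(self.ready)
        queue = self.queues[domain]
        url = queue.popleft()
        self.next_allowed[domain] = now + self.delay
        if queue:
            # More work for this domain: reschedule it after the delay.
            heapq.heappush(self.ready, (now + self.delay, domain))
        else:
            del self.queues[domain]
        return url


frontier = DomainFrontier(delay_seconds=10.0)
for u in ["http://fast.example/1", "http://fast.example/2", "http://slow.example/a"]:
    frontier.add(u, now=0.0)

first = frontier.next_url(0.0)   # one URL from fast.example
second = frontier.next_url(0.0)  # fast.example is now waiting, so slow.example goes next
third = frontier.next_url(0.0)   # nothing is eligible yet
```

In this sketch a domain with many queued URLs gets at most one fetch per delay window, so other domains are served in between regardless of how quickly the first one responds.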
This question is from the Design: Web Crawler topic (System Design Cases).
More Design: Web Crawler Questions
Why is BFS (Breadth-First Search) preferred over DFS for web crawling? (Hard)
What is the purpose of a Bloom filter in a web crawler? (Hard)
A website generates infinite unique URLs like /products?page=1, /products?page=2 ... /products?page=1000000. How do you handle this? (Hard)
What does robots.txt's 'Crawl-delay: 10' directive mean? (Hard)
How do you crawl JavaScript-rendered Single Page Applications (SPAs)? (Hard)