Design: Web Crawler (Hard)

A fetcher node crashes after popping a URL from the queue but before completing the crawl. How do you prevent URL loss?

Answer Options

A. The URL is lost permanently
B. Use in-flight tracking: mark the URL as 'crawling' with a TTL; if the crawl is not completed within 5 minutes, return the URL to the queue
C. Use a transaction log on the fetcher
D. Duplicate all URLs in two queues
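
The mechanism described in option B can be sketched as follows. This is a minimal, single-process illustration, not a production design: the class name `InFlightQueue`, its methods, and the injectable `clock` parameter are all hypothetical, and a real crawler would presumably keep this state in a shared store so it survives a fetcher crash.

```python
import time
from collections import deque


class InFlightQueue:
    """Hypothetical in-memory URL queue with in-flight tracking (illustrative only)."""

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self.ttl = ttl_seconds      # 5 minutes, as in option B
        self.clock = clock          # injectable so the timeout can be simulated
        self.pending = deque()      # URLs waiting to be crawled
        self.in_flight = {}         # url -> deadline after which it is re-queued

    def push(self, url):
        self.pending.append(url)

    def pop(self):
        """Hand out the next URL and mark it as 'crawling' with a deadline."""
        self.requeue_expired()
        if not self.pending:
            return None
        url = self.pending.popleft()
        self.in_flight[url] = self.clock() + self.ttl
        return url

    def complete(self, url):
        """Called by the fetcher once the crawl has finished."""
        self.in_flight.pop(url, None)

    def requeue_expired(self):
        """Return URLs whose deadline has passed to the back of the queue."""
        now = self.clock()
        expired = [u for u, deadline in self.in_flight.items() if deadline <= now]
        for url in expired:
            del self.in_flight[url]
            self.pending.append(url)
        return expired
```

Under this scheme a URL popped by a fetcher that crashes is never acknowledged with `complete`, so it reappears in the queue once its deadline passes. The trade-off is at-least-once delivery: a slow but healthy fetcher may exceed the TTL and cause the same URL to be crawled twice, so the crawl step should tolerate duplicates.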

This question is from the Design: Web Crawler topic (System Design Cases).
