Design: Web Crawler (Hard)
What does robots.txt's 'Crawl-delay: 10' directive mean?
Answer Options
A. Maximum 10 concurrent connections to this domain
B. Wait 10 seconds between consecutive requests to this domain
C. Only crawl for 10 minutes per day
D. Crawl only 10 pages from this domain
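As background on the directive itself, here is a minimal sketch of how a polite crawler can read `Crawl-delay` using Python's standard-library `urllib.robotparser`. The robots.txt content below is a hypothetical example made up for illustration:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content used for illustration.
robots_txt = """\
User-agent: *
Crawl-delay: 10
Disallow: /private/
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# crawl_delay() returns the number of seconds a polite crawler
# should wait between consecutive requests to this domain.
delay = rp.crawl_delay("*")
print(delay)  # 10

# can_fetch() checks the Disallow rules for a given URL.
print(rp.can_fetch("*", "https://example.com/private/page"))  # False
```

A real crawler would fetch robots.txt with `RobotFileParser.set_url()` and `read()`, then sleep for `delay` seconds between requests to the same domain.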
This question is from the Design: Web Crawler topic (System Design Cases).
More Design: Web Crawler Questions
Why is BFS (Breadth-First Search) preferred over DFS for web crawling? (Hard)
What is the purpose of a Bloom filter in a web crawler? (Hard)
A website generates infinite unique URLs like /products?page=1, /products?page=2 ... /products?page=1000000. How do you handle this? (Hard)
How do you crawl JavaScript-rendered Single Page Applications (SPAs)? (Hard)
A fetcher node crashes after popping a URL from the queue but before completing the crawl. How do you prevent URL loss? (Hard)