The Yahoo! Search Blog has a great posting on how the Slurp bot tries to conserve bandwidth by making use of compression and cache-control headers. [here]
As a Web performance fanatic, it is heartening to see that these folks have taken such care, and put such thought into their indexing crawler. They want it to be accurate, but they don’t want to slam your site.
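For concreteness, here is a minimal Python sketch (my own, not Yahoo's crawler code) of the two bandwidth savers the posting describes: a conditional GET with If-Modified-Since, which lets an unchanged page come back as a bodiless 304, and gzip negotiation through Accept-Encoding. The host name and the date are placeholders.

    import gzip
    import http.client

    # Placeholder host; any server that honours conditional GETs and gzip will do.
    conn = http.client.HTTPConnection("www.example.com")
    conn.request("GET", "/", headers={
        "User-Agent": "polite-crawler-sketch",
        "Accept-Encoding": "gzip",                             # ask for a compressed body
        "If-Modified-Since": "Sat, 12 Feb 2005 00:00:00 GMT",  # conditional GET
    })
    resp = conn.getresponse()

    if resp.status == 304:
        # Unchanged since the last visit: no body is transferred at all.
        print("Not modified; reuse the cached copy")
    else:
        body = resp.read()
        if resp.getheader("Content-Encoding") == "gzip":
            body = gzip.decompress(body)                       # server honoured gzip
        print("Fetched %d bytes after decompression" % len(body))
    conn.close()

Either mechanism alone saves bytes; together, a well-behaved crawler only pulls a full, uncompressed page when the server supports neither.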
A while back, I had to write a robots.txt file for WebPerformance to keep the MSNBot from stomping the site on a daily basis. The site uses frames and query variables to produce the various performance graphs, and MSNBot was re-indexing every page and every variation almost every day. Finally, I told that one crawler, and only that one, to go away; all the others are welcome. Maybe MSN Search should take a page from the Yahoo! (Inktomi) bot development team.
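For the record, here is roughly the shape of that robots.txt, shown as a sketch rather than the exact file I deployed: only msnbot is turned away, and every other crawler is left alone. The sample URL is invented, and Python's standard urllib.robotparser is used just to check that the rules behave as intended.

    from urllib.robotparser import RobotFileParser

    # Block only MSNBot; the wildcard record with an empty Disallow
    # leaves every other crawler free to index the whole site.
    ROBOTS_TXT = """\
    User-agent: msnbot
    Disallow: /

    User-agent: *
    Disallow:
    """

    rp = RobotFileParser()
    rp.parse(ROBOTS_TXT.splitlines())

    # Invented URL, standing in for the framed, query-driven graph pages.
    print(rp.can_fetch("msnbot", "/graphs/?site=example&range=30d"))  # False
    print(rp.can_fetch("Slurp", "/graphs/?site=example&range=30d"))   # True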
2005-02-12 — 16:21
Yahoo’s search engine HTTP advice is of high interest
Yahoo! Search recently gave some well-known advice in their blog, worth recalling again and again, for optimizing the bandwidth needs of websites and accelerating them as a side effect.
Basically, they advertise that their robot/crawler/spider (i.e. the s