seo - robots.txt: disallow all but a select few, why not?


I've been thinking for a while about disallowing every crawler from my site except Ask, Google, Microsoft, and Yahoo!.

The reason behind this is that I have never seen any traffic arising from any other web crawler.
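As a rough illustration of the whitelist idea, here is a minimal sketch that checks a hypothetical robots.txt with Python's standard `urllib.robotparser`. The user-agent tokens (`Googlebot`, `bingbot`, `Slurp`, `Teoma`) are assumptions standing in for the four search engines named above, and `is_allowed` is a helper invented for this example; each engine's documentation should be checked for the token it actually uses.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical whitelist robots.txt. The user-agent tokens are assumptions
# for Google, Microsoft, Yahoo! and Ask; verify them before relying on this.
ROBOTS_TXT = """\
User-agent: Googlebot
User-agent: bingbot
User-agent: Slurp
User-agent: Teoma
Disallow:

User-agent: *
Disallow: /
"""


def is_allowed(user_agent: str, url: str = "/") -> bool:
    """Return True if the robots.txt above lets `user_agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("Googlebot", "Slurp", "SomeNewBot"):
        print(agent, is_allowed(agent))
```

The empty `Disallow:` line lets the listed crawlers fetch everything, while the `*` group turns away every other crawler that honours robots.txt. Crawlers that never request the file are unaffected either way.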

My questions are:

  • Is there any reason not to do this?
  • Has anyone done this?
  • Have you seen any negative effect?
    Update:
    So far I've used the blacklist approach: if I do not like a crawler, I add it to the disallow list.
    I am no fan of blacklisting, however, because it is a never-ending story: there are always more crawlers out there.

    I am not so worried about the really ugly, misbehaving crawlers; they are detected and blocked automatically. (And they usually do not ask for robots.txt anyway.)

    However, many crawlers are not really misbehaving in any way; they just do not seem to generate any value for me or my customers.
    For example, there are a couple of crawlers powering websites that claim they will be the next Google, only better. I have never seen any traffic coming from them, and I am quite sceptical that they will become better than any of the four search engines mentioned above.

    Update 2:
    I have been analyzing traffic on several sites for a while, and it seems that for reasonably small sites, with about 100 unique human visitors a day (= visitors that I cannot identify as being non-human), about 52% of the generated traffic comes from automated processes.

    60% of all automated visitors do not read robots.txt; 40% (21% of total traffic) do request robots.txt (this includes Ask, Google, Microsoft, and Yahoo!).

    So my thinking is: if I block all the well-behaved crawlers that do not seem to generate any value for me, I could reduce bandwidth usage and server load by around 12% to 17%.
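The arithmetic behind these figures can be checked with a short sketch. The percentages are the ones quoted in the update; the function name, the reading of the saving as a range of 12% to 17%, and the idea that load scales in proportion to traffic share are assumptions made for illustration.

```python
def traffic_breakdown(automated_share: float = 0.52,
                      reads_robots: float = 0.40,
                      savings: tuple = (0.12, 0.17)) -> dict:
    """Split total traffic into shares, using the figures quoted in the post."""
    # Crawlers that request robots.txt, as a share of ALL traffic.
    well_behaved = automated_share * reads_robots
    # Automated visitors that never request robots.txt.
    ignores_robots = automated_share * (1.0 - reads_robots)
    # Assumption: the saving equals the traffic share of the blocked crawlers,
    # so whatever is left of the well-behaved share belongs to the whitelist.
    low, high = savings
    whitelisted = (well_behaved - high, well_behaved - low)
    return {
        "human": 1.0 - automated_share,
        "well_behaved": well_behaved,
        "ignores_robots": ignores_robots,
        "whitelisted_range": whitelisted,
    }


if __name__ == "__main__":
    for name, value in traffic_breakdown().items():
        print(name, value)
```

Under these assumptions the robots.txt-reading crawlers account for 0.52 × 0.40 ≈ 21% of all traffic, and the estimated saving implies that the four whitelisted engines make up roughly 4% to 9% of total traffic. The remaining automated traffic, about 31%, comes from visitors that ignore robots.txt and would not be affected by a whitelist.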

    The Internet is a publishing mechanism. If you want to whitelist your site, you are going against the grain, but that's fine.

    Do you want to whitelist your site?

    Keep in mind that badly behaved bots that ignore robots.txt are not affected anyway (obviously), and well-behaved bots are probably there for a good reason; it's just that the reason is opaque to you.

