I have several sites that are particularly large, as in 10,000 posts and more, so there is quite a bit of fodder for search engines. My access logs are full of them: Googlebot fetches a new page every few seconds and so does Bing; Yahoo only makes a few hundred requests per day, but Yandex and Baidu are all over my logs, too.
Since I'm only after US traffic, all those foreign search engines are pretty useless to me, yet they eat up valuable resources. The big ones are already covered by the .htaccess bot-blocking code I posted earlier; I was just wondering if there was a way to allow only Google and humans, because at the end of the day I don't want to have to check my access logs every day for new offenders to add. But it looks like that's the route I'll have to take.
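For anyone curious, the general shape of that kind of rule set looks something like this (a rough sketch assuming Apache with mod_rewrite enabled, not my exact snippet; the bot patterns are just examples, so swap in whatever shows up in your own logs):

    <IfModule mod_rewrite.c>
    RewriteEngine On
    # Block anything identifying itself as a bot/crawler/spider (example patterns only)...
    RewriteCond %{HTTP_USER_AGENT} (bot|crawl|spider|slurp) [NC]
    # ...unless it's Googlebot, which should still be allowed to index the site
    RewriteCond %{HTTP_USER_AGENT} !googlebot [NC]
    # Return 403 Forbidden instead of serving the page
    RewriteRule .* - [F,L]
    </IfModule>

Of course this only catches crawlers that honestly identify themselves in the User-Agent header; anything spoofing a regular browser sails right through, which is why checking the logs never completely goes away.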
Since implementing that code on the site about two hours ago, visits per hour have already been cut in half, so it looks like it's working!