Skip to content

Block bad, possibly even malicious web crawlers (automated bots) using htaccess.

Notifications You must be signed in to change notification settings

chrisfu/bad-bot-blocker

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

49 Commits
 
 
 
 

Repository files navigation

Bad Bot Blocker

223 Apache htaccess rules to block bad bots.

Bad bots are defined as:

  • E-mail harvesters
  • Content scrapers
  • Spam bots
  • Vulnerability scanners
  • Aggressive bots that provide little value
  • Bots linked to viruses or malware
  • Government surveillance bots
  • Russian search engine Yandex
  • Chinese search engine Baidu

Yandex/Baidu

Unless your website is written in Russian or Chinese, you probably don't get any traffic from them. They mostly just waste bandwidth and consume resources.

Bots Are Liars

Bots try to make themselves look like other software by disguising their useragent. Their useragents may look harmless, perfectly legitimate even. For example, "^Java" but according to Project Honeypot, it's actually one of the most dangerous.

Setup

If you have a bizarre or complicated setup, be sure to look everything over before using it. But for anyone with something that resembles a standard Apache installation, this should work without any issues.

About

Block bad, possibly even malicious web crawlers (automated bots) using htaccess.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published