The great thing about running your own hardware is that you can do anything you want with it. I don’t support any customers beyond people who may want to read my blog (I count those in the tens per month) and Bots/Crawlers/Attackers.
As the server is mine and I get to dictate the people who get to visit it, I set some simple ground rules:
- Use a modern browser
- Don’t launch probing attacks or attempt traversal attacks
- Don’t be a dick.
Based on that, I set up the following User-Agent rewrite rules to filter out old browsers, known scanners, and other bringers of mayhem.
# UA blocks
RewriteCond %{HTTP_USER_AGENT} Bytespider [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Gort [NC,OR]
RewriteCond %{HTTP_USER_AGENT} zgrab [NC,OR]
RewriteCond %{HTTP_USER_AGENT} okhttp [NC,OR]
RewriteCond %{HTTP_USER_AGENT} python-requests [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^-?$ [OR]
RewriteCond %{HTTP_USER_AGENT} ".*MSIE.*" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} libredtail-http [NC,OR]
RewriteCond %{HTTP_USER_AGENT} GPTBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} OAI-SearchBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ChatGPT-User [NC,OR]
RewriteCond %{HTTP_USER_AGENT} CMS-Checker [NC,OR]
RewriteCond %{HTTP_USER_AGENT} UCBrowser [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "Opera/" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ".*Edge/.*" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ".*Edg/(14[0-5]|1[0-3][0-9]|[1-9][0-9]|[1-9])\..*" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ".*Chrome/(14[0-5]|14[5-6]|1[0-3][0-9]|[1-9][0-9]|[1-9])\..*" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ".*Firefox/(14[1-6]|1[0-3][0-9]|[1-9][0-9]|[1-9])\..*" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ".*Version/(2[0-5]|1[0-7]|19|[0-9])\..*Safari/.*" [NC]
RewriteRule .* - [F,L]
Some of the items highlighted above may give you pause, but see rule number 3 above.
You will also notice some carve-outs for odd browser versions. There are LTS versions of most browsers, and I wanted to ensure that they can get through the rewrites unscathed.
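To see which versions the carve-outs let through, here is a minimal Python sketch that replays the five version conditions against a few sample User-Agent strings. It assumes Python's `re.search` with `re.IGNORECASE` behaves like an unanchored `RewriteCond` pattern with `[NC]` for these expressions; the function name `blocked_by_version` and the sample strings are made up for illustration and are not part of the server config.

```python
import re

# The browser-version conditions, copied from the rules above.
VERSION_PATTERNS = [
    r".*Edge/.*",
    r".*Edg/(14[0-5]|1[0-3][0-9]|[1-9][0-9]|[1-9])\..*",
    r".*Chrome/(14[0-5]|14[5-6]|1[0-3][0-9]|[1-9][0-9]|[1-9])\..*",
    r".*Firefox/(14[1-6]|1[0-3][0-9]|[1-9][0-9]|[1-9])\..*",
    r".*Version/(2[0-5]|1[0-7]|19|[0-9])\..*Safari/.*",
]


def blocked_by_version(user_agent: str) -> bool:
    """Return True if any version condition matches (hypothetical helper)."""
    return any(
        re.search(pattern, user_agent, re.IGNORECASE)
        for pattern in VERSION_PATTERNS
    )


# Illustrative User-Agent strings, not taken from real logs.
SAMPLES = {
    "Firefox 140": "Mozilla/5.0 (X11; Linux x86_64; rv:140.0) Gecko/20100101 Firefox/140.0",
    "Firefox 141": "Mozilla/5.0 (X11; Linux x86_64; rv:141.0) Gecko/20100101 Firefox/141.0",
    "Safari 17": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
                 "(KHTML, like Gecko) Version/17.6 Safari/605.1.15",
    "Safari 18": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
                 "(KHTML, like Gecko) Version/18.5 Safari/605.1.15",
}

if __name__ == "__main__":
    for label, ua in SAMPLES.items():
        print(label, "blocked" if blocked_by_version(ua) else "allowed")
```

By my reading of the patterns, Firefox 140 and Safari 18 are the versions that slip through, while their neighbours on either side of the carve-out are rejected.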
All of these rules produce HTTP/403 responses, which I am very happy with. Even with a 403 response code, so many of these keep coming back for more.
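The 403 comes from the `[F]` flag on the final `RewriteRule`: the conditions are chained with `[OR]`, so a single match is enough to forbid the request. A rough Python sketch of that logic follows, using only a handful of the conditions. The function name `response_status` is hypothetical, and returning 200 for everything else is a simplifying assumption, since the real status depends on what the server does with the request afterwards.

```python
import re

# A subset of the conditions above, as (pattern, case_insensitive) pairs.
CONDITIONS = [
    (r"Bytespider", True),
    (r"python-requests", True),
    (r"^-?$", False),      # empty or "-" User-Agent; no [NC] on this one
    (r".*MSIE.*", True),
]


def response_status(user_agent: str) -> int:
    """Return 403 if any condition matches, mirroring the [OR] chain.

    The 200 fallback is an assumption made for this sketch.
    """
    for pattern, case_insensitive in CONDITIONS:
        flags = re.IGNORECASE if case_insensitive else 0
        if re.search(pattern, user_agent, flags):
            return 403
    return 200


if __name__ == "__main__":
    print(response_status("python-requests/2.31.0"))
    print(response_status(""))
```

Note that the empty pattern also catches clients that send no User-Agent header at all, since Apache then sees an empty string.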
I am also amazed at how lazy some teams are with spoofing browser versions. If you are going to take the time to create a browser that is going to trawl the web for riches and data, at least take the time to try and be the most recent version of the browser you can be, or else people like me will just shut you down.