Block Networks That Host Bad Bots With Cloudflare Firewall

You might have seen my post on how to block bad bots that steal your content and bandwidth. However, sometimes blocking bots by looking at the user agent might not be enough. This is because the user agent can be easily spoofed, and many bots do not honestly declare themselves.

A more effective way of stopping bad bots is by blocking or challenging the networks where the bots are hosted. This will target all of them, regardless of user agent. Setting up a block or challenge is easily done using the Cloudflare WAF (Web Application Firewall).

How to look for bot traffic in a log

Bot traffic is usually easy to spot in your access logs. In the excerpt below, you can see the bot is making requests about a second apart, something a human visitor would never do. You will also notice that the files being requested are old backup and configuration files that the sysadmin may have left on the server.
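If you would rather not eyeball the log, the two signals described above (requests spaced about a second apart, and probes for leftover backup or configuration files) can be checked with a small script. This is only a rough sketch: it assumes your server writes the common "combined" access log format, the sample lines are made up for illustration (they are not taken from my logs), and the suffix list and thresholds are hypothetical starting points you will want to tune.

```python
import re
from collections import defaultdict
from datetime import datetime

# Matches the start of a combined-format access log line:
# client IP, timestamp, request method and path.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] "(?P<method>[A-Z]+) (?P<path>\S+)'
)

# Hypothetical list of file endings that scanners tend to probe for.
SUSPICIOUS_SUFFIXES = (".bak", ".sql", ".zip", ".tar.gz", ".env", ".old")


def find_suspicious_ips(lines, max_gap_seconds=2, min_requests=3):
    """Return {ip: [reasons]} for clients whose requests look automated."""
    hits = defaultdict(list)  # ip -> list of (timestamp, path)
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        ts = datetime.strptime(match["ts"], "%d/%b/%Y:%H:%M:%S %z")
        hits[match["ip"]].append((ts, match["path"]))

    suspicious = {}
    for ip, requests in hits.items():
        requests.sort()
        # Seconds between each request and the next one from the same IP.
        gaps = [
            (later[0] - earlier[0]).total_seconds()
            for earlier, later in zip(requests, requests[1:])
        ]
        reasons = []
        if len(requests) >= min_requests and all(g <= max_gap_seconds for g in gaps):
            reasons.append("rapid requests")
        if any(
            path.split("?")[0].lower().endswith(SUSPICIOUS_SUFFIXES)
            for _, path in requests
        ):
            reasons.append("probes for backup/config files")
        if reasons:
            suspicious[ip] = reasons
    return suspicious


# Made-up sample lines using documentation-only IP ranges.
SAMPLE_LOG = [
    '203.0.113.7 - - [10/Oct/2024:13:55:36 +0000] "GET /backup.zip HTTP/1.1" 404 153 "-" "Mozilla/5.0"',
    '203.0.113.7 - - [10/Oct/2024:13:55:37 +0000] "GET /wp-config.php.bak HTTP/1.1" 404 153 "-" "Mozilla/5.0"',
    '203.0.113.7 - - [10/Oct/2024:13:55:38 +0000] "GET /.env HTTP/1.1" 404 153 "-" "Mozilla/5.0"',
    '198.51.100.20 - - [10/Oct/2024:13:50:00 +0000] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '198.51.100.20 - - [10/Oct/2024:13:52:41 +0000] "GET /about/ HTTP/1.1" 200 4210 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    for ip, reasons in find_suspicious_ips(SAMPLE_LOG).items():
        print(ip, "->", ", ".join(reasons))
```

To run it against a real log, pass an open file instead of the sample list, for example `find_suspicious_ips(open("access.log"))`, where the file name is whatever your server uses. The IPs it reports are the ones worth looking up in the next step.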

How to find the networks the bots are hosted on

Copy the source IP address, go to Cloudflare Radar and paste the address into the search field. After you search for the IP, Cloudflare will show you some information about the ASN (Autonomous System Number), which identifies the network the address belongs to.

How to create a firewall rule that blocks or challenges the bots

Next, go to WAF and create a new rule. It can be named anything you like. Choose “AS Num” in Field, “equals” in Operator and enter the AS number in the Value field.

Notice that I also include a condition that exempts bots known to Cloudflare from the rule, in order to avoid blocking search engine bots, for example. Whether you want to allow these is entirely up to you. You should be aware that this list also includes AI web scrapers.

[Screenshot: Cloudflare WAF rule matching on ASN]
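If you manage more than a handful of networks, it can be easier to paste an expression into the rule's expression editor than to click through the form. The helper below is a minimal sketch that builds such an expression as a string. The function name is my own invention, and the field names `ip.src.asnum` and `cf.client.bot` are what I understand the Cloudflare Rules language to use for "AS Num" and "Known Bots" (older rules may show `ip.geoip.asnum` instead), so compare the result with the expression preview in your own dashboard before relying on it.

```python
def build_asn_rule(asns, exempt_known_bots=True):
    """Build a rule expression that matches traffic from the given AS numbers.

    Accepts values such as 14618, "14618" or "AS14618".
    """
    numbers = set()
    for asn in asns:
        text = str(asn).strip().upper()
        if text.startswith("AS"):
            text = text[2:]
        if not text.isdigit():
            raise ValueError(f"not an AS number: {asn!r}")
        numbers.add(int(text))
    if not numbers:
        raise ValueError("at least one AS number is required")

    ordered = sorted(numbers)
    if len(ordered) == 1:
        expression = f"(ip.src.asnum eq {ordered[0]})"
    else:
        # Several networks fit in one rule by using a set with "in".
        expression = "(ip.src.asnum in {" + " ".join(str(n) for n in ordered) + "})"

    if exempt_known_bots:
        # Assumed field for Cloudflare's "Known Bots" toggle.
        expression += " and not cf.client.bot"
    return expression


if __name__ == "__main__":
    print(build_asn_rule(["AS14618"]))
    print(build_asn_rule(["AS14618", "AS16509"], exempt_known_bots=False))
```

Passing `exempt_known_bots=False` leaves out the known-bots exemption, which is the choice to make if you also want to challenge crawlers on Cloudflare's list.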

For the action to take, I prefer “Managed Challenge”, but you can choose “Block”, “JS Challenge”, or “Interactive Challenge” as well. You can read about the different challenge types in the Cloudflare Docs.

I choose to challenge rather than block as a precaution, for the very rare case that a human is visiting from that ASN. These networks are usually used for hosting servers and not for providing internet connections to people, but I prefer to be more democratic and give humans a chance to visit.

Some common networks that often host bad bots

The networks listed below are the ones I have personally seen traffic from in my logs, but there are surely many more — these are just some of the worst offenders.

AS14618 AMAZON-AES: currently 94% bot traffic
AS136907 HWCLOUDS-AS-AP: currently 73% bot traffic
AS16509 AMAZON-02: currently 86% bot traffic
AS210743 BABBAR-AS: currently 100% bot traffic

For the full list I currently challenge, take a look at this post on ASNs.

Why not just challenge all traffic besides good bots?

You can easily set up Cloudflare to challenge all incoming traffic, and some sites choose to do that. The upside of this is that it will stop 99% of all bots. The downside is that human visitors will have to wait for the Cloudflare check to finish, and in some cases have to click a checkbox. This worsens the user experience.

Another option is to segment your site and challenge traffic to sensitive areas only (such as logins or admin areas). This increases security in those areas while not slowing down page loads for normal users.
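A rule that only covers sensitive areas matches on the request path instead of the ASN. The sketch below builds such an expression. The helper name and the example paths are hypothetical (the paths assume a WordPress site), and `starts_with` and `http.request.uri.path` are the function and field as I understand them from the Cloudflare Rules language, so check the result in the expression preview before you deploy it.

```python
def build_path_rule(path_prefixes, exempt_known_bots=True):
    """Build a rule expression that matches requests under the given paths."""
    clauses = []
    for prefix in path_prefixes:
        if not prefix.startswith("/"):
            raise ValueError(f"path must start with '/': {prefix!r}")
        if '"' in prefix or "\\" in prefix:
            # Keep the sketch simple: no string escaping is attempted.
            raise ValueError(f"unsupported character in path: {prefix!r}")
        clauses.append(f'starts_with(http.request.uri.path, "{prefix}")')
    if not clauses:
        raise ValueError("at least one path is required")

    expression = "(" + " or ".join(clauses) + ")"
    if exempt_known_bots:
        # Assumed field for Cloudflare's "Known Bots" toggle.
        expression += " and not cf.client.bot"
    return expression


if __name__ == "__main__":
    # Example paths for a WordPress site; adjust them to your own setup.
    print(build_path_rule(["/wp-login.php", "/wp-admin"]))
```

Pair the resulting expression with the "Managed Challenge" action, so that only visitors to those paths see a check.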

Do you need help blocking scrapers and hacking bots from your websites?

If you have no time or interest in doing this work by yourself, I completely understand. If you choose to get my website protection service, all of this and much more is already included.

Do you know of any other networks with lots of bad bot traffic? Let me know!

WebLynx