Block AI crawlers, AI assistants and AI search bots with Cloudflare
AI companies are stealing your website content
We have seen numerous reports in the news on how AI companies including OpenAI, Meta and Google are scraping website content without getting the consent of website owners.
Although some companies do respect the rules set in robots.txt, many do not. For this reason, many businesses, as well as individual artists and bloggers, are naturally interested in protecting their content.
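For reference, here is a minimal sketch of what such robots.txt rules look like, generated and checked with Python's standard library. The user agent names are the crawler examples mentioned later in this article, and example.com is only a placeholder domain. Keep in mind that robots.txt is a request, not an enforcement mechanism: a bot that ignores it is not stopped by these lines.

```python
from urllib.robotparser import RobotFileParser

# Crawler names taken from the examples in this article; extend as needed.
AI_USER_AGENTS = ["GPTBot", "GoogleOther", "PetalBot"]


def build_robots_txt(user_agents):
    """Return robots.txt text that disallows the whole site for each agent."""
    groups = [f"User-agent: {agent}\nDisallow: /" for agent in user_agents]
    return "\n\n".join(groups) + "\n"


robots_txt = build_robots_txt(AI_USER_AGENTS)
print(robots_txt)

# Check the generated rules with the standard-library parser.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# A listed crawler is denied, an ordinary browser user agent is not affected.
gptbot_allowed = parser.can_fetch("GPTBot", "https://example.com/blog/")
browser_allowed = parser.can_fetch("Mozilla/5.0", "https://example.com/blog/")
print("GPTBot allowed:", gptbot_allowed)
print("Browser allowed:", browser_allowed)
```

The generated text would go into the robots.txt file at the root of your site. Bots that honour the file will stay away; the rest need to be blocked at the network level.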
Until recently, this was pretty hard to do, as webmasters had to configure complex rules on their web servers to detect and block AI bots while still allowing legitimate traffic through.
Cloudflare has now made it much easier to control what access AI bots and crawlers have to your website’s content. If you are not already using Cloudflare to protect and speed up your website, I recommend that you start doing so. You can follow my simple guide on how to add your website to Cloudflare.
Block all AI bots with one click
If you want to quickly block all AI bots and crawlers from accessing your website, you can do so with one simple click. In your Cloudflare account, go to ‘Security’ -> ‘Bots’ and activate the ‘Block AI Bots’ button.
Fine-tune which AI bots you block
If you want more control over which bots you block, you can take advantage of the new categories of AI bots Cloudflare has created. Previously, Cloudflare only had one category, named ‘AI Crawler’, but it has now added two more: ‘AI Assistant’ and ‘AI Search’.
AI Crawler
Examples of AI crawlers include GPTBot, GoogleOther and PetalBot. These are the bots that scrape your data for use in the training of new LLMs. If you don’t want your content to be in the datasets used for AI training, this is the category you want to block.
AI Assistant
This category includes bots such as ChatGPT-User. They are the user agents of AI chatbots looking to fulfil a user request in a chat. For example, a chatbot user may ask the AI a question about a specific company or webpage that the chatbot then visits to provide the answer.
Whether you want to block this category or not is a strategic decision. Without having access to your site, chatbots may not be able to provide users with information about your company or services.
AI Search
In this group, we find bots such as OAI-SearchBot (OpenAI) and Amazonbot. This is a new category of bots, developed by AI companies as tools for their latest product: AI search. Similar to Google or Bing, these bots crawl the web to build their own index of websites, which is then used to answer users’ search queries.
Again, whether you want to block this category or not is a choice you need to make. If you want to appear in searches on SearchGPT or Perplexity AI, for example, you need to allow these crawlers access to your site.
If you want to learn more about Cloudflare’s categories of verified bots, you can check out my full list of Cloudflare verified bots.
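Before deciding which categories to block, it can help to see which of these bots already visit your site. The sketch below tags User-Agent strings (for example, from your access logs) with one of the three categories. The mapping contains only the example bots named above, not Cloudflare's full verified bot list, and the sample User-Agent strings are simplified illustrations rather than the exact strings the bots send. Also note that a User-Agent header can be spoofed, so this is a rough indicator only; Cloudflare's own verification does not rely on the header alone.

```python
# Category -> example user-agent tokens, taken only from this article.
# This is NOT Cloudflare's full verified bot list.
AI_BOT_CATEGORIES = {
    "AI Crawler": ["GPTBot", "GoogleOther", "PetalBot"],
    "AI Assistant": ["ChatGPT-User"],
    "AI Search": ["OAI-SearchBot", "Amazonbot"],
}


def classify_user_agent(user_agent):
    """Return the category whose token appears in the User-Agent, or None."""
    lowered = user_agent.lower()
    for category, tokens in AI_BOT_CATEGORIES.items():
        if any(token.lower() in lowered for token in tokens):
            return category
    return None


# Simplified, illustrative User-Agent strings (hypothetical examples).
samples = [
    "Mozilla/5.0 (compatible; GPTBot/1.0)",
    "Mozilla/5.0 (compatible; ChatGPT-User/1.0)",
    "Mozilla/5.0 (compatible; OAI-SearchBot/1.0)",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/128.0",
]

for ua in samples:
    print(classify_user_agent(ua), "<-", ua)
```

If most of the hits fall under ‘AI Crawler’ and very few under ‘AI Search’, for instance, blocking only the crawler category may be all you need.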
How to block AI bots by category on Cloudflare
Follow the steps below to create a new Cloudflare WAF rule that blocks AI bots:
- Go to ‘Security’ -> ‘WAF’ -> ‘Custom Rules’ and click ‘Create Rule’.
- Give the rule a name, such as ‘block AI bots’.
- In ‘Field’ choose ‘Verified Bot Category’.
- In ‘Operator’ select ‘is in’.
- In ‘Value’ search for ‘AI’ and add any categories you want to block.
- In ‘Choose action’ select ‘Block’.
- Save the rule and you’re done!
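If you prefer the expression editor over the dropdowns, the steps above boil down to a single rule expression. Here is a minimal sketch that builds it in Python. I am assuming that the ‘Verified Bot Category’ field corresponds to `cf.verified_bot_category` in Cloudflare's rules language and that the category values are spelled exactly as they appear in the dashboard; compare the result with the expression preview shown under your rule before relying on it. The `rule` dictionary is only an illustration of the three things the rule needs (name, expression, action), not a confirmed API payload.

```python
import json


def build_block_expression(categories):
    """Build a rule expression matching any of the given bot categories."""
    if not categories:
        raise ValueError("at least one category is required")
    # Values in an 'is in' set are quoted and separated by spaces.
    quoted = " ".join(f'"{name}"' for name in categories)
    # Assumption: 'Verified Bot Category' maps to cf.verified_bot_category.
    return f"(cf.verified_bot_category in {{{quoted}}})"


# Block training crawlers and AI search bots, but leave AI assistants alone.
expression = build_block_expression(["AI Crawler", "AI Search"])

# Illustrative summary of the rule; the key names are hypothetical.
rule = {
    "description": "block AI bots",
    "expression": expression,
    "action": "block",
}

print(json.dumps(rule, indent=2))
```

Changing the list passed to `build_block_expression` is the equivalent of adding or removing categories in the ‘Value’ field.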
Do you need help protecting your content?
If you feel you don’t have the time or expertise to protect your website’s content properly, you can let me do it for you. Simply reach out!