
Browser Support • Re: Cloudflare Verification Loop issues

Pale Moon forum - Forum index [Unofficial] May 16, 2026

I want to know what insights you have on how to achieve that. Any concrete ideas?

Copy what I've done. I'll post updates if and when I make them. So far this has completely solved the problem and rarely needs updating. It was a night-and-day difference, a proper silver bullet for my forum.

robots.txt

CODE:

User-agent: *
Disallow: /.htaccess

User-agent: AhrefsBot
Disallow: /

User-agent: Amzn-SearchBot
Disallow: /

User-agent: barkrowler
Disallow: /

User-agent: BLEXBot
Disallow: /

User-agent: dotbot
Disallow: /

User-agent: MJ12bot
Disallow: /

User-agent: Scrapy
Disallow: /

User-agent: SemrushBot
Disallow: /

User-agent: serpstatbot
Disallow: /

User-agent: trendictionbot
Disallow: /

User-agent: AI2Bot
Disallow: /

User-agent: Amazonbot
Disallow: /

User-agent: Applebot-Extended
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: ChatGPT
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Claude-SearchBot
Disallow: /

User-agent: Claude-Web
Disallow: /

User-agent: cohere-ai
Disallow: /

User-agent: cohere-training-data-crawler
Disallow: /

User-agent: Diffbot
Disallow: /

User-agent: DuckAssistBot
Disallow: /

User-agent: FacebookBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: Google-CloudVertexBot
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: ImagesiftBot
Disallow: /

User-agent: Kangaroo Bot
Disallow: /

User-agent: Meta-ExternalAds
Disallow: /

User-agent: Meta-ExternalAgent
Disallow: /

User-agent: Meta-ExternalFetcher
Disallow: /

User-agent: Meta-WebIndexer
Disallow: /

User-agent: Omgilibot
Disallow: /

User-agent: Omgili
Disallow: /

User-agent: PanguBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: Timpibot
Disallow: /

User-agent: Webzio-Extended
Disallow: /

User-agent: YouBot
Disallow: /
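If you want to sanity-check a robots.txt like the one above before deploying it, here is a minimal sketch using Python's standard `urllib.robotparser`. It is not part of my setup, just one way to test. The rules are abbreviated to three groups, and the browser user-agent string is a made-up example. Keep in mind robots.txt is only a request; bots that ignore it are what the .htaccess rules are for.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated copy of the rules above; the real file lists many more bots.
ROBOTS_TXT = """\
User-agent: *
Disallow: /.htaccess

User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /
"""

# Hypothetical browser user-agent string, for illustration only.
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Goanna/6.7 PaleMoon/33.0"


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A listed bot is asked to stay away from everything.
print(parser.can_fetch("GPTBot", "/viewtopic.php"))
# An ordinary browser falls through to the "*" group.
print(parser.can_fetch(BROWSER_UA, "/viewtopic.php"))
print(parser.can_fetch(BROWSER_UA, "/.htaccess"))
```

The first and third checks should come back `False` and the second `True`, assuming the standard library parser's substring matching on the agent name.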

.htaccess

CODE:

RewriteEngine On
RewriteBase /

# Block other unwanted/annoying/useless scrapers.
RewriteCond %{HTTP_USER_AGENT} AhrefsBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Amzn-SearchBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Barkrowler [NC,OR]
RewriteCond %{HTTP_USER_AGENT} BLEXBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} DotBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} MJ12bot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Scrapy [NC,OR]
RewriteCond %{HTTP_USER_AGENT} SemrushBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} serpstatbot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} trendictionbot [NC]
RewriteCond %{REQUEST_URI} !=/robots.txt
RewriteCond %{REQUEST_URI} !=/http_status_codes/403_forbidden.php
RewriteRule ^(.*) - [F]

# Block common scrapers that use our data to feed/train their AI tools.
# Ref: https://neil-clarke.com/block-the-bots-that-feed-ai-models-by-scraping-your-website
RewriteCond %{HTTP_USER_AGENT} AI2Bot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Amazonbot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Applebot-Extended [NC,OR]
RewriteCond %{HTTP_USER_AGENT} anthropic-ai [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Bytespider [NC,OR]
RewriteCond %{HTTP_USER_AGENT} CCBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ChatGPT [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ChatGPT-User [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ClaudeBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Claude-SearchBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Claude-Web [NC,OR]
RewriteCond %{HTTP_USER_AGENT} cohere-ai [NC,OR]
RewriteCond %{HTTP_USER_AGENT} cohere-training-data-crawler [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Diffbot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} DuckAssistBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} FacebookBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Google-Extended [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Google-CloudVertexBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} GPTBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ImagesiftBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "Kangaroo Bot" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Meta-ExternalAds [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Meta-ExternalAgent [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Meta-ExternalFetcher [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Meta-WebIndexer [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Omgilibot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Omgili [NC,OR]
RewriteCond %{HTTP_USER_AGENT} PanguBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} PerplexityBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Timpibot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Webzio-Extended [NC,OR]
RewriteCond %{HTTP_USER_AGENT} YouBot [NC]
RewriteCond %{REQUEST_URI} !=/robots.txt
RewriteCond %{REQUEST_URI} !=/http_status_codes/403_forbidden.php
RewriteRule ^(.*) - [F]
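For anyone who doesn't read mod_rewrite, here is a rough Python model of what the .htaccess rules above decide. It is only a sketch of the logic, not Apache itself: the agent list is abbreviated, `status_for` is a name I made up, and the user-agent strings in the usage lines are illustrative. The idea is that any listed name found anywhere in the User-Agent header (case-insensitive, because of `[NC]`) gets a 403, unless the request is for robots.txt or the 403 page.

```python
import re

# Abbreviated; the .htaccess rules above list many more user agents.
BLOCKED_AGENTS = ["AhrefsBot", "GPTBot", "ClaudeBot", "Kangaroo Bot"]

# Paths the rules exempt, so blocked bots can still read robots.txt
# and be served the custom 403 page.
EXEMPT_PATHS = {"/robots.txt", "/http_status_codes/403_forbidden.php"}

# The [OR]-chained RewriteCond patterns behave like one unanchored,
# case-insensitive alternation over the User-Agent header.
BLOCKED_RE = re.compile(
    "|".join(re.escape(agent) for agent in BLOCKED_AGENTS),
    re.IGNORECASE,
)


def status_for(user_agent: str, path: str) -> int:
    """Return 403 if the modelled rules would forbid the request, else 200."""
    if BLOCKED_RE.search(user_agent) and path not in EXEMPT_PATHS:
        return 403
    return 200


# Illustrative user-agent strings, not copied from any real log.
print(status_for("Mozilla/5.0 (compatible; GPTBot/1.0)", "/viewtopic.php"))
print(status_for("mozilla/5.0 (compatible; gptbot/1.0)", "/robots.txt"))
print(status_for("Mozilla/5.0 (X11; Linux x86_64) PaleMoon/33.0", "/viewtopic.php"))
```

Because the match is a substring match, a short entry also covers longer names that contain it, which is why `ChatGPT` already catches `ChatGPT-User` in the real list.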

If not using Apache, I'm sure I can help translate it to Nginx, or whatever this forum uses...

