{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreiadern6b2f6ylfludhzs24r6jddzbdghdiwnzxxwmzrj7wlhcwzpa",
"uri": "at://did:plc:hqad6xwuzg7oqfmwylfkvqfm/app.bsky.feed.post/3mlyun5q2adm2"
},
"path": "/viewtopic.php?t=32045&p=274156#p274156",
"publishedAt": "2026-05-16T22:11:19.000Z",
"site": "http://forum.palemoon.org",
"textContent": "> I want to know what insights you have in a way to achieve that. Any concrete ideas.\n\nCopy what I've done. I'll send updates if/when I make them as needed. So far this has completely solved the problem, and rarely needs updating. It was a night and day difference. A proper silver bullet for my forum.\n\n_robots.txt_\n\nCODE:\n\n\n User-agent: *Disallow: /.htaccessUser-agent: AhrefsBotDisallow: /User-agent: Amzn-SearchBotDisallow: /User-agent: barkrowlerDisallow: /User-agent: BLEXBotDisallow: /User-agent: dotbotDisallow: /User-agent: MJ12botDisallow: /User-agent: ScrapyDisallow: /User-agent: SemrushBotDisallow: /User-agent: serpstatbotDisallow: /User-agent: trendictionbotDisallow: /User-agent: AI2BotDisallow: /User-agent: AmazonbotDisallow: /User-agent: Applebot-ExtendedDisallow: /User-agent: anthropic-aiDisallow: /User-agent: BytespiderDisallow: /User-agent: CCBotDisallow: /User-agent: ChatGPTDisallow: /User-agent: ChatGPT-UserDisallow: /User-agent: ClaudeBotDisallow: /User-agent: Claude-SearchBotDisallow: /User-agent: Claude-WebDisallow: /User-agent: cohere-aiDisallow: /User-agent: cohere-training-data-crawlerDisallow: /User-agent: DiffbotDisallow: /User-agent: DuckAssistBotDisallow: /User-agent: FacebookBotDisallow: /User-agent: Google-ExtendedDisallow: /User-agent: Google-CloudVertexBotDisallow: /User-agent: GPTBotDisallow: /User-agent: ImagesiftBotDisallow: /User-agent: Kangaroo BotDisallow: /User-agent: Meta-ExternalAdsDisallow: /User-agent: Meta-ExternalAgentDisallow: /User-agent: Meta-ExternalFetcherDisallow: /User-agent: Meta-WebIndexerDisallow: /User-agent: OmgilibotDisallow: /User-agent: OmgiliDisallow: /User-agent: PanguBotDisallow: /User-agent: PerplexityBotDisallow: /User-agent: TimpibotDisallow: /User-agent: Webzio-ExtendedDisallow: /User-agent: YouBotDisallow: /\n\n\n\n_.htaccess_\n\nCODE:\n\n\n RewriteEngine OnRewriteBase /# Block other unwanted/annoying/useless scrapers.RewriteCond %{HTTP_USER_AGENT} AhrefsBot [NC,OR]RewriteCond %{HTTP_USER_AGENT} Amzn-SearchBot [NC,OR]RewriteCond %{HTTP_USER_AGENT} Barkrowler [NC,OR]RewriteCond %{HTTP_USER_AGENT} BLEXBot [NC,OR]RewriteCond %{HTTP_USER_AGENT} DotBot [NC,OR]RewriteCond %{HTTP_USER_AGENT} MJ12bot [NC,OR]RewriteCond %{HTTP_USER_AGENT} Scrapy [NC,OR]RewriteCond %{HTTP_USER_AGENT} SemrushBot [NC,OR]RewriteCond %{HTTP_USER_AGENT} serpstatbot [NC,OR]RewriteCond %{HTTP_USER_AGENT} trendictionbot [NC]RewriteCond %{REQUEST_URI} !=/robots.txtRewriteCond %{REQUEST_URI} !=/http_status_codes/403_forbidden.phpRewriteRule ^(.*) - [F]# Block common scrapers that use our data to feed/train their AI tools.# Ref: https://neil-clarke.com/block-the-bots-that-feed-ai-models-by-scraping-your-websiteRewriteCond %{HTTP_USER_AGENT} AI2Bot [NC,OR]RewriteCond %{HTTP_USER_AGENT} Amazonbot [NC,OR]RewriteCond %{HTTP_USER_AGENT} Applebot-Extended [NC,OR]RewriteCond %{HTTP_USER_AGENT} anthropic-ai [NC,OR]RewriteCond %{HTTP_USER_AGENT} Bytespider [NC,OR]RewriteCond %{HTTP_USER_AGENT} CCBot [NC,OR]RewriteCond %{HTTP_USER_AGENT} ChatGPT [NC,OR]RewriteCond %{HTTP_USER_AGENT} ChatGPT-User [NC,OR]RewriteCond %{HTTP_USER_AGENT} ClaudeBot [NC,OR]RewriteCond %{HTTP_USER_AGENT} Claude-SearchBot [NC,OR]RewriteCond %{HTTP_USER_AGENT} Claude-Web [NC,OR]RewriteCond %{HTTP_USER_AGENT} cohere-ai [NC,OR]RewriteCond %{HTTP_USER_AGENT} cohere-training-data-crawler [NC,OR]RewriteCond %{HTTP_USER_AGENT} Diffbot [NC,OR]RewriteCond %{HTTP_USER_AGENT} DuckAssistBot [NC,OR]RewriteCond %{HTTP_USER_AGENT} FacebookBot [NC,OR]RewriteCond %{HTTP_USER_AGENT} Google-Extended [NC,OR]RewriteCond %{HTTP_USER_AGENT} Google-CloudVertexBot [NC,OR]RewriteCond %{HTTP_USER_AGENT} GPTBot [NC,OR]RewriteCond %{HTTP_USER_AGENT} ImagesiftBot [NC,OR]RewriteCond %{HTTP_USER_AGENT} \"Kangaroo Bot\" [NC,OR]RewriteCond %{HTTP_USER_AGENT} Meta-ExternalAds [NC,OR]RewriteCond %{HTTP_USER_AGENT} Meta-ExternalAgent [NC,OR]RewriteCond %{HTTP_USER_AGENT} Meta-ExternalFetcher [NC,OR]RewriteCond %{HTTP_USER_AGENT} Meta-WebIndexer [NC,OR]RewriteCond %{HTTP_USER_AGENT} Omgilibot [NC,OR]RewriteCond %{HTTP_USER_AGENT} Omgili [NC,OR]RewriteCond %{HTTP_USER_AGENT} PanguBot [NC,OR]RewriteCond %{HTTP_USER_AGENT} PerplexityBot [NC,OR]RewriteCond %{HTTP_USER_AGENT} Timpibot [NC,OR]RewriteCond %{HTTP_USER_AGENT} Webzio-Extended [NC,OR]RewriteCond %{HTTP_USER_AGENT} YouBot [NC]RewriteCond %{REQUEST_URI} !=/robots.txtRewriteCond %{REQUEST_URI} !=/http_status_codes/403_forbidden.phpRewriteRule ^(.*) - [F]\n\nIf not using Apache, I'm sure I can help translate it to Nginx, or whatever this forum uses...\n\n* * *",
"title": "Browser Support • Re: Cloudflare Verification Loop issues",
"updatedAt": "2026-05-16T22:11:19.000Z"
}