Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreiez7p7jdkudcpsgee4uxwk2e6vxqzzga6dm3lr76p5w2hb5ydqejm",
    "uri": "at://did:plc:25rdn5elo5izoxrmtis34zuk/app.bsky.feed.post/3mornxlz4jxc2"
  },
  "coverImage": {
    "$type": "blob",
    "ref": {
      "$link": "bafkreifacpwtobu3eid2sl7woyokxcjqqzs6hoy3b5ss3sog6kocyoqjdu"
    },
    "mimeType": "image/webp",
    "size": 81386
  },
  "path": "/nitishyadav/how-ai-engines-actually-decide-what-to-cite-chatgpt-perplexity-gemini-ai-overviews-6bh",
  "publishedAt": "2026-06-21T05:19:39.000Z",
  "site": "https://dev.to",
  "tags": [
    "ai",
    "marketing",
    "seo",
    "webdev",
    "FixAEO",
    "llms.txt validator"
  ],
  "textContent": "Everyone keeps asking \"is SEO dead.\" Wrong question.\n\nAI search doesn't show ten blue links. It generates one answer and names a few brands. If you're not in that answer, you don't exist for that query. So the real question is: how do these engines decide who to name?\n\nI went down a rabbit hole on how four of them actually retrieve and cite sources. Here's what's true in 2026, with real numbers.\n\n## ChatGPT: being known beats ranking\n\nChatGPT answers in two modes. Default mode answers from trained-in memory, no live web. Search mode browses and attaches citations. The key fact: when it browses, it cites only about **15% of the pages it pulls** (AirOps study of 548k pages). And it names brands roughly 3x more often than it links them.\n\nSo two things get you in:\n\n  * **Entity strength.** If you're a consistent entity across Wikipedia, Wikidata, Reddit and press, ChatGPT names you from memory without browsing at all. Being a known entity beats ranking #1 anywhere.\n  * **Allow OAI-SearchBot** in robots.txt. It's separate from GPTBot (training). Block it and you vanish from ChatGPT Search. A lot of sites do this by accident.\n\n## Perplexity: it's mostly Reddit\n\nPerplexity does live retrieval and grounds every answer in sources. Its defining trait: it leans on community content hard. One 2025 study found **Reddit was its most-cited source, ~47% of top citations**. It also rewards answer-first pages, because its reranker scores for how cleanly it can extract a passage. A page can rank #1 on Google and never get cited here if the answer is buried.\n\n## Gemini: it's basically Google + the Knowledge Graph\n\nGemini is the only major assistant running on Google's own live index plus the Knowledge Graph. So classical SEO is the floor, not optional. The twist: ranking #1 isn't enough anymore. Only about **38% of Google's AI Overview citations come from the top 10 results, down from ~76% a year earlier.** It pulls from deeper now, via sub-queries.\n\n## Google AI Overviews: authority over freshness\n\nAI Overviews uses \"query fan-out\" - it splits your question into 8-12 sub-queries and pools the results. Most citations come from **below position #1** (roughly 63% from below the top 10). And counterintuitively, it has the **weakest freshness bias** of the major engines. Established, authoritative pages keep getting cited even without recent updates, which is the opposite of ChatGPT and Perplexity.\n\n## What this means if you're building something\n\n  * Lead every page with the answer in the first few lines. Most AI citations come from the top of the page.\n  * Be a real entity (Wikipedia, Wikidata, Crunchbase, consistent name everywhere).\n  * Let the AI crawlers in. Check robots.txt for OAI-SearchBot, PerplexityBot, Google-Extended.\n  * Show up off your own site - Reddit and YouTube get cited constantly.\n  * Track over time, not off one screenshot. These answers are non-deterministic; the same prompt gives different brands run to run.\n\nI got tired of checking this by hand, so I built FixAEO - a free tool to see how AI engines describe and recommend your brand across 8 engines, plus a free llms.txt validator. Sharing in case it saves you the manual prompting.\n\nWhat have you noticed about getting cited by AI? Curious if others are seeing the same patterns.",
  "title": "How AI engines actually decide what to cite (ChatGPT, Perplexity, Gemini, AI Overviews)"
}