Raw Record Source

{
  "$type": "site.standard.document",
  "description": "The infuriating case of an incrementing GUID.",
  "path": "/2026-04-03-the-bbcs-rss-feed/",
  "publishedAt": "2026-04-03T08:51:45.421Z",
  "site": "at://did:plc:ex23caczr45rodrfcxrwps6h/site.standard.publication/self",
  "tags": [
    "bbc",
    "rss",
    "guid"
  ],
  "textContent": "Due to the incorrect way the BBC's RSS 2.0 feed handles guids, RSS readers are repeatedly left displaying duplicate articles.\n\nLet's have a look at why this happens with a sample article from their feed:\n\n<item>\n<title>\n<![CDATA[\n'We fell off the face of the earth': Dad-daughter duo who took on 7,500 miles for TV\n]]>\n</title>\n<description>\n<![CDATA[\nMolly Clifford and her father are part of this year's line up for the BBC's Race Across the World.\n]]>\n</description>\n<link>\nhttps://www.bbc.com/news/articles/c9951jrr18no?at_medium=RSS&at_campaign=rss\n</link>\n<guid isPermaLink=\"false\">https://www.bbc.com/news/articles/c9951jrr18no#3</guid>\n<pubDate>Fri, 03 Apr 2026 05:19:07 GMT</pubDate>\n<media:thumbnail width=\"240\" height=\"135\" url=\"https://ichef.bbci.co.uk/ace/standard/240/cpsprodpb/bb22/live/0bdf4fa0-2db9-11f1-934f-036468834728.jpg\"/>\n</item>\n\nSpecifically, let's focus on the guid:\n\n<guid isPermaLink=\"false\">https://www.bbc.com/news/articles/c9951jrr18no#3</guid>\n\nWhat I've seen the BBC doing is incrementing the suffix after the # and, as per the RSS 2.0 specification below, RSS readers tend to treat each incremented guid as a new entry:\n\nguid stands for globally unique identifier. It's a string that uniquely identifies the item. When present, an aggregator may choose to use this string to determine if an item is new.\n\nThe above article has been fetched by Gobbler twice and the title had changed between fetches:\n\nguid\ntitle\ncontent hash\n\nhttps://www.bbc.com/news/articles/c9951jrr18no#2\n'We fell off the face of the earth': Dad and daughter raced across world but had to keep it secret\na8159e96\n\nhttps://www.bbc.com/news/articles/c9951jrr18no#3\n'We fell off the face of the earth': Dad-daughter duo who took on 7,500 miles for TV\n17cbc6b7\n\nStrictly speaking, the RSS 2.0 specification doesn't prohibit a guid from changing. 
Additionally, there are no update semantics available (e.g., an updatedDate element) in the 2.0 specification. So, in this scenario with a change of title, an incremented guid is almost justifiable.\n\nHowever, this isn't always the case. Let's look at a different example in the Gobbler database:\n\nguid\ntitle\ncontent hash\n\nhttps://www.bbc.com/news/articles/cyv1q9gz39do#0\nHow English-only condolences undid one of Canada's top CEOs\n8845f9d6\n\nhttps://www.bbc.com/news/articles/cyv1q9gz39do#1\nHow English-only condolences undid one of Canada's top CEOs\n8845f9d6\n\nhttps://www.bbc.com/news/articles/cyv1q9gz39do#3\nHow English-only condolences undid one of Canada's top CEOs\n8845f9d6\n\nGobbler has fetched this article three times. The article hasn't changed at all: same title, same content, and same published date [1], all validated by the content_hash. This is simply not justifiable. There is no reason to change the guid if the article hasn't changed.\n\nWhat could the BBC do differently?\n\nFirst, don't change the guid when the article content hasn't changed. Just don't.\n\nSecond, if the article has been updated, use <atom:updated> in the <item>. The feed declares the Atom namespace and already uses it:\n\n<atom:link href=\"https://feeds.bbci.co.uk/news/uk/rss.xml\" rel=\"self\" type=\"application/rss+xml\"/>\n\nLastly, and this is a bit of a stretch goal, put the full content of each article in the feed instead of a summary.\n\nFootnotes\n\n[1] I couldn't fit everything in the table.",
  "title": "The BBC's RSS Feed"
}
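
Reader-side workaround (sketch)

The record above describes guids of the form <article URL>#<n>, where only the suffix changes between fetches. One way a feed reader might cope, offered as a minimal sketch rather than as how Gobbler actually works, is to key items on the guid with the fragment removed and use a content hash to tell a real update from a plain duplicate. The function names canonical_guid and classify and the in-memory dict are hypothetical, chosen only for illustration. The guids and hashes in the usage lines are the ones quoted in the record.

```python
from urllib.parse import urldefrag


def canonical_guid(guid: str) -> str:
    """Drop the '#n' suffix so every revision of an article shares one key."""
    # urldefrag splits a URL into (url without fragment, fragment).
    return urldefrag(guid).url


def classify(seen: dict[str, str], guid: str, content_hash: str) -> str:
    """Return 'new', 'updated', or 'duplicate', and remember the latest hash.

    `seen` maps canonical guid -> last content hash (hypothetical store;
    a real reader would presumably use its database instead).
    """
    key = canonical_guid(guid)
    previous = seen.get(key)
    seen[key] = content_hash
    if previous is None:
        return "new"
    return "duplicate" if previous == content_hash else "updated"


if __name__ == "__main__":
    seen: dict[str, str] = {}
    base = "https://www.bbc.com/news/articles/"
    print(classify(seen, base + "cyv1q9gz39do#0", "8845f9d6"))
    print(classify(seen, base + "cyv1q9gz39do#1", "8845f9d6"))
    print(classify(seen, base + "c9951jrr18no#2", "a8159e96"))
    print(classify(seen, base + "c9951jrr18no#3", "17cbc6b7"))
```

This assumes the part of the guid before the # is stable for a given article, which holds for the two examples in the record but is not something the feed guarantees.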