{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreiajyra72wurhdtf4o6x2xmq53yzta7cwx4bio5kcrgm7wz26ldyay",
    "uri": "at://did:plc:4tuge3k3comfj4nfvqnwkemn/app.bsky.feed.post/3mku3vgqlqg42"
  },
  "path": "/user/Kamil%20Kalata/diary/408605",
  "publishedAt": "2026-05-01T12:14:29.000Z",
  "site": "https://www.openstreetmap.org",
  "tags": [
    "Taginfo",
    "the Wiki",
    "planet.osm.org",
    "the API",
    "the recent changes registry",
    "“any tags you like” rule"
  ],
  "textContent": "While browsing Taginfo I got curious how many elements have at least one key described on the Wiki, and how large a share of all keys the described ones make up. So I decided to check.\n\nThe analysis consisted of the following parts:\n\n  1. fetching the OSM database dump from planet.osm.org;\n  2. fetching key statistics from Taginfo through the API;\n  3. extracting the “is in Wiki” info into a separate file;\n  4. adjusting the “is in Wiki” info for keys that were described on the Wiki after the database was dumped. The adjustment was based on the recent changes registry;\n  5. processing the dump with DuckDB:\n     * extracting each element's type, ID, and tags into a new table: `CREATE TABLE elements AS SELECT kind, id, tags FROM ST_READOSM('planet-latest.osm.pbf');`;\n     * exploding the keys into separate records: `CREATE TABLE elements_keys AS SELECT kind, id, UNNEST(map_keys(tags)) AS \"key\" FROM elements;`;\n  6. querying the database.\n\n\n\nThese are the queries I ran in DuckDB:\n\nResult | Query\n---|---\nnumber of all elements | `SELECT COUNT(*) FROM elements;`\nnumber of tagged elements | `SELECT COUNT(*) FROM elements WHERE tags IS NOT NULL;`\nnumber of elements with key(s) described on the Wiki | `SELECT COUNT(*) FROM (SELECT DISTINCT kind, id FROM elements_keys WHERE \"key\" IN (SELECT \"key\" FROM 'keys_wiki.csv' WHERE in_wiki));`\nnumber of all keys | `SELECT COUNT(*) FROM (SELECT DISTINCT \"key\" FROM elements_keys);`\nnumber of keys described on the Wiki | `SELECT COUNT(*) FROM (SELECT DISTINCT \"key\" FROM elements_keys WHERE \"key\" IN (SELECT \"key\" FROM 'keys_wiki.csv' WHERE in_wiki));`\n\nI got the following results:\n\n  * all elements: 11,759,061,283\n    * of which tagged: 1,459,875,801 (12.415% of all elements)\n      * of which have a key described on the Wiki: 1,459,659,709 (**99.985% of all tagged elements**)\n  * all keys: 109,269\n    * of which described on the Wiki: 5,865 (**5.367% of all keys**)\n\n\n\nThe analysis led to the following conclusions:\n\n  * a small number of Wiki-described keys cover nearly all tagged elements, in line with the Pareto principle;\n  * the “any tags you like” rule poses no significant threat to tagging consistency, since there is at least one documented way to get info about virtually every element;\n  * there is always room for improvement in describing keys on the Wiki, especially keys that are more precise versions of already described ones.\n\n\n\nThe results are valid as of 20th April 2026, 12:00 AM, when the OSM database was dumped.",
  "title": "Analysis of the number of elements with keys described on the Wiki"
}