{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreiajyra72wurhdtf4o6x2xmq53yzta7cwx4bio5kcrgm7wz26ldyay",
"uri": "at://did:plc:4tuge3k3comfj4nfvqnwkemn/app.bsky.feed.post/3mku3vgqlqg42"
},
"path": "/user/Kamil%20Kalata/diary/408605",
"publishedAt": "2026-05-01T12:14:29.000Z",
"site": "https://www.openstreetmap.org",
"tags": [
"Taginfo",
"the Wiki",
"planet.osm.org",
"the API",
"the recent changes registry",
"“any tags you like” rule"
],
"textContent": "While browsing Taginfo I got curious how many elements have at least one key described on the Wiki and how big share of all keys the described ones make up. Therefore, I decided to check it out.\n\nThe analysis consisted of the following parts:\n\n 1. fetching OSM database dump from planet.osm.org;\n 2. fetching key statistics from Taginfo with the API;\n 3. extracting “is in Wiki” info into separate file;\n 4. altering “is in Wiki” info for keys which were described on the Wiki after the database was dumped. The alteration was based on the recent changes registry;\n 5. processing the dump with DuckDB:\n * extracting element type, its ID, and its tags to new table: `CREATE TABLE elements AS SELECT kind, id, tags FROM ST_READOSM('planet-latest.osm.pbf');`;\n * exploding keys to separate records: `CREATE TABLE elements_keys AS SELECT kind, id, UNNEST(map_keys(tags)) FROM elements;`;\n 6. querying the database.\n\n\n\nThese are queries I provided to DuckDB:\n\nResult | Query\n---|---\nnumber of all elements | `SELECT COUNT(*) FROM elements;`\nnumber of tagged elements | `SELECT COUNT(*) FROM elements WHERE tags IS NOT NULL;`\nnumber of elements with key(s) described on the Wiki | `SELECT COUNT(*) FROM (SELECT DISTINCT kind, id FROM elements_keys WHERE \"key\" IN (SELECT \"key\" FROM 'keys_wiki.csv' WHERE in_wiki));`\nnumber of all keys | `SELECT COUNT(*) FROM (SELECT DISTINCT \"key\" FROM elements_keys);`\nnumber of keys described on the Wiki | `SELECT COUNT(*) from (SELECT DISTINCT \"key\" FROM elements_keys WHERE \"key\" IN (SELECT \"key\" FROM 'keys_wiki.csv' WHERE in_wiki));`\n\nI got the following results:\n\n * all elements: 11,759,061,283\n * of which are tagged: 1,459,875,801 (12.415% of all elements)\n * of which have a key described on the Wiki: 1,459,659,709 (**99.985% of all tagged elements**)\n * all keys: 109,269\n * of which are described on the Wiki: 5,865 (**5.367% of all keys**)\n\n\n\nThe analysis provided the following 
conclusions:\n\n * a small number of Wiki-described keys covers nearly all tagged elements, which follows the Pareto principle;\n * the “any tags you like” rule poses no significant threat to tagging consistency, since there is at least one known way to get info about virtually every element;\n * there is always room for improvement in describing keys on the Wiki, especially those that are more precise versions of already described ones.\n\n\n\nThe results are valid as of 20th April 2026, 12:00 AM, when the OSM database was dumped.",
"title": "Analysis of amount of elements with keys described on the Wiki"
}