{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreiezv42ut7em3rb23uz3ettaqy3ujzr6ovwud3q2s3muvjy3vvs7r4",
"uri": "at://did:plc:ogsypfnyf6uzkppsrqgdirne/app.bsky.feed.post/3m5zfl6byh3d2"
},
"path": "/_posts/2025-03-12-self-host-wikipedia/",
"publishedAt": "2026-04-30T11:46:20.364Z",
"site": "https://nickwasused.com",
"tags": [
"@redirect_to_wikipedia_de",
"@redirect_to_wikipedia_en"
],
"textContent": "This is my method to self-hosting Wikipedia, but it's **not** a full guide!\n\n## Kiwix\n\nFirst, we need a copy of the Wikipedia `.zim` file.\n\nThen install and setup kiwix.\n\nI use the following systemd-service based on https://ounapuu.ee/posts/2021/12/09/self-hosting-wikipedia/:\n\n\n [Unit]\n Description=Serve all the ZIM files loaded on this server\n\n [Service]\n Restart=always\n RestartSec=15\n User=kiwix\n ExecStart=/usr/bin/bash -c \"kiwix-serve -t 2 -i 127.0.0.1 --port=8080 /dir_with_zims/*.zim\"\n\n [Install]\n WantedBy=network-online.target\n\n\n## nginx\n\nAfter that, I set up nginx with the following configuration:\n\n\n server {\n root /var/www/html;\n\n index index.html index.htm index.nginx-debian.html;\n\n server_name YOUR_DOMAIN;\n\n location /robots.txt {\n root /var/www/html/;\n }\n\n location / {\n proxy_pass http://127.0.0.1:8080;\n proxy_http_version 1.1;\n proxy_set_header Host $host;\n proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;\n proxy_set_header X-Real-Ip $remote_addr;\n proxy_redirect off;\n proxy_intercept_errors on;\n error_page 502 503 504 = @redirect_to_wikipedia_de;\n error_page 404 = @redirect_to_wikipedia_en;\n }\n\n location @redirect_to_wikipedia_de {\n set $last_segment '';\n if ($request_uri ~ /([^/]+)$) {\n set $last_segment $1;\n }\n return 302 https://de.wikipedia.org/wiki/$last_segment ;\n }\n location @redirect_to_wikipedia_en {\n set $last_segment '';\n if ($request_uri ~ /([^/]+)$) {\n set $last_segment $1;\n }\n return 301 https://en.wikipedia.org/wiki/$last_segment ;\n }\n }\n\n\nThis config tries to get a valid response from our local copy, but when an entry is missing, it will redirect to wikipedia.org. 
When the local copy is offline or returns a 502, 503, or 504 error, nginx redirects to the German version at de.wikipedia.org instead.\n\n## Using our copy\n\n### Kagi\n\nWith Kagi you can set up a redirect with the following rule (a regex, then `|`, then the replacement): `^https://(.*).wikipedia.org/(.*)\\/(.*)|https://YOUR_DOMAIN/wikipedia/A/$3`\n\nNote that the path `/wikipedia` is derived from the `.zim` file name: for `/wikipedia`, the file must be named `wikipedia.zim`.\n\n### redirector\n\nYou can set up the redirector extension in the following way:\n\n## security\n\nI have blocked some cloud providers' IP ranges in a firewall.\n\n### scraping\n\nAdditionally, I serve a `robots.txt` for the bots that honor it. I use this one: https://robotstxt.com/ai",
"title": "Self-Host Wikipedia",
"updatedAt": "2025-03-12T00:00:00.000Z"
}