{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreihgzc4hhjzqlsj6q4q7wsr2xkfswcyovcu7s3k3kehcbb2ml4rjsq",
    "uri": "at://did:plc:ogsypfnyf6uzkppsrqgdirne/app.bsky.feed.post/3m5zflivfa762"
  },
  "path": "/_posts/2024-06-14-discord-archiving/",
  "publishedAt": "2026-06-10T10:31:02.863Z",
  "site": "https://nickwasused.com",
  "tags": [
    "DiscordChatExporter",
    "DiscordChatExporter-frontend",
    "server",
    "chat-analytics",
    "here",
    "format",
    "WSL",
    "Command-Line Interface",
    "./archive.sh",
    "package.json",
    "Discord Chat Exporter",
    "Discord token"
  ],
  "textContent": "(Note: A `guild` is a Discord server.)\n\n# Closed system\n\nDiscord is a social network that requires an account to use; additionally, you need an invitation to a server to see its messages, and what you can see depends on a permission system. This means no one can request all messages on Discord. In contrast, on a forum, I could easily download a copy of all posts or make snapshots of pages with the Wayback Machine.\n\nI cannot do this on Discord, and this is a problem!\n\nLet's say we have an admin called `steve` with a community called \"I Love USB.\" Here are two scenarios:\n\nSteve doesn't like his community anymore:\n\n  * He removes his hosted phpBB instance: There is a good chance that a dump is on archive.org, and if not, there would still be pages saved on the Wayback Machine.\n  * He deletes the Discord server: There would be no copy on the Wayback Machine, as it cannot be saved by the crawler. Great, now nothing exists, except maybe a few screenshots of the server. Or not?\n\n\n\n# Discord Chat Exporter\n\nThe DiscordChatExporter can export whole Discord servers, including threads, as `json`. These files can then be uploaded to archive.org. With the DiscordChatExporter-frontend I can then view the server, like within the Discord client, but loading messages takes a little longer.\n\nExported messages are split into chunks of 10000 messages so that if an export fails at a large channel, I can continue from that point. But usually, I try to make one continuous export.\n\nIn this example, a server with 1,601,026 messages is shown.\n\nOne thing I do not do is export media. Exporting all messages already takes a while. In the case of the 1,601,026 messages, it took around 13 hours. The worst offenders in this case are three types of channels: welcome, goodbye and counting. These channels aren't really worth archiving, in my opinion, but I still archive them. 
(My personal most hated channel type is the counting channel!)\n\n## Self-bot\n\nThe exporter uses my Discord account token. That means I use my account as a self-bot, but this is currently not a problem. I exported multiple large servers in a short time span and have not been banned so far. If I get banned, I will update the post!\n\n# Chat Analytics and Channels\n\nBefore uploading, I run all files through chat-analytics to generate `report.html`; an example is linked here. Additionally, all channels are exported to `channels.txt`; this includes every channel, even the ones you cannot access.\n\n# archive.org\n\nWhen archiving, I had to come up with a way to identify all the items that I and others uploaded. First, I noticed that the tag `DiscordChatExporter` was already used a lot, so I used it too. Now that everyone can simply find the items by the tag, I had to find a good way to generate the name, identifier and description.\n\nFor the identifier, I use the following format: `discord-[guild-id]-[date '+%d%m%Y']`. An example would be `discord-371265202378899476-12062024`. The limitation of this format is that only one export per server is possible each day, but that's more than enough for most servers.\n\nThe description has the following format:\n\n\n    This contains the Discord messages in JSON format split into chunks the size of 10000 messages.\n    channels.txt contains all channels present in the discord at the given date.\n    Analytics from https://chatanalytics.app are stored in result.html.\n    Messages up to [Current Date], are saved.\n\n\n# Putting things together\n\nDoing the archiving steps manually is possible, but why would I do that? I use Linux (Ubuntu 22) on WSL, with Discord Chat Exporter, Chat Analytics, the archive.org Command-Line Interface and a custom-written script.\n\nThe script, named `archive.sh`, has two parameters: the guild ID and the language in the three-letter format, e.g., `eng`, `ger`. 
First, the script checks whether the identifier is available; then it gets the server name with the Discord Chat Exporter and generates the identifier and description. After that, all channels are saved to the `channels.txt` file. Then the export begins with the following options: `exportguild -f Json --markdown false -p 10000 --include-threads All`\n\nThis exports the given server as JSON, leaves markdown unprocessed, splits the output every 10000 messages, and includes all threads.\n\nAnd this is how I archive Discord servers.\n\n# Downsides\n\nThe major downside of this approach is that there could be a lot of duplicate data when servers are saved often, as you would export the same messages again, just with the new ones at the end. In the case of the example server, that means another 13 hours plus the time for the new messages. Incremental updates with an identifier format like `discord-[guild-id]` would be great, but that would require a per-channel export and custom code. Object metadata, like the `updated` key, could then be updated afterward. A note that the object is dynamic could then be added to the description of each object.\n\nAnother downside is that people think their messages are \"private,\" as they are not accessible on the internet. This is why I only archive \"public\" servers, such as those of open-source projects or, let's say, Twitch streamers.\n\n# Usage\n\nThe script is available at: ./archive.sh. You require:\n\n  * Linux (WSL is also ok.)\n  * Node.js with npm and npx\n  * the archive.org Command-Line Interface\n  * This package.json\n  * Discord Chat Exporter\n\n\n\nThe folder structure should look like this:\n\n\n    /any-folder\n        /archive.sh\n        /ia - archive.org Command-Line Interface\n        /package.json\n        /cli\n            /extract discord chat exporter here\n\n\nThen install chat-analytics by running: `npm install`. 
The last step is to replace `your_token` with your Discord token.\n\nYou can now archive a server like this:\n\n\n    ./archive.sh 485079746158526464 eng\n\n\n### tags\n\n#discord #archive.org #archiving",
  "title": "Archiving Discord Servers",
  "updatedAt": "2024-06-15T00:00:00.000Z"
}