Rogue Scholar has now archived 50,000 science blog posts
Last week the science blog archive Rogue Scholar reached an important milestone: 50,000 archived science blog posts, with searchable full-text, rich metadata, and DOIs.
Rogue Scholar began archiving science blog posts in April 2023, added archiving of all content with the Internet Archive Archive-It service in October 2023, established an Advisory Board in February 2024, migrated to the InvenioRDM repository platform in September 2024, launched a citation tracking service in August 2025, and began classifying all blog posts with the OpenAlex subject classification in January 2026.
These science blog posts come from 192 participating blogs and carry rich metadata (ORCID and ROR identifiers, references, etc.). About 180 new posts are currently added every month.
Rogue Scholar is open to scholarly content from all subject areas, but the archived content is not yet a representative sample of science blogging. A future analysis will look into this in more detail; for now I will look only at the publication dates of archived posts. The years 2006-2009 account for the most archived posts, with 4,515 posts published in 2008.
The largest archived science blog is Open Access News by Peter Suber and colleagues, with 16,463 posts published between 2003 and 2010. Eighty-four percent of blog posts are written in English; other popular languages are German, Spanish, and French.
The Rogue Scholar infrastructure runs on open source software, mainly the InvenioRDM repository software with lots of customizations, and an open source Python library to parse blog posts into a format that can be archived. Rogue Scholar is hosted by the German cloud provider Hetzner, and the operation cost is about 4000 € per year, with the main costs besides the data center being Crossref DOI registration and archiving with the Internet Archive Archive-It service – and of course endless hours of unpaid work.
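To put the operating cost in perspective, here is a rough back-of-the-envelope sketch in Python using only the figures quoted in this post. The 50,000 post count is the rounded milestone rather than an exact total, the function name `cost_breakdown` is made up for illustration, and the "per new post" figure crudely attributes the whole yearly cost to new posts, so treat the results as orders of magnitude only.

```python
# Figures quoted in the post; the post count is the rounded milestone,
# not an exact total, so results are approximate.
ANNUAL_COST_EUR = 4000
ARCHIVED_POSTS = 50_000
PARTICIPATING_BLOGS = 192
NEW_POSTS_PER_MONTH = 180


def cost_breakdown(annual_cost, posts, blogs, new_posts_per_month):
    """Return rough yearly operating cost per unit, in euros.

    Hypothetical helper for illustration; unpaid work is not included.
    """
    return {
        "per_archived_post": annual_cost / posts,
        "per_blog": annual_cost / blogs,
        # Crude: attributes the full yearly cost to the posts added that year.
        "per_new_post": annual_cost / (new_posts_per_month * 12),
    }


breakdown = cost_breakdown(
    ANNUAL_COST_EUR, ARCHIVED_POSTS, PARTICIPATING_BLOGS, NEW_POSTS_PER_MONTH
)
for name, value in breakdown.items():
    print(f"{name}: {value:.2f} EUR per year")
```

Under these assumptions the cost comes to roughly 0.08 € per archived post and about 21 € per participating blog per year, which gives a sense of the scale that membership fees would need to cover.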
Going forward I see these main areas of work:
- Complete the transition to the InvenioRDM repository platform. Crossref DOI registration will be part of InvenioRDM v14 released in July as a major contribution from Front Matter, and the parsing of blog posts will be migrated to a new Invenio module by the end of 2026.
- Launch the Rogue Scholar non-profit membership organization. This was first announced in November 2025, but there has not been enough progress so far in 2026. Membership will help recover the operating costs and is essential for Rogue Scholar governance.
- Continue building out the Rogue Scholar community. The Slack forum is currently the main venue for this community, but the community might migrate to an open source platform such as Zulip going forward.
Reaching this important milestone is reason to celebrate. I look forward to a bright future for Rogue Scholar.
Please reach out with questions or comments via Slack, email, Mastodon, or Bluesky.
Rogue Scholar is a scholarly infrastructure that is free for all authors and readers. You can support Rogue Scholar with a one-time or recurring donation or by becoming a sponsor.
References
- Fenner, M. (2023, April 4). The Rogue Scholar is now open for business. Front Matter. https://doi.org/10.53731/z9v2s-bh329
- Fenner, M. (2023, October 30). Starting November, all Rogue Scholar blog posts will be archived by the Internet Archive. Front Matter. https://doi.org/10.53731/hhtx0-wb293
- Fenner, M. (2024, February 8). Introducing the Rogue Scholar Advisory Board. Front Matter. https://doi.org/10.53731/9yf86-p8541
- Fenner, M. (2024, September 2). Rogue Scholar migrates to InvenioRDM. Front Matter. https://doi.org/10.53731/sdazp-kzn55
- Fenner, M. (2025, August 4). Rogue Scholar citation tracking launches to production. Front Matter. https://doi.org/10.53731/zyg15-qv911
- Fenner, M. (2026, January 29). Rogue Scholar is improving subject classification (Version 3). Front Matter. https://doi.org/10.53731/76vm1-yme44
- Fenner, M. (2025, November 3). Rogue Scholar is becoming a German Non-Profit Organization. Front Matter. https://doi.org/10.53731/rftfk-qv692