Open-source tool for analyzing your social media data (want to help me make it better?)

Hugging Face Forums [Unofficial] March 3, 2026
I classified 2,500 posts from Bluesky’s 10 most-followed accounts using an open-source LLM pipeline I built called cat-vader. The classified dataset is now public on my HF profile.

cat-vader is a fork of cat-llm, a package I originally built for classifying open-ended survey responses in academic research. It supports multi-label classification, automatic category discovery, and direct Threads/Bluesky API integration.

Some findings from the analysis:

1. Account identity explains ~62% of engagement variance
2. Political and social content outperforms within any given account
3. Economy posts appear to tank engagement, but the effect disappears once you control for who’s posting

Full writeup: What Bluesky’s Most-Followed Accounts Actually Post About - Chris Soria

GitHub: chrissoria/catvader
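For readers wondering how a topic effect can vanish once you control for the poster (finding 3), here is a toy sketch. It does not use cat-vader or the published dataset: the eight posts are invented, the function names are hypothetical, and subtracting each account's mean is only a rough stand-in for whatever model the full writeup uses. It also shows one common way to measure the share of variance tied to account identity (eta-squared), which may or may not be how the ~62% figure was computed.

```python
from collections import defaultdict
from statistics import mean

# Invented toy posts: (account, topic, engagement score). Not real data.
posts = [
    ("big", "politics", 8.0), ("big", "politics", 9.0),
    ("big", "politics", 10.0), ("big", "economy", 9.0),
    ("small", "politics", 3.0), ("small", "economy", 2.0),
    ("small", "economy", 3.0), ("small", "economy", 4.0),
]


def account_means(rows):
    """Mean engagement per account."""
    by_account = defaultdict(list)
    for account, _, score in rows:
        by_account[account].append(score)
    return {account: mean(scores) for account, scores in by_account.items()}


def pooled_gap(rows, topic):
    """Mean score of `topic` posts minus mean score of all other posts."""
    on_topic = [s for _, t, s in rows if t == topic]
    off_topic = [s for _, t, s in rows if t != topic]
    return mean(on_topic) - mean(off_topic)


def within_account_gap(rows, topic):
    """Same gap, after subtracting each account's own mean from its posts."""
    means = account_means(rows)
    centered = [(a, t, s - means[a]) for a, t, s in rows]
    return pooled_gap(centered, topic)


def variance_share_by_account(rows):
    """Eta-squared: between-account sum of squares over total sum of squares."""
    means = account_means(rows)
    grand = mean(s for _, _, s in rows)
    total_ss = sum((s - grand) ** 2 for _, _, s in rows)
    between_ss = sum((means[a] - grand) ** 2 for a, _, _ in rows)
    return between_ss / total_ss


print("pooled economy gap:        ", pooled_gap(posts, "economy"))
print("within-account economy gap:", within_account_gap(posts, "economy"))
print("variance share by account: ", round(variance_share_by_account(posts), 3))
```

In this toy, economy posts look bad in the pooled comparison only because the low-engagement account posts about the economy more often; once each account is compared against its own baseline, the gap is zero.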