Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreigg3wev6ebwnbdewigvptbvehtx7xtkwm7intdkoe2abhowdky6za",
    "uri": "at://did:plc:jo3wjj2gx46alocis4wubmwr/app.bsky.feed.post/3mgoocz5le252"
  },
  "path": "/wiki/Wikipedia:Wikipedia_Signpost/2026-03-10/Recent_research",
  "publishedAt": "2026-03-10T00:00:00.000Z",
  "site": "https://en.wikipedia.org",
  "tags": [
    "PDF download",
    "Mastodon",
    "LinkedIn",
    "Facebook",
    "X (Twitter)",
    "Bluesky",
    "Reddit",
    "Wikimedia Research Newsletter",
    "Is Grokipedia Right-Leaning? Comparing Political Framing in Wikipedia and Grokipedia on Controversial Topics",
    "this edit",
    "call for Lightning Talk proposals",
    "research track",
    "is open",
    "Wiki Workshop",
    "announced",
    "data dumps",
    "proposal",
    "wikimedia-research",
    "IRC",
    "page of the monthly **Wikimedia Research Showcase**",
    "are always welcome",
    "a thread",
    "'As many as 5%' of new English Wikipedia articles 'contain significant AI-generated content'",
    "_Is Grokipedia Right-Leaning? Comparing Political Framing in Wikipedia and Grokipedia on Controversial Topics_",
    "10.48550/arXiv.2601.15484",
    "_Wikipedia and Grokipedia: A Comparison of Human and Generative Encyclopedias_",
    "10.48550/arXiv.2602.05519",
    "_How AI Reshapes Human Content Creation: The Case of Wikipedia_",
    "10.2139/ssrn.5853062",
    "\"How latent and prompting biases in AI-generated historical narratives influence opinions\"",
    "10.1093/pnasnexus/pgag022",
    "41783460",
    "_WikIPedia: Unearthing a 20-Year History of IPv6 Client Addressing_",
    "10.48550/arXiv.2512.08808",
    "\"Knowledge, neo-liberalism and mediatization: The crystal of Wikipedia\"",
    "10.1386/ejpc_00066_1",
    "\"The negotiation of pronominal address on talk pages of the German, French, and Italian Wikipedia\"",
    "\"Mass collaboration or curatorship? The functioning of Wikipedia needs both\"",
    "10.1108/OIR-10-2023-0515",
    "1468-4527",
    "\"WETBench: A Benchmark for Detecting Task-Specific Machine-Generated Text on Wikipedia\"",
    "10.18653/v1/2025.wikinlp-1.6",
    "\"Generative AI and Wikipedia editing: What we learned in 2025\"",
    "+ Add a comment",
    "add the page to your watchlist",
    "purging the cache",
    "leave a suggestion",
    "Suggestions"
  ],
  "textContent": "To wiki, perchance to groki: Comparisons continue.\n\n\n\n← Back to Contents\n\nView Latest Issue\n\n10 March 2026\n\n\n\n\n\nFile:Books at Kabubbu Community Library, Kabubbu, Wakiso.jpg\n\nAKibombo\n\nCC-BY-SA 4.0\n\n500\n\nRecent research\n\n## To wiki, perchance to groki\n\nContribute —\n\nShare this\n\n  * PDF download\n  * E-mail\n  * Mastodon\n  * LinkedIn\n  * Facebook\n  * X (Twitter)\n  * Bluesky\n  * Reddit\n\n\n\nBy Tilman Bayer and Mitchsavl\n\n \n\n\n\n\n\n\nA monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.\n\nGrokipedia, the AI based online encyclopedia launched XAI in October 2025 to counter perceived bias on Wikipedia, continues to attract researchers' attention (see also our previous coverage: \"Comparing comparisons of Grokipedia vs. Wikipedia by three different research teams\").\n\n###  Comparing Political Framing in Wikipedia and Grokipedia\n\n    _Reviewed byMitchsavl_\n\nA recent paper titled \"Is Grokipedia Right-Leaning? Comparing Political Framing in Wikipedia and Grokipedia on Controversial Topics\"[1] provides a comparative analysis on \"semantic framing, political orientation, and content prioritization\". The study concluded that both encyclopedias were generally left-wing, with Grokipedia showing a small right-wing bias on contentious topics. They also found that later sections within articles had greater differences than the lead.\n\n> ...these findings challenge the widespread perception of Grokipedia as an _extreme_ right-leaning encyclopedia, instead suggesting broadly comparable tendencies between the two platforms in their treatment of politically controversial topics, while still indicating a modest but consistent right-leaning bias in Grokipedia relative to Wikipedia.\n>  — Is Grokipedia Right-Leaning? 
Comparing Political Framing in Wikipedia and Grokipedia on Controversial Topics\n\nThe study selected six controversial topics, which were the most divisive in polling data from Gallup: abortion, cannabis legalization, climate change, gender identity, gun control, and immigration. Across all these topics, Grokipedia was found to be shifted towards the right compared to Wikipedia, with cannabis legalization and gun control averaging a right-wing bias.\n\n#### \"Wikipedia and Grokipedia: A Comparison of Human and Generative Encyclopedias\"\n\n    _Reviewed by Tilman Bayer_\n\nAnother recent preprint[2] by six researchers from Sapienza University of Rome presents \"a comparative analysis of Wikipedia and Grokipedia\" based on a much larger sample, finding that\n\n> \"Inclusion is non-uniform: pages with higher visibility and greater editorial conflict in Wikipedia are more likely to appear in Grokipedia. For included pages, we distinguish between verbatim reproduction and generative rewriting. Rewriting is more frequent for pages with higher reference density and recent controversy, while highly popular pages are more often reproduced without modification. [...] Across multiple topical domains, including U.S. politics, geopolitics, and conspiracy-related narratives, narrative structure remains largely consistent between the two sources. Analysis of lead sections shows broadly correlated framing, with localized shifts in laudatory and conflict-oriented language for some topics in Grokipedia.\"\n\nLike the Cornell researchers whose paper was covered in our previous issue, the authors detected Wikipedia-sourced articles when scraping Grokipedia:\n\n> We consider a Grokipedia page to be not rewritten if it contains the standard Creative Commons footer. 
Specifically, we determine whether a Grokipedia article is kept unchanged by checking for the presence of the following text at the bottom of the page: “The content is adapted from Wikipedia, licensed under Creative Commons Attribution-ShareAlike 4.0 License”.\n\nHowever, their assumption that those articles are \"not rewritten\" is somewhat at odds with the findings of the Cornell team, who calculated a \"mean chunk similarity\" score between corresponding Grokipedia and Wikipedia articles. At 0.90, that score was higher than for articles without the footer, but still below a perfect similarity score of 1.0.\n\n  \"Content framing scores in Grokipedia and Wikipedia articles across U.S. Politics, Geopolitics, and Conspiracy-related pages. Top: fraction of sentences in the lead section that show praise, admiration, or glorification. Bottom: fraction of sentences in the lead section that focus on disputes, disagreements, or controversies. Color intensity is proportional to the difference between the two fractions, while point shape for pages in U.S. Politics refers to their political leaning. The dashed line represents the quadrant bisector, corresponding to an equal fraction on both platforms. Only a subset of pages is labeled for visual clarity, and among these, some are shortened to improve readability. While scores tend to be weakly or moderately correlated, noteworthy outliers emerge, especially among U.S. Politics pages.\" (Figure 4 from the paper)\n\n### \"Grokipedia increases human editing activity\" on Wikipedia\n\n    _Reviewed by Tilman Bayer_\n\nA preprint titled _How AI Reshapes Human Content Creation: The Case of Wikipedia_[3] by two economists from Wake Forest University offers a surprising conclusion:\n\n> \"We [...] examin[e] the short-run impact of the introduction of Grokipedia, an AI-generated online encyclopedia operated by xAI, which provides automated summaries that could either substitute for human editing or draw in new contributors. 
We develop a simple theoretical framework in which AI entries can redirect user attention and stimulate human editing through novel framing, yielding ambiguous effects on UGC [user generated content] ex ante. Using a new panel dataset covering 1.4 million Wikipedia pages of notable individuals, we exploit the fact that only a subset have comparable Grokipedia entries to estimate the causal effect of AI on subsequent human contributions, constructing matched samples of treated and untreated pages within occupational fields. We find a consistent and surprising result: _the availability of Grokipedia increases human editing activity_. Page views also rise, suggesting that AI entries act as an attention amplifier rather than a pure substitute for Wikipedia content. Exploiting variation in the semantic similarity between Grokipedia entries and their corresponding Wikipedia articles, we further show that pages with _lower_ similarity experience _larger_ increases in editing after Grokipedia’s launch, consistent with the model’s predictions. [...]\"\n\nThe paper's \"Introduction\" section points out that\n\n> Theoretically, the effect of Grokipedia on Wikipedia’s UGC is ambiguous. AI may act as a substitute: if users rely on Grokipedia entries instead of Wikipedia, the reduced traffic and diminished perceived value of contributing may depress human editing activity. But AI may also act as a complement: users may draw information from Grokipedia to improve or update Wikipedia pages, or the publicity surrounding a new AI platform may direct attention toward existing Wikipedia entries, leading to more edits. The competition for viewers may also elicit greater effort from Wikipedia contributors. 
Which force dominates is ultimately an empirical question.\n\n(The authors cite  this edit as a concrete example of how information on Grokipedia may inspire activity on the corresponding Wikipedia article.)\n\nThe paper's statistical analysis focuses on\n\n> a new panel dataset of approximately 1.4 million Wikipedia pages of notable individuals across five occupational domains—Academia, Culture, Leaders, Politics, and Sports. Only about 170,000 of these pages (roughly 12%) are covered by Grokipedia at launch, while the remaining 88% do not receive an AI-generated entry.\n\nTo assess the impact of Grokipedia's October 2025 launch on these Wikipedia articles, the authors compare views and edits as follows:\n\n> Because Grokipedia coverage is concentrated among highly visible pages and baseline visibility varies systematically across occupations, we construct matched samples using Mahalanobis-distance nearest-neighbor matching within occupational fields. Treated pages—those with a Grokipedia entry—are paired with the closest untreated pages based on pre-treatment views, editing histories, long-run readership, and page characteristics, approximating the counterfactual trajectory each treated page would have followed absent Grokipedia. We then estimate treatment effects using a Difference-in-Differences framework, which compares changes in views and edits before and after Grokipedia’s launch between treated pages and their matched controls.\n\nThe analysis of post-launch views and edits is confined to a rather short timespan of just three weeks (October 27–November 16). The authors justify this \"focus on short-run outcomes\" by observing that\n\n> Beginning on October 27, 2025—the day the platform went live—traffic spiked abruptly, reaching over 500,000 daily visits worldwide during its first week, with more than 100,000 daily visits per day originating from the United States alone. 
Peak attention occurred immediately after launch, exceeding two million global visits on October 28, before declining in the following days.\n\n### Briefly\n\n  * A call for Lightning Talk proposals has been posted for the research track of this year's Wikimania conference (to take place 21-25 July 2026 in Paris, France, as an in-person and online event). Submission deadline: March 31, 2026.\n  * Registration is open for Wiki Workshop 2026, \"the annual forum bringing together researchers exploring all aspects of Wikimedia projects\" (March 25-26, 2026, online, free of charge).\n  * The Wikimedia Foundation's Data Engineering team announced two new monthly data dumps that provide unparsed content from Wikimedia project wikis in XML format. (These dumps remain freely available to the public, unlike the paid offerings of Wikimedia Enterprise.)\n  * A proposal to shut down the #wikimedia-research IRC channel resulted in a \"consensus we should stop sending new folks there and requiring the [Wikimedia Foundation's] research team to maintain it\", but also in a decision to keep it open for now, with operation handed over to volunteers.\n  * See the page of the monthly **Wikimedia Research Showcase** for videos and slides of past presentations.\n\n### Other recent publications\n\n_Other recent publications that could not be covered in time for this issue include the items listed below. Contributions, whether reviewing or summarizing newly published research, are always welcome._\n\n#### AI summaries \"led to more liberal opinions compared with Wikipedia\"\n\nFrom the abstract:[4]\n\n> \"Participants read Wikipedia or GPT-4o summaries of two historical events [the Seattle General Strike and the Third World Liberation Front strikes of 1968], with AI summaries maintaining factual accuracy while exhibiting different types of framing biases. 
Default AI summaries led to more liberal opinions compared with Wikipedia, demonstrating the persuasive capability of LLM's latent biases. Summaries purposefully induced with a liberal framing also led to more liberal opinions, regardless of readers’ ideologies. Summaries constructed with a conservative framing produced conservative shifts primarily among conservative readers.\"\n\nSee also a thread by one of the paper's authors.\n\n#### \"WikIPedia: Unearthing a 20-Year History of IPv6 Client Addressing\"\n\nFrom the abstract:[5]\n\n> \"When Wikimedia users make edits without signing into an account, their IP addresses are used in lieu of a username. Wikimedia site dumps therefore provide researchers with over two decades worth of timestamped client IPv6 addresses to understand address assignments and how they have changed over time and space.\n>  In this work, we extract 19M unique IPv6 addresses from Wikimedia sites like Wikipedia that were used by editors from 2003 to 2024. We use these addresses to understand the prevalence of IPv6 in countries corresponding to Wikimedia site languages, how IPv6 adoption has grown over time, and the prevalence of EUI-64 addressing on client devices like desktops, laptops, and mobile phones.\"\n\nFrom the paper:\n\n> \"The majority (∼64%) of the IPv6 addresses that are logged in Wikimedia edits appear only once\"\n\n#### \"Knowledge, neo-liberalism and mediatization: The crystal of Wikipedia\"\n\nFrom the abstract:[6]\n\n> \"This article presents neo-liberal notions of knowledge and market and explains why this is important for the functioning of digital platforms. Neo-liberals are concerned with everyday knowledge of the common people, their mental states and feelings, not intellectual knowledge. [...] Hayek defines market as a communication system that is digesting dispersed information. Millions of minds are doing data generation and processing. 
That way, neo-liberals see all digital platforms, including Wikipedia, as markets. Classical encyclopaedias are centrally controlled and expert driven, while neo-liberal markets create knowledge through crowds’ ‘voluntary exchange’ and ‘spontaneous cooperation’. The fundamental difference is that encyclopaedias were an Enlightenment project, while Wikipedia is producing recycled intellectual and layman’s knowledge without any political or revolutionary engagement.\"\n\n#### \"The negotiation of pronominal address on talk pages of the German, French, and Italian Wikipedia\"\n\nThis paper found that German and Italian \"wikiquette\" stipulates the informal \"du\"/\"tu\" among editors (instead of the more formal \"Sie\"/\"Lei\"), whereas French Wikipedia lacks consensus on \"vous\" vs. \"tu\". From the abstract:[7]\n\n> \"This paper asks [...] how the appropriate use of address pronouns is negotiated on talk pages of the German, French, and Italian Wikipedia. The talk pages of Wikipedia share features of CMC [computer-mediated communication] genres such as a dialogic structure and an informal writing style with non-standard language. There are two types of Wikipedia talk pages, whose data are considered in this study based on the multilingual corpora by the Leibniz Institute for the German Language: article talk pages, where authors negotiate online encyclopedic content, and user talk pages, where the contributions of individual authors are discussed. These two types of talk pages will be analysed for the study.\"\n\n#### \"Mass collaboration or curatorship? The functioning of Wikipedia needs both\"\n\nFrom the abstract:[8]\n\n> \"Using the complete dataset of the English, Spanish, and Italian versions of Wikipedia (2001–2020), we analyzed metrics such as the number of articles, creations, or edits performed by users. 
We calculated their distributions, adapted the Gini index to measure participation inequalities and employed network science methods to understand user-edit interactions. [...]\n>\n>\n>  Our analysis confirms significant disparities in content generation and engagement, emphasizing content editing. However, we demonstrate that these differences coexist with extensive collaboration. Specifically, our findings reveal that disparities in participation levels and collaborative editing complement each other. Curatorial leadership by a central group of contributors is extremely collaborative, while occasional contributors intervene flexibly in specific contexts.\"\n\nFrom the \"Conclusion\" section:\n\n> \"Our first result has been to confirm the existence of strong inequalities in content production and participation that were already highlighted in the literature, focusing in particular on content editing. However, we have also shown how, despite such forms of gatekeeping of the core group of contributors, Wikipedia’s users who manage the vast majority of the edits carry out this task in a largely collaborative manner. That is to say, our analysis suggests that inequalities in the level of participation and high levels of collaboration are not antithetical, but rather mutually reinforcing building blocks of Wikipedia.\"\n\n\n\n\n####  \"WETBench: A Benchmark for Detecting Task-Specific Machine-Generated Text on Wikipedia\"\n\nFrom the abstract:[9]\n\n> We introduce WETBench, a multilingual, multi-generator, and task-specific benchmark for MGT detection. We define three editing tasks empirically grounded in Wikipedia editors’ perceived use cases for LLM-assisted editing: Paragraph Writing, Summarisation, and Text Style Transfer, which we implement using two new datasets across three languages. 
For each writing task, we evaluate three prompts, produce MGT across multiple generators using the best-performing prompt, and benchmark diverse detectors. We find that, across settings, training-based detectors achieve an average accuracy of 78%, while zero-shot detectors average 58%. These results demonstrate that detectors struggle with MGT in realistic generation scenarios [...]\n\nFrom the \"Experimental setup\" section:\n\n> We generate MGT using four multilingual models from two families: proprietary and open-weight. [...] For proprietary models, we use GPT4o mini [...] and Gemini 2.0 Flash. For openweight models, we select Qwen2.5-7B-Instruct and Mistral-7B-Instruct. [...]\n>\n>\n> We evaluate six detectors from three different families: [...] Specifically, we use XLM-RoBERTa [...] and mDeBERTa [...] as training-based detectors, which we fine-tune with hyperparameter search; Binoculars [...], LLR [...], and FastDetectGPT (White-Box) [...] as zero-shot white-box detectors; and Revise-Detect [...], GECScore [...], and FastDetectGPT (Black-Box) [...] as zero-shot black-box detectors.\n\nThe proprietary detectors Pangram and GPTZero are not mentioned in the paper. (A recent investigation by Wiki Edu \"found [Pangram] to be highly accurate for Wikipedia text\".[supp 1])\n\nSee also our review of an earlier paper by different authors that had relied on Binoculars (and GPTZero) for its conclusions: \"'As many as 5%' of new English Wikipedia articles 'contain significant AI-generated content'\"\n\n### References\n\n  1. **^** Eibl, Philipp; Coppolillo, Erica; Mungari, Simone; Luceri, Luca (2026-01-21), _Is Grokipedia Right-Leaning? Comparing Political Framing in Wikipedia and Grokipedia on Controversial Topics_, arXiv, doi:10.48550/arXiv.2601.15484\n  2. 
**^** Hadad, Ortal; Loru, Edoardo; Nudo, Jacopo; Bonetti, Anita; Cinelli, Matteo; Quattrociocchi, Walter (2026-02-05), _Wikipedia and Grokipedia: A Comparison of Human and Generative Encyclopedias_, arXiv, doi:10.48550/arXiv.2602.05519\n  3. **^** Leung, Tin Cheuk; Strumpf, Koleman S. (2025-12-03), _How AI Reshapes Human Content Creation: The Case of Wikipedia_, Rochester, NY: Social Science Research Network, doi:10.2139/ssrn.5853062 (\"Last revised: 30 Jan 2026\")\n  4. **^** Shu, Matthew; Karell, Daniel; Okura, Keitaro; Davidson, Thomas R. (2026-02-27). \"How latent and prompting biases in AI-generated historical narratives influence opinions\". _PNAS Nexus_. **5** (3). doi:10.1093/pnasnexus/pgag022. PMID 41783460.\n  5. **^** Rye, Erik; Levin, Dave (2025-12-09), _WikIPedia: Unearthing a 20-Year History of IPv6 Client Addressing_, arXiv, doi:10.48550/arXiv.2512.08808\n  6. **^** Mlađenović, Nikola (2025-10-31). \"Knowledge, neo-liberalism and mediatization: The crystal of Wikipedia\". _Empedocles: European Journal for the Philosophy of Communication_. doi:10.1386/ejpc_00066_1.\n  7. **^** Flinz, Carolina; Gredel, Eva; Herzberg, Laura (2025-06-30). \"The negotiation of pronominal address on talk pages of the German, French, and Italian Wikipedia\". _Exploring digitally-mediated communication with corpora_. De Gruyter.\n  8. **^** Pilati, Federico; Sacco, Pier Luigi; Artime, Oriol (2025-08-05). \"Mass collaboration or curatorship? The functioning of Wikipedia needs both\". _Online Information Review_. **49** (8): 122–133. doi:10.1108/OIR-10-2023-0515. ISSN 1468-4527.\n  9. **^** Quaremba, Gerrit; Black, Elizabeth; Vrandecic, Denny; Simperl, Elena (August 2025). \"WETBench: A Benchmark for Detecting Task-Specific Machine-Generated Text on Wikipedia\". _Proceedings of the 2nd Workshop on Advancing Natural Language Processing for Wikipedia (WikiNLP 2025)_. 
Vienna, Austria: Association for Computational Linguistics. pp. 10–30. doi:10.18653/v1/2025.wikinlp-1.6. ISBN 9798891762848.\n\n    Supplementary references and notes:\n\n  1. **^** Davis, LiAnna (2026-01-29). \"Generative AI and Wikipedia editing: What we learned in 2025\". _Wiki Education_.",
  "title": "Wikipedia:Wikipedia Signpost/2026-03-10/Recent research"
}