Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreicvmbrmmfkcdonfn7vkwwpc2hrqlm3iogez56jdbtxvet3sbrpdaa",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mjf6dlprkr72"
  },
  "path": "/t/controlled-study-ai-operational-experience-improves-performance-by-1-07-sd-open-data-code/175226#post_1",
  "publishedAt": "2026-04-13T14:32:16.000Z",
  "site": "https://discuss.huggingface.co",
  "tags": [
    "https://zenodo.org/records/19533311",
    "GitHub - patechlabs/aria-experience-study: Data and code for: Operational Experience as a Performance Multiplier in AI Assistants (Nuraliev & Rychkova, 2026) · GitHub"
  ],
  "textContent": "Hi everyone,\n\nWe just published a controlled experiment measuring the effect of accumulated operational experience on AI assistant performance.\n\nQuick summary:\n\n  * An AI assistant (ARIA) that has been operating for months, accumulating experience fragments and operational memory, was compared against the same base model (Claude Opus 4.6) without experience\n  * 50 real-world questions, 1,200 blind judgments from 3 independent judges\n  * Result: Cohen’s d = 1.07, Friedman p < 10^-25\n  * The effect is domain-specific — strong on operational tasks, near zero on algorithmic controls\n\n\n\nThis builds on work by ExpeL, MemGPT, Generative Agents, and Reflexion — but measures experience effects in a production system rather than a sandbox.\n\nEverything is open:\n\n  * Paper: https://zenodo.org/records/19533311\n  * Data + code: GitHub - patechlabs/aria-experience-study: Data and code for: Operational Experience as a Performance Multiplier in AI Assistants (Nuraliev & Rychkova, 2026) · GitHub\n\n\n\nWould love feedback from this community. Also seeking an arXiv cs.AI endorser if anyone is qualified — endorsement code MJLELZ.\n\nThanks!\nRavshan Nuraliev, PaTech Labs",
  "title": "Controlled study: AI operational experience improves performance by 1.07 SD (open data + code)"
}