{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreicvmbrmmfkcdonfn7vkwwpc2hrqlm3iogez56jdbtxvet3sbrpdaa",
"uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mjf6dlprkr72"
},
"path": "/t/controlled-study-ai-operational-experience-improves-performance-by-1-07-sd-open-data-code/175226#post_1",
"publishedAt": "2026-04-13T14:32:16.000Z",
"site": "https://discuss.huggingface.co",
"tags": [
"https://zenodo.org/records/19533311",
"GitHub - patechlabs/aria-experience-study: Data and code for: Operational Experience as a Performance Multiplier in AI Assistants (Nuraliev & Rychkova, 2026) · GitHub"
],
"textContent": "Hi everyone,\n\nWe just published a controlled experiment measuring the effect of accumulated operational experience on AI assistant performance.\n\nQuick summary:\n\n * An AI assistant (ARIA) that has been operating for months, accumulating experience fragments and operational memory, was compared against the same base model (Claude Opus 4.6) without experience\n * 50 real-world questions, 1,200 blind judgments from 3 independent judges\n * Result: Cohen’s d = 1.07, Friedman p < 10^-25\n * The effect is domain-specific — strong on operational tasks, near zero on algorithmic controls\n\n\n\nThis builds on work by ExpeL, MemGPT, Generative Agents, and Reflexion — but measures experience effects in a production system rather than a sandbox.\n\nEverything is open:\n\n * Paper: https://zenodo.org/records/19533311\n * Data + code: GitHub - patechlabs/aria-experience-study: Data and code for: Operational Experience as a Performance Multiplier in AI Assistants (Nuraliev & Rychkova, 2026) · GitHub\n\n\n\nWould love feedback from this community. Also seeking an arXiv cs.AI endorser if anyone is qualified — endorsement code MJLELZ.\n\nThanks!\nRavshan Nuraliev, PaTech Labs",
"title": "Controlled study: AI operational experience improves performance by 1.07 SD (open data + code)"
}