{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreibizf4bcvz7srjdhcxqbfgcbvizlejblnsyglb22biuibywfdgf7i",
"uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3miwvrxa4xdp2"
},
"path": "/t/show-and-tell-qlankr-test-a-tool-for-evaluating-ai-agents-and-rag-workflows/175061#post_1",
"publishedAt": "2026-04-07T19:13:22.000Z",
"site": "https://discuss.huggingface.co",
"tags": [
"https://test.qlankr.com"
],
"textContent": "Site: https://test.qlankr.com\n\nHi everyone,\n\nI built QLANKR Test because AI evaluation still feels too inconsistent and too dependent on guesswork.\n\nA lot of builders are shipping agents, chatbots, RAG systems, and tool-calling workflows, but the feedback loop is often messy. You tweak a prompt, change a tool, run it again, and it is not always easy to understand what actually improved.\n\nQLANKR Test is my attempt to make that process more structured.\n\nIt helps test:\n\n * AI agents\n\n * chatbots\n\n * RAG systems\n\n * tool-calling workflows\n\n\n\n\nThe goal is to make evaluation more structured, repeatable, and easier to inspect.\n\nI would especially love feedback on:\n\n * whether the report feels useful\n\n * whether the scoring makes sense\n\n * what is still missing for real-world agent evaluation\n\n\n\n\nSite: https://test.qlankr.com",
"title": "Show and Tell: QLANKR Test, a tool for evaluating AI agents and RAG workflows"
}