{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreibizf4bcvz7srjdhcxqbfgcbvizlejblnsyglb22biuibywfdgf7i",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3miwp2p5dlvo2"
  },
  "path": "/t/show-and-tell-qlankr-test-a-tool-for-evaluating-ai-agents-and-rag-workflows/175061#post_1",
  "publishedAt": "2026-04-07T19:13:22.000Z",
  "site": "https://discuss.huggingface.co",
  "tags": [
    "https://test.qlankr.com"
  ],
  "textContent": "Site: https://test.qlankr.com\n\nHi everyone,\n\nI built QLANKR Test because AI evaluation still feels too inconsistent and too dependent on guesswork.\n\nA lot of builders are shipping agents, chatbots, RAG systems, and tool-calling workflows, but the feedback loop is often messy. You tweak a prompt, change a tool, run it again, and it is not always easy to understand what actually improved.\n\nQLANKR Test is my attempt to make that process more structured.\n\nIt helps test:\n\n  * AI agents\n  * chatbots\n  * RAG systems\n  * tool-calling workflows\n\nThe goal is to make evaluation more structured, repeatable, and easier to inspect.\n\nI would especially love feedback on:\n\n  * whether the report feels useful\n  * whether the scoring makes sense\n  * what is still missing for real-world agent evaluation",
  "title": "Show and Tell: QLANKR Test, a tool for evaluating AI agents and RAG workflows"
}