Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreide3dpvucgqgnemkb263eqglapdqe4zbp4wtvq64loij5ckwrhlqu",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mmf63mhsqwt2"
  },
  "path": "/t/add-convence-parseembed-as-an-official-benchmark-on-the-hub-if-possible/176152#post_1",
  "publishedAt": "2026-05-21T18:39:19.000Z",
  "site": "https://discuss.huggingface.co",
  "tags": [
    "Convence/ParseEmbed · Datasets at Hugging Face"
  ],
  "textContent": "Hi Hugging Face team,\n\nI’d like to register `Convence/ParseEmbed` as an official benchmark on the Hub.\n\nDataset: Convence/ParseEmbed · Datasets at Hugging Face\n\nIt includes a root `eval.yaml` with:\n\n  * name: ParseEmbed\n  * evaluation_framework: mteb\n  * tasks: mean, text_formatting, table\n  * config: parse-embed\n\nParseEmbed is a retrieval benchmark for embedding models. It tests whether models preserve parse-sensitive meaning under hard negatives, including semantic scope, formatting-sensitive text, and table grounding.\n\nThe dataset card documents the purpose, files, task IDs, and usage. The dataset loads with `datasets` using the `parse-embed` config and the task splits.\n\nCould you please add it to the official benchmark allow-list?\n\nThanks! If something is wrong, please let me know.",
  "title": "Add Convence/ParseEmbed as an official benchmark on the Hub (If possible)"
}