{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreiexblp7eju5y5gh3dy3ko7bkrnd2dayzmuqiqajvfrd54g6cljrqe",
"uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mkmn6v2urrj2"
},
"path": "/t/built-a-lane-based-dataset-bundle-explorer-for-llm-training-would-love-feedback-from-the-hf-community/175642#post_1",
"publishedAt": "2026-04-29T06:52:05.000Z",
"site": "https://discuss.huggingface.co",
"tags": [
"Dinodataset Failure Mapper - a Hugging Face Space by DinoDS",
"DinoDS (DinoDS Labs)",
"www.dinodsai.com"
],
"textContent": "Hi everyone! I’ve been building **DinoDS** , a modular dataset system for **LLM training** built around **lane-based dataset bundles**.\n\nThe idea is simple: instead of treating training data like one giant premade dump, I’m organizing it into **capability-focused bundles** that map to specific assistant behaviors and failure types — things like:\n\n * retrieval grounding\n\n * workflow / tool routing\n\n * memory and continuity\n\n * structured outputs\n\n * identity and behavior shaping\n\n\n\n\nI’ve started publishing some of these **dataset bundle previews** on Hugging Face, and I also made a **Space** that helps people explore **which dataset bundle might actually be useful for their use case**.\n\nSo the current flow is:\n\n * explore the DinoDS concept\n\n * identify what kind of assistant behavior you want to improve\n\n * see which bundle / lane family fits\n\n * check out the related dataset previews\n\n\n\n\nI’d really love feedback from the HF community on a few things:\n\n 1. Does this **bundle-first / lane-based** way of presenting datasets make sense?\n\n 2. Is the **Space + dataset bundle** flow intuitive?\n\n 3. What would make these previews more useful for people evaluating training data?\n\n 4. Would you rather explore by **failure type** , **capability** , or **use case**?\n\n\n\n\nYou can check out the bundles, the Space, and the website here:\n\n * **Hugging Face Space:** Dinodataset Failure Mapper - a Hugging Face Space by DinoDS\n\n * **Dataset bundles:** DinoDS (DinoDS Labs)\n\n * **Website:** www.dinodsai.com\n\n\n\n\nWould love thoughts, criticism, and suggestions — especially from people building assistants, copilots, routing systems, or structured-output workflows.",
"title": "Built a lane-based dataset bundle explorer for LLM training — would love feedback from the HF community"
}