Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreiaeayygvnyzf7q3puzausi5c5v3xsvpfs6azowieu5jveazrgyhhi",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mkry2hytzfj2"
  },
  "path": "/t/is-an-agent-harness-evaluation-preprint-suitable-for-arxiv-cs-ai/175693#post_1",
  "publishedAt": "2026-05-01T09:15:14.000Z",
  "site": "https://discuss.huggingface.co",
  "tags": [
    "https://doi.org/10.5281/zenodo.19819492",
    "https://github.com/namanvats/scaffold-effects"
  ],
  "textContent": "I’m an independent researcher working on agent systems and LLM evaluation. I recently prepared a small empirical preprint and am trying to find the right way to share it with the research community.\n\nThe paper studies how the choice of agent harness (scaffold) can affect measured benchmark performance and token cost under a controlled setup. It compares Goose, OpenCode, and OpenHands-SDK on a fixed Terminal-Bench-Pro task slice across two models.\n\nPaper / DOI: https://doi.org/10.5281/zenodo.19819492\nCode / repo: https://github.com/namanvats/scaffold-effects\n\nI’m looking for advice from people familiar with arXiv cs.AI submissions: does this look appropriately scoped for cs.AI, and what is the respectful way for a first-time independent author to approach the endorsement process?\n\nI’m not asking for a review of the paper’s claims, only for guidance on category fit and process.",
  "title": "Is an agent-harness evaluation preprint suitable for arXiv cs.AI?"
}