Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreibufbakyscwe24ivavslsddpde4rqtcmhd7ijtfw3vwchtntwca64",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mkkxcz2sfjs2"
  },
  "path": "/t/on-our-project-we-need-ml-engineer-who-worked-with-azure-before/174839#post_3",
  "publishedAt": "2026-04-28T14:23:33.000Z",
  "site": "https://discuss.huggingface.co",
  "textContent": "Hi,\n\nI’m an AI & Full Stack Engineer with 9+ years of experience building production ML systems in Python, focused on LLMs, OCR, and cloud deployments.\n\nI’m excited by your automated invoice processing system that combines OCR, LLM parsing, and human-in-the-loop validation. The mix of document AI and validation to keep accuracy high really stands out, and I like that you’re shipping on imperfect infrastructure with custom pipelines.\n\nOne idea: add a confidence-driven routing layer that scores OCR+LLM outputs and routes only low-confidence or high-risk invoices to human review. Combine token-level confidence, semantic retrieval score, and simple rule checks to build a prioritized review queue and auto-label high-confidence cases. This would cut human workload, speed throughput, and feed targeted training data back into the models.\n\nAt DuploCloud I built RAG and OCR pipelines, fine-tuned LLMs, and implemented human-in-the-loop flows that reduced manual validation by ~27% and improved automation throughput by 35%. I also deployed inference microservices and CI/CD on Azure, cutting release time by ~40%, so I can help turn prototypes into reliable production services in your stack.\n\nI’d love to chat about applying this confidence-routing idea and helping ship your invoice pipeline.\n\nBest Regards,\nJames Yarris\n\nFeel free to reach out to me at james.yarris@proton.me.",
  "title": "On our project we need ML Engineer, who worked with Azure before"
}
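
The confidence-driven routing idea in the post's text can be sketched roughly as follows. This is a minimal illustration, not the poster's implementation: the `InvoiceResult` fields, the 0.6/0.4 weights, the 0.85 auto-accept threshold, and the 10,000 high-risk amount are all hypothetical and would need tuning on real validation data.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class InvoiceResult:
    """Hypothetical per-invoice signals produced by an OCR + LLM pipeline."""
    invoice_id: str
    token_confidence: float   # mean token-level confidence, in [0, 1]
    retrieval_score: float    # semantic retrieval score, in [0, 1]
    rule_checks_passed: bool  # e.g. totals add up, required fields present
    amount: float             # invoice total, used as a simple risk signal


# Assumed values for illustration only.
AUTO_ACCEPT_THRESHOLD = 0.85
HIGH_RISK_AMOUNT = 10_000.0


def combined_confidence(r: InvoiceResult) -> float:
    """Blend the two model scores; a failed rule check caps the result at 0.5."""
    score = 0.6 * r.token_confidence + 0.4 * r.retrieval_score
    return score if r.rule_checks_passed else min(score, 0.5)


def route(results: List[InvoiceResult]) -> Tuple[List[InvoiceResult], List[InvoiceResult]]:
    """Split results into (auto_labeled, review_queue).

    The review queue is sorted so the least confident invoices come first.
    """
    auto: List[InvoiceResult] = []
    scored_review: List[Tuple[float, InvoiceResult]] = []
    for r in results:
        score = combined_confidence(r)
        if score >= AUTO_ACCEPT_THRESHOLD and r.amount < HIGH_RISK_AMOUNT:
            auto.append(r)
        else:
            scored_review.append((score, r))
    scored_review.sort(key=lambda pair: pair[0])
    return auto, [r for _, r in scored_review]
```

In this sketch, invoices that land in the auto-labeled list could be used as training labels, while corrections made in the review queue would supply the targeted training data the post mentions.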