{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreibufbakyscwe24ivavslsddpde4rqtcmhd7ijtfw3vwchtntwca64",
"uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mkkxcz2sfjs2"
},
"path": "/t/on-our-project-we-need-ml-engineer-who-worked-with-azure-before/174839#post_3",
"publishedAt": "2026-04-28T14:23:33.000Z",
"site": "https://discuss.huggingface.co",
"textContent": "Hi,\n\nI’m an AI & Full Stack Engineer with 9+ years building production ML systems in Python, focused on LLMs, OCR, and cloud deployments.\n\nI’m excited by your automated invoice processing system that combines OCR, LLM parsing, and human-in-the-loop validation. The mix of document AI plus validation to keep accuracy high really stands out, and I like that you’re shipping in imperfect infra with custom pipelines.\n\nOne idea: add a confidence-driven routing layer that scores OCR+LLM outputs and routes only low-confidence or high-risk invoices to human review. Combine token-level confidence, semantic retrieval score, and simple rule checks to create a prioritized review queue and auto-labeling for high-confidence cases. This would cut human workload, speed throughput, and feed targeted training data back into the models.\n\nAt DuploCloud I built RAG and OCR pipelines, fine-tuned LLMs, and implemented human-in-the-loop flows that reduced manual validation by ~27% and improved automation throughput by 35%. I also deployed inference microservices and CI/CD on Azure, cutting release time by ~40%, so I can help turn prototypes into reliable production services in your stack.\n\nI’d love to chat about applying this confidence-routing idea and helping ship your invoice pipeline.\n\nBest Regards,\nJames Yarris\n\nFeel free to reach out to me (james.yarris@proton.me)",
"title": "On our project we need ML Engineer, who worked with Azure before"
}