{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreiagfvsc5rujyxgcqxhoja6tio6mwpmv3yezldbuy47vdqo7bljhcm",
"uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mjo6y2i3da22"
},
"path": "/t/bleeding-edge-tech/175319#post_3",
"publishedAt": "2026-04-17T04:23:58.000Z",
"site": "https://discuss.huggingface.co",
"textContent": "Hello,\n\nI’ve built a face-based event registration and attendance system using InsightFace (GPU), where users self-register via camera. I’ve also implemented an agent to automate the onboarding and attendance workflows.\n\nI now want to move beyond a POC and take on a technically challenging problem with real-world complexity.\n\nSome directions I’ve explored:\n\n * Moving from image-based (2D) recognition to video-based (temporal / 3D) understanding\n\n * Robustness under real-world conditions (lighting, occlusion, motion blur)\n\n * Real-time multi-camera identity tracking across a venue\n\n * Using LLMs to analyze event data (engagement, behavior patterns, etc.)\n\nHowever, these feel like incremental extensions.\n\nWhat I’m really looking for is a **hard, open problem at the intersection of vision systems, real-time inference, and agentic AI**: something where current approaches break down in practice, not just on benchmarks.\n\nFor example:\n\n * Where do current face recognition / tracking systems fail at scale in real deployments?\n\n * Are there unsolved challenges in combining vision models with LLM-based agents for real-world decision-making?\n\n * Are there known gaps between research and production systems in this space?\n\nI would appreciate pointers to concrete, technically challenging problems worth tackling.\n\nThanks",
"title": "Bleeding Edge Tech"
}