Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreiagfvsc5rujyxgcqxhoja6tio6mwpmv3yezldbuy47vdqo7bljhcm",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mjo6y2i3da22"
  },
  "path": "/t/bleeding-edge-tech/175319#post_3",
  "publishedAt": "2026-04-17T04:23:58.000Z",
  "site": "https://discuss.huggingface.co",
  "textContent": "Hello,\n\nI’ve built a face-based event registration and attendance system using InsightFace (GPU), where users self-register via camera. I’ve also implemented an agent to automate onboarding and attendance workflows.\n\nI now want to move beyond a POC into a technically challenging problem with real-world complexity.\n\nSome directions I’ve explored:\n\n  * Moving from image-based (2D) recognition to video-based (temporal / 3D understanding)\n\n  * Robustness under real-world conditions (lighting, occlusion, motion blur)\n\n  * Real-time multi-camera identity tracking across a venue\n\n  * Using LLMs to analyze event data (engagement, behavior patterns, etc.)\n\n\n\n\nHowever, these feel like incremental extensions.\n\nWhat I’m really looking for is a **hard, open problem at the intersection of vision systems, real-time inference, and agentic AI** —something where current approaches break down in practice (not just benchmarks).\n\nFor example:\n\n  * Where do current face recognition / tracking systems fail at scale in real deployments?\n\n  * Are there unsolved challenges in combining vision models with LLM-based agents for real-world decision-making?\n\n  * Any known gaps between research and production systems in this space?\n\n\n\n\nWould appreciate pointers to concrete, technically challenging problems worth tackling.\n\nThanks",
  "title": "Bleeding Edge Tech"
}