Bleeding Edge Tech

Hugging Face Forums [Unofficial] April 17, 2026

Hello,

I’ve built a face-based event registration and attendance system using InsightFace (GPU), where users self-register via camera. I’ve also implemented an agent to automate onboarding and attendance workflows.

I now want to move beyond a POC into a technically challenging problem with real-world complexity.

Some directions I’ve explored:

  • Moving from image-based (2D) recognition to video-based (temporal / 3D understanding)

  • Robustness under real-world conditions (lighting, occlusion, motion blur)

  • Real-time multi-camera identity tracking across a venue

  • Using LLMs to analyze event data (engagement, behavior patterns, etc.)

However, these feel like incremental extensions.

What I’m really looking for is a hard, open problem at the intersection of vision systems, real-time inference, and agentic AI: something where current approaches break down in practice, not just on benchmarks.

For example:

  • Where do current face recognition / tracking systems fail at scale in real deployments?

  • Are there unsolved challenges in combining vision models with LLM-based agents for real-world decision-making?

  • Any known gaps between research and production systems in this space?

I’d appreciate pointers to concrete, technically challenging problems worth tackling.

Thanks
