How do you handle overlapping Codex skills in larger skill catalogs?
Yes, that is a really good formulation. I think the routing trace part is especially important, because it turns the router from a black box into something inspectable.
If the system can say “I selected this skill because it owns the task, rejected this broader skill because a narrower one fully covers it, and added this support skill only for verification,” then debugging bad routing decisions becomes much easier. Without that trace, it is very hard to know whether the router made a good decision or just guessed based on similar wording.
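To make that concrete, here is a minimal sketch of what a routing trace could look like as data. Everything in it is an assumption for illustration: the `Decision` and `RoutingTrace` names, the three action labels, and the example skill names are hypothetical and not part of any existing Codex API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Decision:
    skill: str   # hypothetical skill name
    action: str  # "selected", "rejected", or "support" (assumed labels)
    reason: str  # human-readable justification


@dataclass
class RoutingTrace:
    task: str
    decisions: List[Decision] = field(default_factory=list)

    def record(self, skill: str, action: str, reason: str) -> None:
        """Append one routing decision together with its reason."""
        self.decisions.append(Decision(skill, action, reason))

    def explain(self) -> str:
        """Render the trace as plain text so a person can inspect it."""
        lines = [f"task: {self.task}"]
        for d in self.decisions:
            lines.append(f"  [{d.action}] {d.skill}: {d.reason}")
        return "\n".join(lines)


trace = RoutingTrace(task="add a regression test for the parser")
trace.record("parser-tests", "selected", "owns the task")
trace.record("general-testing", "rejected", "narrower skill fully covers it")
trace.record("lint-check", "support", "used only for verification")
print(trace.explain())
```

The point is only that every selected, rejected, or supporting skill carries a reason, so a bad routing decision can be traced back to the rule that produced it.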
The incomplete-manifest case is probably the hardest part, as you said. I would rather have the router be conservative there: if no skill clearly owns the task, either use no skill, ask for clarification, or fall back to a very small general workflow instead of pulling in everything that sounds related.
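A rough sketch of that conservative behavior, assuming the router already has some ownership score per skill. The scoring itself, the `choose_owner` name, and the threshold values are all hypothetical; the only part taken from the idea above is the ordering of the fallbacks.

```python
from typing import Dict, Optional, Tuple


def choose_owner(
    scores: Dict[str, float],
    min_score: float = 0.7,    # assumed threshold for "clearly owns"
    min_margin: float = 0.15,  # assumed gap required over the runner-up
) -> Tuple[str, Optional[str]]:
    """Return (action, skill) and prefer doing less when ownership is unclear."""
    if not scores:
        return ("no_skill", None)

    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    best_name, best = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else 0.0

    if best < min_score:
        # Nothing clearly owns the task: use a small general workflow.
        return ("fallback_general", None)
    if best - runner_up < min_margin:
        # Two skills look equally plausible: ask rather than load both.
        return ("ask_clarification", None)
    return ("use_skill", best_name)
```

Note what is missing: there is no branch that loads every skill above some similarity cutoff. When ownership is unclear, the router does less, not more.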
That is also why I think the dynamic catalog point matters so much. The router should not assume a fixed skill universe. It should be able to inspect what exists right now, reason over lightweight metadata, and then apply project-level overrides when needed.
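As a sketch of the override step, assume the router has already discovered the current skills as a dict of lightweight metadata, and that a project supplies a second dict of patches. The `apply_overrides` name, the `disabled` flag, and the `priority` field are hypothetical conventions I made up for this example.

```python
from typing import Any, Dict

Metadata = Dict[str, Any]


def apply_overrides(
    discovered: Dict[str, Metadata],
    overrides: Dict[str, Metadata],
) -> Dict[str, Metadata]:
    """Merge project-level patches into whatever skills exist right now."""
    effective: Dict[str, Metadata] = {}
    for name, meta in discovered.items():
        patch = overrides.get(name, {})
        if patch.get("disabled"):
            continue  # the project opted out of this skill
        merged = dict(meta)
        merged.update({k: v for k, v in patch.items() if k != "disabled"})
        effective[name] = merged
    return effective


discovered = {
    "parser-tests": {"description": "tests for the parser", "priority": 1},
    "general-testing": {"description": "generic test workflow", "priority": 1},
}
overrides = {
    "general-testing": {"disabled": True},
    "parser-tests": {"priority": 5},
    "not-installed": {"priority": 9},  # ignored: skill is not in the catalog
}
print(apply_overrides(discovered, overrides))
```

Because the function starts from the discovered catalog rather than a hardcoded list, an override for a skill that is not installed is simply ignored.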
My current “Skill Orchestrator” experiment is basically a rough userland prototype of this idea without native manifests. It tries to infer ownership from descriptions, naming, and known overlap patterns, but I agree that the cleaner long-term model would be explicit metadata like owns, does_not_own, hands_off_to, supports, and conflicts_with.
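For illustration, those five fields could be modeled roughly like this. The field names come from the list above; the types, the task-tag idea, and the two helper functions are my own assumptions and not an existing manifest format.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class SkillManifest:
    name: str
    owns: Set[str] = field(default_factory=set)            # task tags this skill owns
    does_not_own: Set[str] = field(default_factory=set)    # explicit exclusions
    hands_off_to: Dict[str, str] = field(default_factory=dict)  # task tag -> other skill
    supports: Set[str] = field(default_factory=set)        # skills it may assist
    conflicts_with: Set[str] = field(default_factory=set)  # skills it must not load with


def owners_for(tag: str, manifests: List[SkillManifest]) -> List[str]:
    """Skills that claim the tag and do not also disclaim it."""
    return [m.name for m in manifests if tag in m.owns and tag not in m.does_not_own]


def drop_conflicts(selected: List[str], manifests: List[SkillManifest]) -> List[str]:
    """Keep skills in order, skipping any that conflict with one already kept."""
    by_name = {m.name: m for m in manifests}
    kept: List[str] = []
    for name in selected:
        clash = any(
            name in by_name[k].conflicts_with or k in by_name[name].conflicts_with
            for k in kept
        )
        if not clash:
            kept.append(name)
    return kept
```

With explicit fields like these, ownership becomes a lookup instead of an inference from descriptions and naming, which is the fragile part of the prose-based approach.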
So maybe the useful prototype path is:
First, use an instruction-only router to test whether task-time selection actually improves larger skill catalogs.
Then, if the pattern proves useful, move the fragile prose-based parts into explicit skill manifests and routing traces.
That feels much more maintainable than one giant merged instruction document, while still avoiding the “load every plausible skill” problem.