Built a pytest-style behavioral testing package for agents. It checks tool usage, tool order, step limits, and regression against baselines. Validated on live OpenAI Agents SDK tests
OpenAI Developer Community
April 28, 2026
I built AgentCheck(pygent-test on PyPI), a pytest-style behavioral testing package for AI agents. Instead of checking exact text, it checks behavior: tool usage, tool order, step limits, unsupported success claims, and baseline regressions. I’ve validated it on live OpenAI Agents SDK examples, including a single-tool and multi-tool workflow. I’d love feedback from people testing real agent systems.
Discussion in the ATmosphere