External Publication
Visit Post

Built a pytest-style behavioral testing package for agents. It checks tool usage, tool order, step limits, and regression against baselines. Validated on live OpenAI Agents SDK tests

OpenAI Developer Community April 28, 2026
Source

I built AgentCheck(pygent-test on PyPI), a pytest-style behavioral testing package for AI agents. Instead of checking exact text, it checks behavior: tool usage, tool order, step limits, unsupported success claims, and baseline regressions. I’ve validated it on live OpenAI Agents SDK examples, including a single-tool and multi-tool workflow. I’d love feedback from people testing real agent systems.

Discussion in the ATmosphere

Loading comments...