Is this a world record? GPT-5.5 ran for 19 hours
Thanks for sharing this, Sam. That actually makes sense.
I left the agent running on its own and went to sleep. Judging by the finish time, it worked for almost a full 24 hours.
When I asked it to check how much of the original task was actually completed, it estimated around 65–70%. So I guess it needs another 24 hours.
What you wrote about /goal explains the behavior pretty well. It feels less like a single huge model response and more like a goal-driven runtime that can keep working over a long session.
For my project, though, I’m using GPT-5.5 on extra high reasoning instead of medium. The codebase is around 1.4 million lines now, so I don’t really want to risk it. In these long autonomous runs, one wrong decision can hurt.
About 30% of the project was built with GPT-5.3, 40% with GPT-5.4, and the last 30% with GPT-5.5. At this scale, medium reasoning feels like a risk I don’t need to take.