MODELING AGENT DRIVING BEHAVIOR
DRIVE
April 23, 2026
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for modeling human driving behavior. One of the methods includes continually computing, at each time step, a respective observation deviation value for a current driving policy of an agent. An accumulated observation deviation value is computed by accumulating the observation deviation values computed for each of a plurality of time steps. If the accumulated observation deviation value satisfies a threshold, a different policy is selected for the agent to execute once the threshold is satisfied.
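The accumulate-and-switch loop described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how a deviation value is computed, what "satisfies a threshold" means, or how the different policy is chosen, so everything concrete here is an assumption made for illustration: the name `PolicyMonitor`, treating a policy as a function that predicts the next observation, using the absolute prediction error as the deviation, reading "satisfies" as greater than or equal to, choosing the next policy in round-robin order, and resetting the accumulator after a switch.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# Hypothetical representation: a policy maps the previous observation
# (e.g. speed) to the observation it expects at the next time step.
Policy = Callable[[float], float]


@dataclass
class PolicyMonitor:
    """Illustrative sketch only; not the claimed method."""

    policies: Sequence[Policy]
    threshold: float
    current: int = 0          # index of the policy the agent is executing
    accumulated: float = 0.0  # accumulated observation deviation value

    def step(self, prev_obs: float, obs: float) -> int:
        """Process one time step and return the index of the active policy."""
        # Assumed deviation measure: absolute error between what the
        # current policy predicts and what was actually observed.
        predicted = self.policies[self.current](prev_obs)
        deviation = abs(obs - predicted)

        # Accumulate the per-step deviation values over time steps.
        self.accumulated += deviation

        # Assumed threshold test (>=) and assumed selection rule
        # (round-robin); the accumulator restarts for the new policy.
        if self.accumulated >= self.threshold:
            self.current = (self.current + 1) % len(self.policies)
            self.accumulated = 0.0
        return self.current


def keep_speed(x: float) -> float:
    """Hypothetical policy: the agent holds its current speed."""
    return x


def accelerate(x: float) -> float:
    """Hypothetical policy: the agent gains one unit of speed per step."""
    return x + 1.0
```

With `PolicyMonitor([keep_speed, accelerate], threshold=2.0)`, an agent that holds speed for one step and then speeds up by one unit per step accumulates a deviation of 1.0 on each accelerating step under `keep_speed`, so the monitor switches to `accelerate` on the second such step. Accumulating over several steps, rather than switching on a single large deviation, keeps one noisy observation from triggering a policy change.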