{
"$type": "site.standard.document",
"coverImage": {
"$type": "blob",
"ref": {
"$link": "bafkreidf5qux67wmw6trqke3qnzki77fsz7iqgiwyhbyguw3rqamnn7klu"
},
"mimeType": "image/png",
"size": 122339
},
"description": "Methods relating to the control of autonomous vehicles using a reinforcement learning agent include a plurality of training sessions (110-1, ..., 110-K), in which the agent interacts with an environment, each having a different initial value and yielding a state-action quantile function…",
"path": "/patents/1404680",
"publishedAt": "2025-07-02T00:00:00.000Z",
"site": "at://did:plc:oql6ds5vnff4ugar6rruliwd/site.standard.publication/3mn3ohu7oxx5w",
"tags": [
"B60W60/001",
"VOLVO AUTONOMOUS SOLUTIONS AB [SE]"
],
"textContent": "Methods relating to the control of autonomous vehicles using a reinforcement learning agent include a plurality of training sessions (110-1, ..., 110-K), in which the agent interacts with an environment, each having a different initial value and yielding a state-action quantile function Zk,τsa=FZksa−1τ dependent on state (s) and action (a). The methods further include a first uncertainty estimation (114) on the basis of a variability measure VarτEkZk,τsa, relating to a variability with respect to quantile τ, of an average EkZk,τsa of the plurality of state-action quantile functions evaluated for a state-action pair; and a second uncertainty estimation (116) on the basis of a variability measure VarkEτZk,τsa, relating to an ensemble variability, for the plurality of state-action quantile functions evaluated for a state-action pair. The state-action pair may either correspond to a tentative decision, which is verified before execution, or to possible decisions by the agent to guide additional training.",
"title": "MANAGING ALEATORIC AND EPISTEMIC UNCERTAINTY IN REINFORCEMENT LEARNING, WITH APPLICATIONS TO AUTONOMOUS VEHICLE CONTROL"
}