{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreibv5ane773cv5d4bzqmpfw7ilmjw2elc4wg4rjrqqygto4yea3o34",
    "uri": "at://did:plc:4rgrdigiftglskeax4wvmsev/app.bsky.feed.post/3mlkm7vv4s6w2"
  },
  "coverImage": {
    "$type": "blob",
    "ref": {
      "$link": "bafkreiflo6xt7is6b2iafwghkjahlgggocme5jwjsbeuqqwcywuvjhmszm"
    },
    "mimeType": "image/png",
    "size": 24783
  },
  "path": "/abs/2605.07459v1",
  "publishedAt": "2026-05-11T00:00:00.000Z",
  "site": "https://arxiv.org",
  "tags": [
    "Ali Asadi",
    "Krishnendu Chatterjee",
    "Alipasha Montaseri",
    "Ali Shafiee"
  ],
  "textContent": "**Authors:** Ali Asadi, Krishnendu Chatterjee, Alipasha Montaseri, Ali Shafiee\n\nA basic model in sequential decision making is the Markov decision process (MDP), which is extended to Robust MDPs (RMDPs) by allowing uncertainty in transition probabilities and optimizing against the worst-case transition probabilities from the uncertainty sets. The class of $(s, a)$-rectangular RMDPs with $L_p$ uncertainty sets provides a flexible and expressive model for such problems. We study this class of RMDPs with a discounted-sum cost criterion and a constant discount factor. The existence of an efficient algorithm for this class is a fundamental theoretical question in optimization and sequential decision making. Previous results only establish a strongly polynomial-time algorithm for $L_\\infty$ uncertainty sets. In this work, our main results are as follows: (a)~we show that for any compact uncertainty set, the policy iteration algorithm for RMDPs is strongly polynomial with oracle access to solutions of Robust Markov chains (RMCs); (b)~we present strongly polynomial-time bounds on the policy iteration algorithm for RMCs with $L_1$ and $L_\\infty$ uncertainty sets; and (c)~we establish hardness results for RMCs with $L_p$ uncertainty sets for integer $p$ satisfying $1",
  "title": "On the Complexity of Discounted Robust MDPs with $L_p$ Uncertainty Sets"
}