Strategy Synthesis for Markov Decision Processes and Branching-Time Logics

by Tomáš Brázdil, Vojtěch Forejt, A full version of the paper presented at CONCUR 2007 July 2007, 28 pages.

FIMU-RS-2007-03. Available as Postscript, PDF.

Abstract:

We consider a class of finite $1\frac{1}{2}$-player games (Markov decision processes) where the winning objectives are specified in the branching-time temporal logic qPECTL$^*$ (an extension of the qualitative PCTL$^*$). We study decidability and complexity of existence of a winning strategy in these games. %The logic is more expressive than the qualitative fragment of PCTL$^*$. We identify a fragment of qPECTL$^*$ called detPECTL$^*$ for which the existence of a winning strategy is decidable in exponential time, and also the winning strategy can be computed in exponential time (if it exists). Consequently we show that every formula of qPECTL$^*$ can be translated to a formula of detPECTL$^*$ (in exponential time) so that the resulting formula is equivalent to the original one over finite Markov chains. From this we obtain that for the whole qPECTL$^*$, the existence of a winning finite-memory strategy is decidable in double exponential time. An immediate consequence is that the existence of a winning finite-memory strategy is decidable for the qualitative fragment of PCTL$^*$ in triple exponential time. We also obtain a single exponential upper bound on the same problem for the qualitative PCTL. Finally, we study the power of finite-memory strategies with respect to objectives described in the qualitative PCTL.