Xianping Guo (郭先平)

Journal article

OPTIMAL CONTROL OF ERGODIC CONTINUOUS-TIME MARKOV CHAINS WITH AVERAGE SAMPLE-PATH REWARDS

XIANPING GUO (郭先平) AND XI-REN CAO

SIAM J. CONTROL OPTIM., Vol. 44, No. 1, pp. 29-48

Abstract

In this paper we study continuous-time Markov decision processes with the average sample-path reward (ASPR) criterion and possibly unbounded transition and reward rates. We propose conditions on the system's primitive data for the existence of ε-ASPR-optimal (deterministic) stationary policies in a class of randomized Markov policies satisfying some additional continuity assumptions. The proof of this fact is based on the time discretization technique, the martingale stability theory, and the concept of potential. We also provide both policy and value iteration algorithms for computing, or at least approximating, the ε-ASPR-optimal stationary policies. We illustrate with examples our main results as well as the difference between the ASPR and the average expected reward criteria.

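[Disclaimer] All content on this page was uploaded by [郭先平] on [October 12, 2006, 02:14:49]; copyright belongs to the original author. This article represents only the author's own views and has no connection with this website. This website remains neutral toward the statements and opinions in the article and provides no express or implied warranty as to the accuracy, reliability, or completeness of its content. Readers should treat it as reference only and assume full responsibility themselves.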
