OPTIMAL CONTROL OF ERGODIC CONTINUOUS-TIME MARKOV CHAINS WITH AVERAGE SAMPLE-PATH REWARDS



Journal article


Xianping Guo (郭先平) and Xi-Ren Cao

SIAM J. Control Optim., Vol. 44, No. 1, pp. 29-48


Abstract

In this paper we study continuous-time Markov decision processes with the average sample-path reward (ASPR) criterion and possibly unbounded transition and reward rates. We propose conditions on the system's primitive data for the existence of ε-ASPR-optimal (deterministic) stationary policies in a class of randomized Markov policies satisfying some additional continuity assumptions. The proof of this fact is based on the time discretization technique, the martingale stability theory, and the concept of potential. We also provide both policy and value iteration algorithms for computing, or at least approximating, the ε-ASPR-optimal stationary policies. We illustrate with examples our main results as well as the difference between the ASPR and the average expected reward criteria.

Keywords: average sample-path reward, continuous-time Markov chain, optimal stationary policy, policy and value iteration algorithms


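[Disclaimer] All content below was uploaded by [郭先平] on [October 12, 2006, 02:14:49]; copyright belongs to the original author. This article represents only the author's own views and is not affiliated with this website. This website remains neutral toward the statements and opinions in the article and gives no express or implied warranty as to the accuracy, reliability, or completeness of its content. Readers should treat it as reference material only and assume full responsibility for its use.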
