
Xianping Guo (郭先平)


Journal article

A unified approach to Markov decision problems and performance sensitivity analysis with discounted and average criteria: multichain cases

Xi-Ren Cao, Xianping Guo

Automatica 40 (2004) 1749-1759


Abstract

We propose a unified framework to Markov decision problems and performance sensitivity analysis for multichain Markov processes with both discounted and average-cost performance criteria. With the fundamental concept of performance potentials, we derive both performance-gradient and performance-difference formulas, which play the central role in performance optimization. The standard policy iteration algorithms for both discounted- and average-reward MDPs can be established using the performance-difference formulas in a simple and intuitive way; and the performance-gradient formulas together with stochastic approximation may lead to new optimization schemes. This sensitivity-based point of view of performance optimization provides some insights that link perturbation analysis, Markov decision processes, and reinforcement learning together. The research is an extension of the previous work on ergodic Markov chains (Cao, Automatica 36 (2000) 771).
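To make the role of performance potentials concrete, the following is a minimal sketch, not the paper's own algorithm. It covers only the simpler ergodic, average-reward setting of the earlier work cited in the abstract, not the multichain case. It assumes the commonly stated form of the performance-difference formula, eta' - eta = pi' [(P' - P) g + (f' - f)], where g solves the Poisson equation (I - P) g + eta e = f. All function names and the example matrices are hypothetical and chosen only for illustration.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            k = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= k * M[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][j] * x[j] for j in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def stationary(P):
    """Stationary distribution pi of an ergodic chain: pi P = pi, sum(pi) = 1."""
    n = len(P)
    # Row i encodes pi_i - sum_j pi_j P[j][i] = 0; the last row is
    # replaced by the normalization constraint sum(pi) = 1.
    A = [[(1.0 if i == j else 0.0) - P[j][i] for j in range(n)]
         for i in range(n)]
    A[-1] = [1.0] * n
    return solve(A, [0.0] * (n - 1) + [1.0])


def average_reward(P, f):
    """Long-run average reward eta = pi f."""
    return sum(p * r for p, r in zip(stationary(P), f))


def potentials(P, f):
    """Performance potentials g from (I - P + e pi) g = f (assumed normalization)."""
    n = len(P)
    pi = stationary(P)
    A = [[(1.0 if i == j else 0.0) - P[i][j] + pi[j] for j in range(n)]
         for i in range(n)]
    return solve(A, f)


def difference_formula(P, f, P2, f2):
    """Evaluate pi' [(P' - P) g + (f' - f)], with g computed under (P, f)."""
    n = len(P)
    g = potentials(P, f)
    pi2 = stationary(P2)
    total = 0.0
    for i in range(n):
        drift = sum((P2[i][j] - P[i][j]) * g[j] for j in range(n))
        total += pi2[i] * (drift + f2[i] - f[i])
    return total


# Hypothetical three-state example: two policies (P, f) and (P2, f2).
P = [[0.5, 0.3, 0.2], [0.2, 0.6, 0.2], [0.1, 0.4, 0.5]]
f = [1.0, 2.0, 0.0]
P2 = [[0.2, 0.5, 0.3], [0.3, 0.3, 0.4], [0.4, 0.1, 0.5]]
f2 = [1.5, 1.0, 0.5]

direct = average_reward(P2, f2) - average_reward(P, f)
via_potentials = difference_formula(P, f, P2, f2)
```

Under these assumptions, `direct` and `via_potentials` should agree up to floating-point error. Because pi' is nonnegative, any policy whose bracketed term (P' - P) g + (f' - f) is nonnegative in every state cannot lower the average reward, which is the intuition behind a policy iteration step in the ergodic case.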

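[Disclaimer] All of the following content was uploaded by [Xianping Guo] on [October 12, 2006, 02:16:46]; copyright belongs to the original author. This article represents only the author's own views and has no connection with this website. This website remains neutral toward the statements and opinions expressed in it and provides no express or implied warranty as to the accuracy, reliability, or completeness of the content. Readers should treat it as reference only and assume full responsibility themselves.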
