Abstract:
This paper considers the variance optimization problem for the average reward in continuous-time Markov decision processes (MDPs). The state space is assumed to be countable and the action space a Borel measurable space. The main goal is to find the policy with minimal variance within the space of deterministic stationary policies. Unlike in a traditional Markov decision process, the cost function under the variance criterion is affected by future actions. To address this, we convert the variance minimization problem into a standard MDP by introducing a concept called pseudo-variance. Further, by giving a policy iteration algorithm for the pseudo-variance optimization problem, we derive the optimal policy of the original variance optimization problem and give a sufficient condition for a variance-optimal policy. Finally, an example illustrates the conclusions of the paper.
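The pseudo-variance idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the paper's definition, so this sketch assumes a common reading: for a fixed reference value λ, the pseudo-variance of a policy is the steady-state mean of (r − λ)², which equals the variance plus (η − λ)², where η is the policy's average reward. With λ fixed, the cost (r − λ)² no longer depends on the policy, so ordinary average-cost policy iteration applies. The sketch also restricts itself to finite state and action sets with an irreducible chain under every policy (the paper allows a countable state space and a Borel action space). All function names and the numbers in `RATES` and `REWARD` are hypothetical.

```python
# Hypothetical finite CTMDP sketch; all names and numbers are illustrative.

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [[float(v) for v in row] + [float(bi)] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def q_row(rates, i, a):
    """Generator row for state i under action a (diagonal = -sum of the rest)."""
    row = [float(v) for v in rates[i][a]]
    row[i] = -sum(v for j, v in enumerate(row) if j != i)
    return row

def generator(rates, policy):
    """Generator matrix of the chain induced by a deterministic stationary policy."""
    return [q_row(rates, i, a) for i, a in enumerate(policy)]

def stationary(Q):
    """Solve pi Q = 0 with sum(pi) = 1 (assumes an irreducible chain)."""
    n = len(Q)
    A = [[Q[i][j] for i in range(n)] for j in range(n)]  # transpose of Q
    A[-1] = [1.0] * n  # replace one redundant equation by the normalization
    return solve(A, [0.0] * (n - 1) + [1.0])

def mean_and_variance(rates, reward, policy):
    """Steady-state mean and variance of the reward rate under a policy."""
    pi = stationary(generator(rates, policy))
    r = [reward[i][a] for i, a in enumerate(policy)]
    eta = sum(p * x for p, x in zip(pi, r))
    var = sum(p * (x - eta) ** 2 for p, x in zip(pi, r))
    return eta, var

def policy_iteration(rates, cost, policy):
    """Average-cost policy iteration (minimization) for a finite CTMDP."""
    n = len(rates)
    policy = list(policy)
    while True:
        # Evaluation: solve cost_d - g + Q_d h = 0 with h[0] = 0 for (g, h).
        Q = generator(rates, policy)
        A = [[-1.0] + Q[i][1:] for i in range(n)]
        x = solve(A, [-cost[i][policy[i]] for i in range(n)])
        h = [0.0] + x[1:]
        # Improvement: minimize cost(i, a) + sum_j q(j | i, a) h(j) per state.
        new = []
        for i in range(n):
            def score(a):
                return cost[i][a] + sum(q * hj for q, hj in zip(q_row(rates, i, a), h))
            best = min(range(len(rates[i])), key=score)
            # Switch only on strict improvement, so ties cannot cause cycling.
            new.append(best if score(best) < score(policy[i]) - 1e-12 else policy[i])
        if new == policy:
            return policy
        policy = new

def minimize_variance(rates, reward, policy):
    """Alternate: fix lambda at the current mean, then minimize pseudo-variance."""
    policy = list(policy)
    while True:
        eta, var = mean_and_variance(rates, reward, policy)
        cost = [[(r - eta) ** 2 for r in row] for row in reward]  # lambda = eta
        new = policy_iteration(rates, cost, policy)
        if new == policy:
            return policy, eta, var
        policy = new

# RATES[i][a][j]: transition rate from state i to j under action a (diagonal ignored).
RATES = [
    [[0, 1, 1], [0, 3, 0.5]],
    [[2, 0, 1], [0.5, 0, 2]],
    [[1, 1, 0], [4, 0.5, 0]],
]
# REWARD[i][a]: reward rate in state i under action a.
REWARD = [[1, 2], [3, 2.5], [6, 4]]

if __name__ == "__main__":
    print(minimize_variance(RATES, REWARD, [0, 0, 0]))
```

Under these assumptions each outer step cannot increase the variance: the new policy's variance is at most its pseudo-variance at the old mean, which is at most the old policy's variance. The loop stops at a policy that cannot be improved this way; in general that is a fixed point of the procedure rather than a guaranteed global minimizer, and the sufficient condition mentioned in the abstract is not reproduced here.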
Document information
Title: Variance Optimization for Continuous-Time Markov Decision Processes
Source journal: Journal of Statistics (English)
Discipline: Mathematics
Keywords: continuous-time Markov decision process; variance optimality of average reward; optimal policy of variance; policy iteration
Year, volume (issue): 2019, (2)
Page range: 181-195
Pages: 15
Classification number: O1
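Journal impact
Journal: Journal of Statistics (English)
Frequency: semimonthly
ISSN: 2161-718X
Address: Optics Valley Headquarters Space, 38 Tangxunhu North Road, Jiangxia District, Wuhan
Articles published: 584
Total downloads: 0
Total citations: 0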