Abstract:
As demand grows for rescue robots that can operate in disasters such as earthquakes and tsunamis, there is an urgent need for robotics software that can learn and adapt to any environment. A reinforcement learning (RL) system that improves an agent’s policy in dynamic environments by using a mixture model of Bayesian networks has been proposed and is effective at adapting quickly to a changing environment. However, its increased computational complexity requires a high-performance computer for simulated experiments, so when calculation resources are limited the computational complexity must be controlled. In this study, we used the profit-sharing method of RL for the agent to learn its policy, and introduced a mixture probability into the RL system so that it recognizes changes in the environment and improves the agent’s policy accordingly. We also introduced a clustering distribution that enables a smaller, suitable selection of mixture probability elements while maintaining their variety, in order to reduce the computational complexity while maintaining the system’s performance. With the proposed system, the agent successfully learned its policy and adjusted efficiently to the changing environment. Control of the computational complexity was also effective, and the proposed system limited the decline in the effectiveness of policy improvement.
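The abstract names profit sharing as the learning method but gives no details. The following is a minimal sketch of the general profit-sharing idea only, not the authors' implementation: when an episode ends with a reward, credit is distributed backward over the (state, action) pairs that led to it. The function names, the geometric `decay` ratio, and the weight-proportional action selection are all illustrative assumptions, and the mixture probability and clustering distribution described above are not modeled here.

```python
from collections import defaultdict


def profit_sharing_update(weights, episode, reward, decay=0.5):
    """Distribute `reward` backward over one episode's (state, action) pairs.

    Assumption: credit shrinks geometrically by `decay` for each step
    further from the reward. The paper's actual credit function is not
    given in the abstract.
    """
    credit = reward
    for state, action in reversed(episode):
        weights[(state, action)] += credit
        credit *= decay
    return weights


def action_probabilities(weights, state, actions):
    """Select actions in proportion to their accumulated weights.

    Falls back to a uniform distribution for states with no credit yet.
    """
    total = sum(weights[(state, a)] for a in actions)
    if total == 0:
        return {a: 1.0 / len(actions) for a in actions}
    return {a: weights[(state, a)] / total for a in actions}


# Hypothetical two-step episode that ends with a reward of 1.0.
weights = defaultdict(float)
episode = [("s0", "right"), ("s1", "right")]
profit_sharing_update(weights, episode, reward=1.0, decay=0.5)
probs = action_probabilities(weights, "s0", ["left", "right"])
```

In this sketch the step closest to the reward receives the full credit and earlier steps receive progressively less, so rules that reliably lead to reward accumulate weight across episodes.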
Document information
Title: A Policy-Improving System for Adaptability to Dynamic Environments Using Mixture Probability and Clustering Distribution
Source journal: 电脑和通信(英文)
Subject: Medicine
Keywords: Reinforcement Learning; Profit-Sharing Method; Mixture Probability; Clustering
Year, volume (issue): 2014, (4)
Page range: 210-219
Pages: 10
Classification number: R73
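Journal information
Journal: 电脑和通信(英文)
Frequency: Monthly
ISSN: 2327-5219
Address: Optics Valley Headquarters Space, 38 Tangxunhu North Road, Jiangxia District, Wuhan
Published articles: 783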