Voronoi Progressive Widening: Efficient Online Solvers for Continuous State, Action, and Observation POMDPs

This paper introduces Voronoi Progressive Widening (VPW), a generalization of Voronoi optimistic optimization (VOO) and action progressive widening to partially observable Markov decision processes (POMDPs). Tree search algorithms can use VPW to effectively handle continuous or hybrid action spaces by efficiently balancing local and global action searching. This paper proposes two VPW-based algorithms and analyzes them from theoretical and simulation perspectives. Voronoi Optimistic Weighted Sparse Sampling (VOWSS) is a theoretical tool that justifies VPW-based online solvers, and it is the first algorithm with global convergence guarantees for continuous state, action, and observation POMDPs. Voronoi Optimistic Monte Carlo Planning with Observation Weighting (VOMCPOW) is a versatile and efficient algorithm that consistently outperforms state-of-the-art POMDP algorithms in several simulation experiments.
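To make the balance between local and global action search more concrete, here is a minimal sketch of how a tree-search node might propose new continuous actions. It is not the authors' implementation. It assumes the commonly described form of progressive widening (a new action is allowed while the number of child actions is at most `k * N**alpha`) and of Voronoi optimistic optimization (with probability `omega` sample the whole action space, otherwise sample inside the Voronoi cell of the best action found so far). The function names, the box-shaped action space, the rejection-sampling strategy, and all parameter values are illustrative assumptions.

```python
import math
import random


def should_widen(num_children, num_visits, k=1.0, alpha=0.5):
    """Assumed progressive-widening test: allow a new child action
    while the child count is at most k * N^alpha."""
    return num_children <= k * num_visits ** alpha


def sample_voronoi_action(actions, values, low, high,
                          omega=0.3, max_tries=1000, rng=random):
    """Propose a new action inside the box [low, high] (hypothetical helper).

    actions: list of existing action tuples at this node
    values:  list of value estimates, one per existing action
    omega:   probability of a global (uniform) sample
    """
    def uniform():
        return tuple(rng.uniform(lo, hi) for lo, hi in zip(low, high))

    # Global search: no actions yet, or with probability omega.
    if not actions or rng.random() < omega:
        return uniform()

    # Local search: rejection-sample a point whose nearest existing
    # action is the current best, i.e. a point in its Voronoi cell.
    best = max(range(len(actions)), key=lambda i: values[i])
    candidate = uniform()
    for _ in range(max_tries):
        nearest = min(range(len(actions)),
                      key=lambda i: math.dist(candidate, actions[i]))
        if nearest == best:
            return candidate
        candidate = uniform()
    # Fallback if the cell is too small to hit: return a uniform sample.
    return candidate


if __name__ == "__main__":
    rng = random.Random(0)
    actions, values = [], []
    for visit in range(1, 50):
        if should_widen(len(actions), visit):
            a = sample_voronoi_action(actions, values, (0.0,), (1.0,), rng=rng)
            actions.append(a)
            # Toy value estimate that peaks at 0.7 (illustration only).
            values.append(-abs(a[0] - 0.7))
    print(len(actions), max(zip(values, actions))[1])
```

In this sketch a larger `omega` favours exploration of the whole action space, while a smaller one concentrates new samples around the best action seen so far. Rejection sampling is the simplest way to draw from a Voronoi cell but becomes slow when the cell is small, which is why the loop is capped by `max_tries`.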

