Inverse Reinforcement Learning in the Continuous Setting with Formal Guarantees

Inverse Reinforcement Learning (IRL) is the problem of finding a reward function which describes observed/known expert behavior. IRL is useful for automated control in situations where the reward function is difficult to specify manually, which impedes reinforcement learning. We provide a new IRL algorithm for the continuous state space setting with unknown transition dynamics by modeling the system using a basis of orthonormal functions. We provide a proof of correctness and formal guarantees on the sample and time complexity of our algorithm.
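The abstract does not say which orthonormal basis the algorithm uses or how its coefficients are estimated. As a rough illustration of the general idea only, and not the paper's method, the sketch below expands a function on [0, 1] in the cosine basis and estimates the coefficients from uniform samples. The basis choice, the `target` function, and all function names are assumptions made for this example.

```python
import math
import random


def phi(k, x):
    """k-th cosine basis function on [0, 1]; the family is orthonormal in L2([0, 1])."""
    return 1.0 if k == 0 else math.sqrt(2.0) * math.cos(k * math.pi * x)


def inner_product(k, j, n_grid=2000):
    """Midpoint-rule approximation of the L2 inner product <phi_k, phi_j> on [0, 1]."""
    h = 1.0 / n_grid
    return h * sum(phi(k, (i + 0.5) * h) * phi(j, (i + 0.5) * h) for i in range(n_grid))


def estimate_coefficients(f, n_basis, n_samples, rng):
    """Monte Carlo estimate of c_k = E[f(X) * phi_k(X)] with X ~ Uniform(0, 1)."""
    xs = [rng.random() for _ in range(n_samples)]
    return [sum(f(x) * phi(k, x) for x in xs) / n_samples for k in range(n_basis)]


def reconstruct(coeffs, x):
    """Evaluate the truncated expansion sum_k c_k * phi_k(x)."""
    return sum(c * phi(k, x) for k, c in enumerate(coeffs))


def target(x):
    """Hypothetical smooth function standing in for an unknown quantity (assumption)."""
    return x * (1.0 - x)


rng = random.Random(0)  # fixed seed so the run is repeatable
coeffs = estimate_coefficients(target, n_basis=8, n_samples=20000, rng=rng)

# Largest pointwise gap between the truncated expansion and the target on a grid.
max_err = max(abs(reconstruct(coeffs, i / 100) - target(i / 100)) for i in range(101))
print(f"max reconstruction error with 8 basis functions: {max_err:.4f}")
```

The error has two sources: truncating the series after a fixed number of basis functions, and sampling noise in each coefficient. Sample-complexity guarantees of the kind the abstract mentions typically concern how many samples keep the second source small.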

