信仰空间斯托克动态运动会 (Stochastic Dynamic Games in Belief Space) - 专知论文

会员服务 ·

0

INFORMS · 线性的 · Continuity · INTERACT · 博弈论 ·

2021 年 5 月 13 日

Stochastic Dynamic Games in Belief Space

翻译：信仰空间斯托克动态运动会

Wilko Schwarting,Alyssa Pierson,Sertac Karaman,Daniela Rus

from arxiv, Accepted in IEEE Transactions on Robotics (T-RO) 2021

Information gathering while interacting with other agents under sensing and motion uncertainty is critical in domains such as driving, service robots, racing, or surveillance. The interests of agents may be at odds with others, resulting in a stochastic non-cooperative dynamic game. Agents must predict others' future actions without communication, incorporate their actions into these predictions, account for uncertainty and noise in information gathering, and consider what information their actions reveal. Our solution uses local iterative dynamic programming in Gaussian belief space to solve a game-theoretic continuous POMDP. Solving a quadratic game in the backward pass of a game-theoretic belief-space variant of iLQG achieves a runtime polynomial in the number of agents and linear in the planning horizon. Our algorithm yields linear feedback policies for our robot, and predicted feedback policies for other agents. We present three applications: active surveillance, guiding eyes for a blind agent, and autonomous racing. Agents with game-theoretic belief-space planning win 44% more races than without game theory and 34% more than without belief-space planning.

翻译：在驾驶、服务机器人、赛跑或监视等领域,在与感知和运动不确定的其他物剂进行互动时收集信息至关重要。代理人的利益可能与他人不相符合, 从而形成一种随机不合作的动态游戏。代理人必须预测他人未来的行动而不进行交流, 将他们的行动纳入这些预测, 在信息收集中考虑到不确定性和噪音, 并考虑他们的行动所披露的信息。我们的解决方案在高西亚信仰空间使用本地迭代动态程序来解决游戏理论连续POMDP。在iLQG游戏理论信仰空间变异的后端解决一个二次游戏, 在规划范围内的代理人数量和线性方面达到一个运行时的多时段。我们的算法为我们的机器人提供线性反馈政策, 并预测其他代理人的反馈政策。我们提出三种应用: 积极的监视, 引导盲人代理人的眼睛, 和自主赛车。游戏理论信仰空间规划的代理人赢得的比赛比没有游戏理论的比赛多44%, 比没有信仰空间规划的要多34%。

1

相关内容

INFORMS

《计算机信息》杂志发表高质量的论文，扩大了运筹学和计算的范围，寻求有关理论、方法、实验、系统和应用方面的原创研究论文、新颖的调查和教程论文，以及描述新的和有用的软件工具的论文。官网链接：https://pubsonline.informs.org/journal/ijoc

【经典书】线性代数，436页pdf

专知会员服务

77+阅读 · 2021年3月16日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

商业数据分析，39页ppt

商业数据分析，39页ppt

专知会员服务

165+阅读 · 2020年6月2日

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

专知会员服务

111+阅读 · 2020年5月15日

深度强化学习策略梯度教程，53页ppt

深度强化学习策略梯度教程，53页ppt

专知会员服务

184+阅读 · 2020年2月1日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

181+阅读 · 2019年10月11日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

MIT新书《强化学习与最优控制》

MIT新书《强化学习与最优控制》

专知会员服务

280+阅读 · 2019年10月9日

量化金融强化学习论文集合

量化金融强化学习论文集合

专知

14+阅读 · 2019年12月18日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

19篇ICML2019论文摘录选读！

19篇ICML2019论文摘录选读！

专知

28+阅读 · 2019年4月28日

RL 真经

CreateAMind

5+阅读 · 2018年12月28日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

人工智能 | AAAI 2019等国际会议信息7条

人工智能 | AAAI 2019等国际会议信息7条

Call4Papers

5+阅读 · 2018年9月3日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

分布式TensorFlow入门指南

分布式TensorFlow入门指南

机器学习研究会

4+阅读 · 2017年11月28日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

The Last-Iterate Convergence Rate of Optimistic Mirror Descent in Stochastic Variational Inequalities

Arxiv

0+阅读 · 2021年7月5日

Bayesian Learning-Based Adaptive Control for Safety Critical Systems

Arxiv

0+阅读 · 2021年7月4日

Learning in nonatomic games, Part I: Finite action spaces and population games

Arxiv

0+阅读 · 2021年7月4日

Modularity in Reinforcement Learning via Algorithmic Independence in Credit Assignment

Arxiv

0+阅读 · 2021年7月3日

Reinforcement Learning Provides a Flexible Approach for Realistic Supply Chain Safety Stock Optimisation

Arxiv

0+阅读 · 2021年7月2日

Gap-Dependent Bounds for Two-Player Markov Games

Arxiv

0+阅读 · 2021年7月1日

Scalar conservation laws with stochastic discontinuous flux function

Arxiv

0+阅读 · 2021年7月1日

Recent advances in deep learning theory

Recent advances in deep learning theory

Arxiv

50+阅读 · 2020年12月20日

Self-Consistent Trajectory Autoencoder: Hierarchical Reinforcement Learning with Trajectory Embeddings

Arxiv

6+阅读 · 2018年6月7日

Parameter Space Noise for Exploration

Arxiv

3+阅读 · 2018年1月31日

VIP会员

文章信息

相关主题

相关VIP内容

【经典书】线性代数，436页pdf

专知会员服务

77+阅读 · 2021年3月16日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

商业数据分析，39页ppt

商业数据分析，39页ppt

专知会员服务

165+阅读 · 2020年6月2日

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

专知会员服务

111+阅读 · 2020年5月15日

深度强化学习策略梯度教程，53页ppt

深度强化学习策略梯度教程，53页ppt

专知会员服务

184+阅读 · 2020年2月1日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

181+阅读 · 2019年10月11日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

MIT新书《强化学习与最优控制》

MIT新书《强化学习与最优控制》

专知会员服务

280+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《人工智能绝不能完全自主》

《人工智能的法律与伦理：军事自主机器独特挑战的深度剖析》316页

从数据到主导：AI与兵棋推演构筑决策优势

《特洛伊木马货柜：武器化集装箱的战略威胁》最新报告

相关资讯

量化金融强化学习论文集合

量化金融强化学习论文集合

专知

14+阅读 · 2019年12月18日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

19篇ICML2019论文摘录选读！

19篇ICML2019论文摘录选读！

专知

28+阅读 · 2019年4月28日

RL 真经

CreateAMind

5+阅读 · 2018年12月28日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

人工智能 | AAAI 2019等国际会议信息7条

人工智能 | AAAI 2019等国际会议信息7条

Call4Papers

5+阅读 · 2018年9月3日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

分布式TensorFlow入门指南

分布式TensorFlow入门指南

机器学习研究会

4+阅读 · 2017年11月28日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

相关论文

The Last-Iterate Convergence Rate of Optimistic Mirror Descent in Stochastic Variational Inequalities

Arxiv

0+阅读 · 2021年7月5日

Bayesian Learning-Based Adaptive Control for Safety Critical Systems

Arxiv

0+阅读 · 2021年7月4日

Learning in nonatomic games, Part I: Finite action spaces and population games

Arxiv

0+阅读 · 2021年7月4日

Modularity in Reinforcement Learning via Algorithmic Independence in Credit Assignment

Arxiv

0+阅读 · 2021年7月3日

Reinforcement Learning Provides a Flexible Approach for Realistic Supply Chain Safety Stock Optimisation

Arxiv

0+阅读 · 2021年7月2日

Gap-Dependent Bounds for Two-Player Markov Games

Arxiv

0+阅读 · 2021年7月1日

Scalar conservation laws with stochastic discontinuous flux function

Arxiv

0+阅读 · 2021年7月1日

Recent advances in deep learning theory

Recent advances in deep learning theory

Arxiv

50+阅读 · 2020年12月20日

Self-Consistent Trajectory Autoencoder: Hierarchical Reinforcement Learning with Trajectory Embeddings

Arxiv

6+阅读 · 2018年6月7日

Parameter Space Noise for Exploration

Arxiv

3+阅读 · 2018年1月31日

微信扫码咨询专知VIP会员