Semi-Supervised Imitation Learning of Team Policies from Suboptimal Demonstrations - 专知论文


May 11, 2022

Semi-Supervised Imitation Learning of Team Policies from Suboptimal Demonstrations

Sangwon Seo, Vaibhav V. Unhelkar

From arXiv. Extended version of an identically titled paper accepted at IJCAI 2022.

We present Bayesian Team Imitation Learner (BTIL), an imitation learning algorithm to model behavior of teams performing sequential tasks in Markovian domains. In contrast to existing multi-agent imitation learning techniques, BTIL explicitly models and infers the time-varying mental states of team members, thereby enabling learning of decentralized team policies from demonstrations of suboptimal teamwork. Further, to allow for sample- and label-efficient policy learning from small datasets, BTIL employs a Bayesian perspective and is capable of learning from semi-supervised demonstrations. We demonstrate and benchmark the performance of BTIL on synthetic multi-agent tasks as well as a novel dataset of human-agent teamwork. Our experiments show that BTIL can successfully learn team policies from demonstrations despite the influence of team members' (time-varying and potentially misaligned) mental states on their behavior.

