PlaTe: Visually-Grounded Planning with Transformers in Procedural Tasks

September 10, 2021

Jiankai Sun, De-An Huang, Bo Lu, Yun-Hui Liu, Bolei Zhou, Animesh Garg

In this work, we study the problem of how to leverage instructional videos to facilitate the understanding of human decision-making processes, focusing on training a model with the ability to plan a goal-directed procedure from real-world videos. Learning structured and plannable state and action spaces directly from unstructured videos is the key technical challenge of our task. There are two problems: first, the appearance gap between the training and validation datasets could be large for unstructured videos; second, these gaps lead to decision errors that compound over the steps. We address these limitations with Planning Transformer (PlaTe), which has the advantage of circumventing the compounding prediction errors that occur with single-step models during long model-based rollouts. Our method simultaneously learns the latent state and action information of assigned tasks and the representations of the decision-making process from human demonstrations. Experiments conducted on real-world instructional videos and an interactive environment show that our method can achieve a better performance in reaching the indicated goal than previous algorithms. We also validated the possibility of applying procedural tasks on a UR-5 platform.
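The compounding-error problem mentioned in the abstract can be illustrated with a toy rollout. The sketch below is not the PlaTe model: it assumes a hypothetical scalar system whose true dynamics multiply the state by 1.05 per step, and a learned single-step model that is slightly off at 1.07. All names and constants are illustrative assumptions. Feeding each prediction back into the model shows how a small one-step error grows over a long rollout.

```python
# Illustrative toy only: scalar linear dynamics, not the PlaTe model.
TRUE_GAIN = 1.05   # assumed "real" dynamics: x_{t+1} = 1.05 * x_t
MODEL_GAIN = 1.07  # assumed learned single-step model with a small bias


def rollout(gain, x0, horizon):
    """Apply a one-step model repeatedly, feeding each prediction back in."""
    states = [x0]
    for _ in range(horizon):
        states.append(gain * states[-1])
    return states


def rollout_errors(x0=1.0, horizon=10):
    """Absolute gap between the model rollout and the true trajectory, per step."""
    truth = rollout(TRUE_GAIN, x0, horizon)
    predicted = rollout(MODEL_GAIN, x0, horizon)
    return [abs(p - t) for p, t in zip(predicted, truth)]


if __name__ == "__main__":
    for step, err in enumerate(rollout_errors()):
        print(f"step {step:2d}: error = {err:.4f}")
```

In this toy setting the error is zero at the start, small after one step, and strictly larger at every later step, which is the failure mode the abstract attributes to single-step models during long model-based rollouts.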
