Behavioral Cloning via Search in Video PreTraining Latent Space - Zhuanzhi (专知) Papers


Search · Approximation · Demonstrations · State representation · Control problem

April 17, 2023

Behavioral Cloning via Search in Video PreTraining Latent Space


Federico Malato, Florian Leopold, Amogh Raut, Ville Hautamäki, Andrew Melnik

Our aim is to build autonomous agents that can solve tasks in environments like Minecraft. To do so, we use an imitation learning-based approach. We formulate the control problem as a search problem over a dataset of expert demonstrations, in which the agent copies actions from a similar demonstration trajectory of image-action pairs. We perform a proximity search over the BASALT MineRL dataset in the latent representation of a Video PreTraining model. The agent copies the actions from the selected expert trajectory as long as the state representations of the agent and of that trajectory do not diverge; when they do, the proximity search is repeated. Our approach can effectively recover meaningful demonstration trajectories and shows human-like agent behavior in the Minecraft environment.
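The search-and-copy loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The names `encode`, `env_step`, `nearest_index`, `run_search_agent`, and `max_dist` are hypothetical, and both the Euclidean distance and the fixed divergence threshold are assumptions, since the abstract does not specify either. In the paper the encoder would be the Video PreTraining model; here `encode` is a placeholder supplied by the caller.

```python
import math


def nearest_index(query, latents):
    """Return (index, distance) of the expert latent closest to `query`.

    Assumption: closeness is measured with plain Euclidean (L2) distance.
    """
    dists = [math.dist(query, z) for z in latents]
    idx = min(range(len(dists)), key=dists.__getitem__)
    return idx, dists[idx]


def run_search_agent(encode, env_step, obs, latents, actions, traj_ids,
                     n_steps, max_dist):
    """Copy expert actions, repeating the search when the agent diverges.

    encode(obs) -> latent vector (hypothetical stand-in for the encoder)
    env_step(obs, action) -> next observation (hypothetical environment)
    latents[i], actions[i], traj_ids[i] describe expert frame i, with the
    frames of each trajectory stored contiguously and in order.
    Returns (actions taken, number of searches performed).
    """
    taken, searches = [], 0
    ptr = None  # index of the expert frame currently being followed
    for _ in range(n_steps):
        z = encode(obs)
        # Search at the start, at the end of a trajectory, or on divergence.
        if ptr is None or math.dist(latents[ptr], z) > max_dist:
            ptr, _ = nearest_index(z, latents)
            searches += 1
        action = actions[ptr]
        taken.append(action)
        obs = env_step(obs, action)
        # Advance within the same expert trajectory; otherwise force a search.
        nxt = ptr + 1
        same_traj = nxt < len(latents) and traj_ids[nxt] == traj_ids[ptr]
        ptr = nxt if same_traj else None
    return taken, searches


if __name__ == "__main__":
    # Toy 1-D world: two expert trajectories, one moving right, one moving left.
    latents = [(0.0,), (1.0,), (2.0,), (3.0,), (10.0,), (9.0,), (8.0,)]
    actions = [1, 1, 1, 1, -1, -1, -1]
    traj_ids = [0, 0, 0, 0, 1, 1, 1]
    taken, searches = run_search_agent(
        encode=lambda o: (o,),
        env_step=lambda o, a: o + a,
        obs=0.0, latents=latents, actions=actions, traj_ids=traj_ids,
        n_steps=3, max_dist=0.5,
    )
    print(taken, searches)
```

In this toy setting the agent tracks the first expert trajectory exactly, so a single search is enough. If the environment dynamics differ from the demonstrations, the latent distance exceeds `max_dist` and a new search is triggered at each step where that happens.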

