Combining Learning from Human Feedback and Knowledge Engineering to Solve Hierarchical Tasks in Minecraft - 专知 (Zhuanzhi) Papers


May 11, 2022


Vinicius G. Goecks, Nicholas Waytowich, David Watkins-Valls, Bharat Prakash

From arXiv. Submitted to the AAAI 2022 Spring Symposium on Machine Learning and Knowledge Engineering for Hybrid Intelligence (AAAI-MAKE 2022).

Real-world tasks of interest are generally poorly defined by human-readable descriptions and have no pre-defined reward signal unless a human designer defines one. Conversely, data-driven algorithms are often designed to solve a specific, narrowly defined task with performance metrics that drive the agent's learning. In this work, we present the solution that won first place and was awarded the most human-like agent in the 2021 NeurIPS Competition MineRL BASALT Challenge: Learning from Human Feedback in Minecraft, which challenged participants to use human data to solve four tasks defined only by a natural language description and no reward function. Our approach uses the available human demonstration data to train an imitation learning policy for navigation and additional human feedback to train an image classifier. These modules, combined with an estimated odometry map, form a powerful state machine designed to utilize human knowledge in a natural hierarchical paradigm. We compare this hybrid intelligence approach to both end-to-end machine learning and purely engineered solutions, which are then judged by human evaluators. The codebase is available at https://github.com/viniciusguigo/kairos_minerl_basalt.
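The abstract does not spell out how the modules are wired together, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a learned navigation policy and an image classifier are available as plain callables, and that odometry is estimated by simple dead reckoning from the commanded motion. All names here (`HybridAgent`, `Odometry`, `Mode`, the action strings, the `threshold` value) are hypothetical and chosen for illustration.

```python
import math
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Callable, List, Tuple


class Mode(Enum):
    NAVIGATE = auto()         # learned policy explores the world
    EXECUTE_SUBTASK = auto()  # engineered, scripted behaviour takes over
    DONE = auto()


@dataclass
class Odometry:
    """Dead-reckoning position estimate (an assumption, not the paper's method)."""
    x: float = 0.0
    z: float = 0.0
    yaw_deg: float = 0.0
    trail: List[Tuple[float, float]] = field(default_factory=list)

    def update(self, forward: float, turn_deg: float) -> None:
        # Integrate the commanded turn, then the commanded forward motion.
        self.yaw_deg = (self.yaw_deg + turn_deg) % 360.0
        rad = math.radians(self.yaw_deg)
        self.x += forward * math.sin(rad)
        self.z += forward * math.cos(rad)
        self.trail.append((self.x, self.z))


@dataclass
class HybridAgent:
    # Hypothetical stand-ins for the imitation-learned policy and the classifier.
    navigation_policy: Callable[[object], Tuple[float, float]]
    state_classifier: Callable[[object], float]
    scripted_steps: List[str]
    threshold: float = 0.8
    mode: Mode = Mode.NAVIGATE
    odometry: Odometry = field(default_factory=Odometry)
    script_index: int = 0

    def step(self, observation: object) -> str:
        if self.mode is Mode.NAVIGATE:
            # The classifier decides when to hand control to the script.
            if self.state_classifier(observation) >= self.threshold:
                self.mode = Mode.EXECUTE_SUBTASK
            else:
                forward, turn = self.navigation_policy(observation)
                self.odometry.update(forward, turn)
                return f"move(forward={forward}, turn={turn})"
        if self.mode is Mode.EXECUTE_SUBTASK:
            if self.script_index < len(self.scripted_steps):
                action = self.scripted_steps[self.script_index]
                self.script_index += 1
                return action
            self.mode = Mode.DONE
        return "noop"
```

In this sketch the learned components only propose motion and flag interesting states, while the hand-written script encodes the task hierarchy. That separation is one plausible reading of the "hybrid intelligence" framing in the abstract.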

