Text-based games provide an interactive way to study natural language processing. While deep reinforcement learning (DRL) has proven effective for developing game-playing agents, low sample efficiency and large action spaces remain the two major challenges that hinder DRL from being applied in the real world. In this paper, we address these challenges by introducing world-perceiving modules, which automatically decompose tasks and prune actions by answering questions about the environment. We then propose a two-phase training framework that decouples language learning from reinforcement learning, further improving sample efficiency. Experimental results show that the proposed method significantly improves both performance and sample efficiency. Moreover, it demonstrates robustness to compounding errors and limited pre-training data.
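To make the action-pruning idea concrete, the following is a minimal, purely illustrative sketch, not the paper's actual implementation: a "world-perceiving" module is modeled as a yes/no question answerer over the textual observation, and candidate actions whose relevance question is answered "no" are pruned. The function names (`answer`, `prune_actions`) and the string-matching QA stand-in are hypothetical; in the actual method the answerer would be a learned language model.

```python
# Hypothetical sketch of QA-based action pruning (illustrative only).
# The QA module here is a trivial substring check standing in for a
# learned question-answering model over the environment description.

def answer(question: str, observation: str) -> bool:
    """Toy QA module: answers whether the quoted object in the
    question is mentioned in the current observation text."""
    obj = question.split("'")[1]  # extract the quoted object name
    return obj in observation

def prune_actions(actions, observation):
    """Keep only actions the QA module deems applicable, shrinking
    the action space the RL agent must explore."""
    kept = []
    for act in actions:
        obj = act.split()[-1]  # assume the last word names the object
        question = f"Is there a '{obj}' here?"
        if answer(question, observation):
            kept.append(act)
    return kept

obs = "You are in a kitchen. There is a fridge and a table."
acts = ["open fridge", "take apple", "examine table"]
print(prune_actions(acts, obs))  # → ['open fridge', 'examine table']
```

In the two-phase framework described above, such a QA module would be pre-trained on language data first, so the RL phase only has to explore the pruned action set.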