Interactive Information Retrieval (IIR) and Reinforcement Learning (RL) share many commonalities, including an agent that learns while interacting, a long-term and complex goal, and an algorithm that explores and adapts. To successfully apply RL methods to IIR, one challenge is obtaining sufficient relevance labels to train the RL agents, which are notoriously sample-inefficient. However, in a text corpus annotated for a given query, irrelevant documents vastly outnumber relevant ones. This imbalance yields highly skewed training experiences for the agent and prevents it from learning any effective policy. Our paper addresses this issue by using domain randomization to synthesize additional relevant documents for training. Our experimental results on the Text REtrieval Conference (TREC) Dynamic Domain (DD) 2017 Track show that the proposed method boosts an RL agent's learning effectiveness by 22\% in dealing with unseen situations.
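The abstract does not specify the exact randomization scheme, so the following is only a minimal illustrative sketch of how domain randomization could synthesize pseudo-relevant documents from a small pool of labeled ones; the function name `synthesize_relevant_doc` and the perturbation probabilities are hypothetical, not the paper's actual method.

\begin{verbatim}
import random

def synthesize_relevant_doc(doc_tokens, vocab,
                            drop_prob=0.1, swap_prob=0.1, seed=None):
    """Create one synthetic pseudo-relevant document by randomly
    dropping or substituting tokens of a known relevant document."""
    rng = random.Random(seed)
    synthetic = []
    for tok in doc_tokens:
        r = rng.random()
        if r < drop_prob:
            continue                             # randomly drop this token
        if r < drop_prob + swap_prob:
            synthetic.append(rng.choice(vocab))  # randomly substitute a token
        else:
            synthetic.append(tok)                # keep the token unchanged
    return synthetic

# Usage: expand a small pool of relevant documents before RL training.
relevant = ["solar", "panel", "efficiency", "improves", "with", "cooling"]
vocab = ["energy", "thermal", "module", "output", "system"]
augmented = [synthesize_relevant_doc(relevant, vocab, seed=i)
             for i in range(5)]
\end{verbatim}

Each call produces a slightly different variant of the same relevant document, giving the agent a more balanced mix of relevant and irrelevant training experiences.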