部署后从对话中学习:自食其力,查波特! (Learning from Dialogue after Deployment: Feed Yourself, Chatbot!) - 专知论文

会员服务 ·

0

任务对话系统 · Chatbot · 学成 · 样例 · 估计/估计量 ·

2019 年 1 月 16 日

Learning from Dialogue after Deployment: Feed Yourself, Chatbot!

翻译：部署后从对话中学习:自食其力,查波特!

Braden Hancock,Antoine Bordes,Pierre-Emmanuel Mazare,Jason Weston

The majority of conversations a dialogue agent sees over its lifetime occur after it has already been trained and deployed, leaving a vast store of potential training signal untapped. In this work, we propose the self-feeding chatbot, a dialogue agent with the ability to extract new training examples from the conversations it participates in. As our agent engages in conversation, it also estimates user satisfaction in its responses. When the conversation appears to be going well, the user's responses become new training examples to imitate. When the agent believes it has made a mistake, it asks for feedback; learning to predict the feedback that will be given improves the chatbot's dialogue abilities further. On the PersonaChat chit-chat dataset with over 131k training examples, we find that learning from dialogue with a self-feeding chatbot significantly improves performance, regardless of the amount of traditional supervision.

翻译：大部分对话代理器在经过培训和部署后看到其存在寿命的谈话,留下大量潜在的培训信号。在这项工作中,我们推荐了能够从其参与的谈话中提取新的培训实例的对话代理器,即自食其力的聊天机。随着我们的代理器参与对话,它也估计用户对其答复的满意度。当谈话似乎进展顺利时,用户的回复成为模仿的新培训范例。当该代理器认为自己犯了错误时,它要求提供反馈;学习预测将提供的反馈,进一步提高了聊天机的对话能力。在PrimanaChat chit聊天数据集中,有超过131k个培训实例,我们发现从与自食用聊天机的对话中学习会大大改善业绩,而不管传统监督的程度如何。

6

相关内容

任务对话系统

任务对话系统

【Google】监督对比学习，Supervised Contrastive Learning

【Google】监督对比学习，Supervised Contrastive Learning

专知会员服务

75+阅读 · 2020年4月24日

【论文推荐】逆问题，深度学习，对称性破缺，Inverse Problems, Deep Learning, and Symmetry Breaking

【论文推荐】逆问题，深度学习，对称性破缺，Inverse Problems, Deep Learning, and Symmetry Breaking

专知会员服务

26+阅读 · 2020年3月27日

【阿姆斯特丹大学深度学习课程】《UvA Deep Learning Course》，阿姆斯特丹大学助理教授| Efstratios Gavves

【阿姆斯特丹大学深度学习课程】《UvA Deep Learning Course》，阿姆斯特丹大学助理教授| Efstratios Gavves

专知会员服务

20+阅读 · 2020年1月23日

【清华大学-微软研究院】构建智能开放域对话系统的挑战综述论文，31页pdf，Challenges in Building Intelligent Open-domain Dialog Systems

【清华大学-微软研究院】构建智能开放域对话系统的挑战综述论文，31页pdf，Challenges in Building Intelligent Open-domain Dialog Systems

专知会员服务

29+阅读 · 2019年11月2日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

【Pieter Abbeel 报告@CMU】元学习与深度强化学习机器人应用，Deep Learning to Learn，84页ppt

【Pieter Abbeel 报告@CMU】元学习与深度强化学习机器人应用，Deep Learning to Learn，84页ppt

专知会员服务

32+阅读 · 2019年10月12日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【RecSys 2019报告】基于对话的推荐（Context Adaptation with Session‐based Recommenders）

【RecSys 2019报告】基于对话的推荐（Context Adaptation with Session‐based Recommenders）

专知会员服务

33+阅读 · 2019年9月20日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Call for Participation: Shared Tasks in NLPCC 2019

Call for Participation: Shared Tasks in NLPCC 2019

中国计算机学会

5+阅读 · 2019年3月22日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

【论文推荐】最新7篇聊天机器人（Chatbot）相关论文—触动你的心、DeepProbe、饮食推荐、知识学习、交互、挑战、管理

【论文推荐】最新7篇聊天机器人（Chatbot）相关论文—触动你的心、DeepProbe、饮食推荐、知识学习、交互、挑战、管理

专知

12+阅读 · 2018年3月15日

【论文推荐】最新5篇度量学习（Metric Learning）相关论文—人脸验证、BIER、自适应图卷积、注意力机制、单次学习

【论文推荐】最新5篇度量学习（Metric Learning）相关论文—人脸验证、BIER、自适应图卷积、注意力机制、单次学习

专知

17+阅读 · 2018年2月11日

【论文推荐】最新5篇聊天机器人（Chatbot）相关论文—深度强化学习、社交聊天机器人小冰、对话聊天助手、序列-序列、动态词汇

【论文推荐】最新5篇聊天机器人（Chatbot）相关论文—深度强化学习、社交聊天机器人小冰、对话聊天助手、序列-序列、动态词汇

专知

23+阅读 · 2018年1月30日

推荐｜深度强化学习聊天机器人（附论文）！

推荐｜深度强化学习聊天机器人（附论文）！

全球人工智能

4+阅读 · 2018年1月30日

Towards a Human-like Open-Domain Chatbot

Arxiv

14+阅读 · 2020年1月27日

Learning a Matching Model with Co-teaching for Multi-turn Response Selection in Retrieval-based Dialogue Systems

Arxiv

6+阅读 · 2019年6月11日

Risk-Aware Active Inverse Reinforcement Learning

Risk-Aware Active Inverse Reinforcement Learning

Arxiv

8+阅读 · 2019年1月8日

Logically-Constrained Reinforcement Learning

Logically-Constrained Reinforcement Learning

Arxiv

3+阅读 · 2018年12月6日

ANS: Adaptive Network Scaling for Deep Rectifier Reinforcement Learning Models

ANS: Adaptive Network Scaling for Deep Rectifier Reinforcement Learning Models

Arxiv

3+阅读 · 2018年9月6日

Learning Unsupervised Learning Rules

Arxiv

7+阅读 · 2018年5月23日

Hierarchical Reinforcement Learning with Deep Nested Agents

Arxiv

3+阅读 · 2018年5月18日

Dialogue Learning with Human Teaching and Feedback in End-to-End Trainable Task-Oriented Dialogue Systems

Arxiv

6+阅读 · 2018年4月18日

A Deep Reinforcement Learning Chatbot (Short Version)

Arxiv

13+阅读 · 2018年1月20日

Deep Reinforcement Learning for List-wise Recommendations

Arxiv

13+阅读 · 2018年1月5日

VIP会员

文章信息

相关主题

任务对话系统

估计/估计量

相关VIP内容

【Google】监督对比学习，Supervised Contrastive Learning

【Google】监督对比学习，Supervised Contrastive Learning

专知会员服务

75+阅读 · 2020年4月24日

【论文推荐】逆问题，深度学习，对称性破缺，Inverse Problems, Deep Learning, and Symmetry Breaking

【论文推荐】逆问题，深度学习，对称性破缺，Inverse Problems, Deep Learning, and Symmetry Breaking

专知会员服务

26+阅读 · 2020年3月27日

【阿姆斯特丹大学深度学习课程】《UvA Deep Learning Course》，阿姆斯特丹大学助理教授| Efstratios Gavves

【阿姆斯特丹大学深度学习课程】《UvA Deep Learning Course》，阿姆斯特丹大学助理教授| Efstratios Gavves

专知会员服务

20+阅读 · 2020年1月23日

【清华大学-微软研究院】构建智能开放域对话系统的挑战综述论文，31页pdf，Challenges in Building Intelligent Open-domain Dialog Systems

【清华大学-微软研究院】构建智能开放域对话系统的挑战综述论文，31页pdf，Challenges in Building Intelligent Open-domain Dialog Systems

专知会员服务

29+阅读 · 2019年11月2日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

【Pieter Abbeel 报告@CMU】元学习与深度强化学习机器人应用，Deep Learning to Learn，84页ppt

【Pieter Abbeel 报告@CMU】元学习与深度强化学习机器人应用，Deep Learning to Learn，84页ppt

专知会员服务

32+阅读 · 2019年10月12日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【RecSys 2019报告】基于对话的推荐（Context Adaptation with Session‐based Recommenders）

【RecSys 2019报告】基于对话的推荐（Context Adaptation with Session‐based Recommenders）

专知会员服务

33+阅读 · 2019年9月20日

热门VIP内容

开通专知VIP会员享更多权益服务

从社会学实验到行为仿真：理解基于Agent的观点动力学建模思维

中英文版《GPT-5 System Card速览》报告

ACL 2025 | 大模型结构化知识提示的泛化能力研究

【普林斯顿博士论文】大型模型的高效推理

相关资讯

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Call for Participation: Shared Tasks in NLPCC 2019

Call for Participation: Shared Tasks in NLPCC 2019

中国计算机学会

5+阅读 · 2019年3月22日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

【论文推荐】最新7篇聊天机器人（Chatbot）相关论文—触动你的心、DeepProbe、饮食推荐、知识学习、交互、挑战、管理

【论文推荐】最新7篇聊天机器人（Chatbot）相关论文—触动你的心、DeepProbe、饮食推荐、知识学习、交互、挑战、管理

专知

12+阅读 · 2018年3月15日

【论文推荐】最新5篇度量学习（Metric Learning）相关论文—人脸验证、BIER、自适应图卷积、注意力机制、单次学习

【论文推荐】最新5篇度量学习（Metric Learning）相关论文—人脸验证、BIER、自适应图卷积、注意力机制、单次学习

专知

17+阅读 · 2018年2月11日

【论文推荐】最新5篇聊天机器人（Chatbot）相关论文—深度强化学习、社交聊天机器人小冰、对话聊天助手、序列-序列、动态词汇

【论文推荐】最新5篇聊天机器人（Chatbot）相关论文—深度强化学习、社交聊天机器人小冰、对话聊天助手、序列-序列、动态词汇

专知

23+阅读 · 2018年1月30日

推荐｜深度强化学习聊天机器人（附论文）！

推荐｜深度强化学习聊天机器人（附论文）！

全球人工智能

4+阅读 · 2018年1月30日

相关论文

Towards a Human-like Open-Domain Chatbot

Arxiv

14+阅读 · 2020年1月27日

Learning a Matching Model with Co-teaching for Multi-turn Response Selection in Retrieval-based Dialogue Systems

Arxiv

6+阅读 · 2019年6月11日

Risk-Aware Active Inverse Reinforcement Learning

Risk-Aware Active Inverse Reinforcement Learning

Arxiv

8+阅读 · 2019年1月8日

Logically-Constrained Reinforcement Learning

Logically-Constrained Reinforcement Learning

Arxiv

3+阅读 · 2018年12月6日

ANS: Adaptive Network Scaling for Deep Rectifier Reinforcement Learning Models

ANS: Adaptive Network Scaling for Deep Rectifier Reinforcement Learning Models

Arxiv

3+阅读 · 2018年9月6日

Learning Unsupervised Learning Rules

Arxiv

7+阅读 · 2018年5月23日

Hierarchical Reinforcement Learning with Deep Nested Agents

Arxiv

3+阅读 · 2018年5月18日

Dialogue Learning with Human Teaching and Feedback in End-to-End Trainable Task-Oriented Dialogue Systems

Arxiv

6+阅读 · 2018年4月18日

A Deep Reinforcement Learning Chatbot (Short Version)

Arxiv

13+阅读 · 2018年1月20日

Deep Reinforcement Learning for List-wise Recommendations

Arxiv

13+阅读 · 2018年1月5日

微信扫码咨询专知VIP会员