后见:对回收者进行内向性指导培训,以改善无限制的一代人 (Hindsight: Posterior-guided training of retrievers for improved open-ended generation) - 专知论文

会员服务 ·

0

INFORMS · 证据下界 · 输出 · 维基百科 · Better ·

2021 年 10 月 14 日

Hindsight: Posterior-guided training of retrievers for improved open-ended generation

翻译：后见:对回收者进行内向性指导培训,以改善无限制的一代人

Ashwin Paranjape,Omar Khattab,Christopher Potts,Matei Zaharia,Christopher D. Manning

Many text generation systems benefit from using a retriever to retrieve passages from a textual knowledge corpus (e.g., Wikipedia) which are then provided as additional context to the generator. For open-ended generation tasks (like generating informative utterances in conversations) many varied passages may be equally relevant and we find that existing methods that jointly train the retriever and generator underperform: the retriever may not find relevant passages even amongst the top-10 and hence the generator may not learn a preference to ground its generated output in them. We propose using an additional guide retriever that is allowed to use the target output and "in hindsight" retrieve relevant passages during training. We model the guide retriever after the posterior distribution Q of passages given the input and the target output and train it jointly with the standard retriever and the generator by maximizing the evidence lower bound (ELBo) in expectation over Q. For informative conversations from the Wizard of Wikipedia dataset, with posterior-guided training, the retriever finds passages with higher relevance in the top-10 (23% relative improvement), the generator's responses are more grounded in the retrieved passage (19% relative improvement) and the end-to-end system produces better overall output (6.4% relative improvement).

翻译：许多文本生成系统都受益于使用检索器从文本知识堆(例如维基百科)中检索段落,然后作为附加的上下文提供给生成器。对于开放式生成任务(比如在谈话中产生信息化的话语),许多不同段落可能具有同等的相关性,我们发现,联合培训检索器和生成器不完善的现有方法可能具有同等的相关性:检索器可能甚至在前10层中找不到相关通道,因此,生成器可能不会从中学习偏好于将生成的输出置于其中。我们提议使用额外的指南检索器,允许在培训期间使用目标输出和“在后视”检索相关段落。在输入和目标输出的后端分配 Q 之后,我们将指南检索器建模,与标准检索器和生成器联合培训:检索器可能甚至在前10层中找不到相关通道,因此,生成器可能无法从维基百科数据集向导师那里获取信息性谈话,并经过后导培训后,检索器会发现在前10层(23 % 相对改进) 上找到更高相关性的通道。发电机的反应(19 % 相对改进后路段) 将更接近全面改进。

0

相关内容

INFORMS

《计算机信息》杂志发表高质量的论文，扩大了运筹学和计算的范围，寻求有关理论、方法、实验、系统和应用方面的原创研究论文、新颖的调查和教程论文，以及描述新的和有用的软件工具的论文。官网链接：https://pubsonline.informs.org/journal/ijoc

【EMNLP2020】自然语言生成，Neural Language Generation

【EMNLP2020】自然语言生成，Neural Language Generation

专知会员服务

39+阅读 · 2020年11月20日

【文本生成现代方法】Modern Methods for Text Generation

【文本生成现代方法】Modern Methods for Text Generation

专知会员服务

44+阅读 · 2020年9月11日

【大规模机器学习】综述论文，20页pdf，A Survey on Large-scale Machine

【大规模机器学习】综述论文，20页pdf，A Survey on Large-scale Machine

专知会员服务

66+阅读 · 2020年8月13日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

【斯坦福】机器学习优化简明导论， Introduction to Optimization for Machine Learning

【斯坦福】机器学习优化简明导论， Introduction to Optimization for Machine Learning

专知会员服务

93+阅读 · 2020年5月6日

【Facebook AI】低资源机器翻译，74页ppt

【Facebook AI】低资源机器翻译，74页ppt

专知会员服务

30+阅读 · 2020年4月8日

【清华大学】知识增强的常识性故事生成预训练模型，A Knowledge-Enhanced Pretraining Model for Commonsense Story Generation

【清华大学】知识增强的常识性故事生成预训练模型，A Knowledge-Enhanced Pretraining Model for Commonsense Story Generation

专知会员服务

52+阅读 · 2020年1月20日

【Open AI】利用过程生成对强化学习进行基准测试（Leveraging Procedural Generation to Benchmark Reinforcement Learning）

【Open AI】利用过程生成对强化学习进行基准测试（Leveraging Procedural Generation to Benchmark Reinforcement Learning）

专知会员服务

10+阅读 · 2019年12月3日

微软发布DialoGPT预训练语言模型，论文与代码 Large-Scale Generative Pre-training for Conversational Response Generation

微软发布DialoGPT预训练语言模型，论文与代码 Large-Scale Generative Pre-training for Conversational Response Generation

专知会员服务

28+阅读 · 2019年11月8日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

已删除

将门创投

6+阅读 · 2019年9月3日

意识是一种数学模式

意识是一种数学模式

CreateAMind

3+阅读 · 2019年6月24日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【SIGIR2018】五篇对抗训练文章

【SIGIR2018】五篇对抗训练文章

专知

12+阅读 · 2018年7月9日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

计算机视觉近一年进展综述

计算机视觉近一年进展综述

机器学习研究会

9+阅读 · 2017年11月25日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

MoCoGAN 分解运动和内容的视频生成

MoCoGAN 分解运动和内容的视频生成

CreateAMind

18+阅读 · 2017年10月21日

Search and Learn: Improving Semantic Coverage for Data-to-Text Generation

Arxiv

0+阅读 · 2021年12月6日

End-to-End Training of Multi-Document Reader and Retriever for Open-Domain Question Answering

Arxiv

1+阅读 · 2021年12月4日

Modeling question asking using neural program generation

Arxiv

4+阅读 · 2019年9月26日

Question Generation by Transformers

Question Generation by Transformers

Arxiv

5+阅读 · 2019年9月14日

Jointly Optimizing Diversity and Relevance in Neural Response Generation

Arxiv

4+阅读 · 2019年2月28日

Improving Neural Question Generation using Answer Separation

Improving Neural Question Generation using Answer Separation

Arxiv

3+阅读 · 2018年9月7日

Joint Image Captioning and Question Answering

Arxiv

6+阅读 · 2018年5月22日

Two can play this Game: Visual Dialog with Discriminative Question Generation and Answering

Arxiv

8+阅读 · 2018年3月29日

Discriminability objective for training descriptive captions

Arxiv

9+阅读 · 2018年3月12日

Scale Up Event Extraction Learning via Automatic Training Data Generation

Arxiv

7+阅读 · 2017年12月11日

VIP会员

文章信息

相关主题

相关VIP内容

【EMNLP2020】自然语言生成，Neural Language Generation

【EMNLP2020】自然语言生成，Neural Language Generation

专知会员服务

39+阅读 · 2020年11月20日

【文本生成现代方法】Modern Methods for Text Generation

【文本生成现代方法】Modern Methods for Text Generation

专知会员服务

44+阅读 · 2020年9月11日

【大规模机器学习】综述论文，20页pdf，A Survey on Large-scale Machine

【大规模机器学习】综述论文，20页pdf，A Survey on Large-scale Machine

专知会员服务

66+阅读 · 2020年8月13日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

【斯坦福】机器学习优化简明导论， Introduction to Optimization for Machine Learning

【斯坦福】机器学习优化简明导论， Introduction to Optimization for Machine Learning

专知会员服务

93+阅读 · 2020年5月6日

【Facebook AI】低资源机器翻译，74页ppt

【Facebook AI】低资源机器翻译，74页ppt

专知会员服务

30+阅读 · 2020年4月8日

【清华大学】知识增强的常识性故事生成预训练模型，A Knowledge-Enhanced Pretraining Model for Commonsense Story Generation

【清华大学】知识增强的常识性故事生成预训练模型，A Knowledge-Enhanced Pretraining Model for Commonsense Story Generation

专知会员服务

52+阅读 · 2020年1月20日

【Open AI】利用过程生成对强化学习进行基准测试（Leveraging Procedural Generation to Benchmark Reinforcement Learning）

【Open AI】利用过程生成对强化学习进行基准测试（Leveraging Procedural Generation to Benchmark Reinforcement Learning）

专知会员服务

10+阅读 · 2019年12月3日

微软发布DialoGPT预训练语言模型，论文与代码 Large-Scale Generative Pre-training for Conversational Response Generation

微软发布DialoGPT预训练语言模型，论文与代码 Large-Scale Generative Pre-training for Conversational Response Generation

专知会员服务

28+阅读 · 2019年11月8日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

热门VIP内容

开通专知VIP会员享更多权益服务

【CMU博士论文】数据驱动决策中的激励、信息与不确定性

DGP双粒度提示框架：图增强大模型助力欺诈检测

【ICCV2025】ESSENTIAL：用于视频类增量学习的情景记忆与语义记忆整合

唯快不破：大型语言模型高效架构综述

相关资讯

已删除

将门创投

6+阅读 · 2019年9月3日

意识是一种数学模式

意识是一种数学模式

CreateAMind

3+阅读 · 2019年6月24日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【SIGIR2018】五篇对抗训练文章

【SIGIR2018】五篇对抗训练文章

专知

12+阅读 · 2018年7月9日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

计算机视觉近一年进展综述

计算机视觉近一年进展综述

机器学习研究会

9+阅读 · 2017年11月25日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

MoCoGAN 分解运动和内容的视频生成

MoCoGAN 分解运动和内容的视频生成

CreateAMind

18+阅读 · 2017年10月21日

相关论文

Search and Learn: Improving Semantic Coverage for Data-to-Text Generation

Arxiv

0+阅读 · 2021年12月6日

End-to-End Training of Multi-Document Reader and Retriever for Open-Domain Question Answering

Arxiv

1+阅读 · 2021年12月4日

Modeling question asking using neural program generation

Arxiv

4+阅读 · 2019年9月26日

Question Generation by Transformers

Question Generation by Transformers

Arxiv

5+阅读 · 2019年9月14日

Jointly Optimizing Diversity and Relevance in Neural Response Generation

Arxiv

4+阅读 · 2019年2月28日

Improving Neural Question Generation using Answer Separation

Improving Neural Question Generation using Answer Separation

Arxiv

3+阅读 · 2018年9月7日

Joint Image Captioning and Question Answering

Arxiv

6+阅读 · 2018年5月22日

Two can play this Game: Visual Dialog with Discriminative Question Generation and Answering

Arxiv

8+阅读 · 2018年3月29日

Discriminability objective for training descriptive captions

Arxiv

9+阅读 · 2018年3月12日

Scale Up Event Extraction Learning via Automatic Training Data Generation

Arxiv

7+阅读 · 2017年12月11日

微信扫码咨询专知VIP会员