Exposure bias has been regarded as a central problem for auto-regressive language models (LMs). The claim is that teacher forcing causes test-time generation to become incrementally distorted due to the discrepancy between training and generation. Although many algorithms have been proposed to avoid teacher forcing and thereby alleviate exposure bias, little work has examined how serious the exposure bias problem actually is. In this work, we focus on the task of open-ended language generation and propose metrics to quantify the impact of exposure bias on quality, diversity, and consistency. Our key intuition is that if we feed ground-truth data prefixes (instead of prefixes generated by the model itself) into the model and ask it to continue the generation, performance should become much better, because the training-generation discrepancy in the prefix is removed. We conduct both automatic and human evaluations in our experiments. Contrary to the popular belief about exposure bias, we find that the distortion induced by the prefix discrepancy is limited and does not appear to accumulate during generation. Moreover, our analysis reveals an interesting self-recovery ability of the LM, which we hypothesize counters the harmful effects of exposure bias.
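To make the prefix-swapping protocol concrete, the following is a minimal sketch of the comparison described above, assuming Hugging Face's GPT-2 as a stand-in LM and a distinct-2 statistic as a crude diversity proxy; the model name, prefix text, sampling settings, and metric are illustrative choices, not the paper's exact setup.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def continue_from(prefix_ids: torch.Tensor, max_new_tokens: int = 50) -> torch.Tensor:
    """Sample a continuation of `prefix_ids` from the model."""
    with torch.no_grad():
        out = model.generate(
            prefix_ids,
            do_sample=True,
            top_k=40,
            max_new_tokens=max_new_tokens,
            pad_token_id=tokenizer.eos_token_id,
        )
    return out[:, prefix_ids.shape[1]:]  # keep only the newly generated tokens

def distinct_2(ids: torch.Tensor) -> float:
    """Crude diversity proxy: fraction of unique bigrams in a continuation."""
    toks = ids[0].tolist()
    bigrams = list(zip(toks, toks[1:]))
    return len(set(bigrams)) / max(len(bigrams), 1)

# Condition on a ground-truth prefix (a hand-written stand-in for corpus data).
gt_prefix = tokenizer("The committee released its annual report on",
                      return_tensors="pt").input_ids
cont_gt = continue_from(gt_prefix)

# Condition on a model-generated prefix of comparable length, sampled from scratch.
bos = torch.tensor([[tokenizer.bos_token_id]])
model_prefix = model.generate(bos, do_sample=True, top_k=40,
                              max_new_tokens=gt_prefix.shape[1],
                              pad_token_id=tokenizer.eos_token_id)
cont_model = continue_from(model_prefix)

# If exposure bias were severe, continuations from model-generated prefixes
# should score notably worse than those from ground-truth prefixes.
print("distinct-2 (ground-truth prefix):", distinct_2(cont_gt))
print("distinct-2 (model prefix):      ", distinct_2(cont_model))
```

In this sketch, a large and consistent gap between the two scores (repeated over many prefixes and over metrics for quality and consistency as well) would indicate that the prefix discrepancy distorts generation; the paper's finding is that this gap is limited.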