Despite achieving incredibly low perplexities on myriad natural language corpora, today's language models still often underperform when used to generate text. This dichotomy has puzzled the language generation community for the last few years. In this work, we posit that the abstraction of natural language as a communication channel (à la Shannon, 1948) can provide new insights into the behaviors of probabilistic language generators, e.g., why high-probability texts can be dull or repetitive. Humans use language as a means of communicating information, and do so in an efficient yet error-minimizing manner, choosing each word in a string with this (perhaps subconscious) goal in mind. We propose that generation from probabilistic models should mimic this behavior. Rather than always choosing words from the high-probability region of the distribution--which have a low Shannon information content--we sample from the set of words with an information content close to its expected value, i.e., close to the conditional entropy of our model. This decision criterion can be realized through a simple and efficient implementation, which we call typical sampling. Automatic and human evaluations show that, in comparison to nucleus and top-k sampling, typical sampling offers competitive performance in terms of quality while consistently reducing the number of degenerate repetitions.
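The decision criterion above can be sketched in a few lines. The following is a minimal, hypothetical NumPy illustration rather than the paper's implementation; the function name typical_sampling_step and the mass parameter tau are assumptions made for the example, with tau playing a role analogous to p in nucleus sampling.

```python
import numpy as np

def typical_sampling_step(probs, tau=0.95, rng=None):
    """Sample one token whose information content is close to the entropy.

    probs: 1-D array of next-token probabilities from the model.
    tau:   probability mass to retain (hypothetical parameter name).
    """
    if rng is None:
        rng = np.random.default_rng()
    # Shannon information content, -log p(x), of each candidate token.
    info = -np.log(probs + 1e-12)
    # Conditional entropy of the model's next-token distribution.
    entropy = np.sum(probs * info)
    # Rank tokens by how far their information content is from the entropy.
    order = np.argsort(np.abs(info - entropy))
    # Keep the smallest such set whose total probability mass reaches tau.
    cutoff = np.searchsorted(np.cumsum(probs[order]), tau) + 1
    kept = order[:cutoff]
    # Renormalize over the retained tokens and sample.
    return rng.choice(kept, p=probs[kept] / probs[kept].sum())
```

Tokens are thus filtered by their distance from the conditional entropy rather than by raw probability, which is what distinguishes this criterion from nucleus or top-k truncation.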