Tractable Control for Autoregressive Language Generation - 专知 (Zhuanzhi) Papers

Constraints · Text generation · Language generation · Probabilistic models · Complex constraints

April 18, 2023

Tractable Control for Autoregressive Language Generation

Honghua Zhang, Meihua Dang, Nanyun Peng, Guy Van den Broeck

From arXiv; fixed typo in Table 1

Despite the success of autoregressive large language models in text generation, it remains a major challenge to generate text that satisfies complex constraints: sampling from the conditional distribution $\Pr(\text{text} | \alpha)$ is intractable for even the simplest lexical constraints $\alpha$. To overcome this challenge, we propose to use tractable probabilistic models to impose lexical constraints in autoregressive text generation, which we refer to as GeLaTo. To demonstrate the effectiveness of this framework, we use distilled hidden Markov models to control autoregressive generation from GPT2. GeLaTo achieves state-of-the-art performance on CommonGen, a challenging benchmark for constrained text generation, beating a wide range of strong baselines by a large margin. Our work not only opens up new avenues for controlling large language models but also motivates the development of more expressive tractable probabilistic models.
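
To make the idea concrete, here is a minimal sketch of how a tractable model can steer an autoregressive one. It is not the authors' implementation. Everything in it is an assumption made for illustration: a hand-written two-state HMM over a three-token vocabulary (the numbers in `PI`, `A`, `B` are made up), a single-keyword constraint "token `keyword` appears somewhere in a sequence of length `n`" (with `n >= 1`), and a reweighting rule that multiplies the language model's next-token probabilities by the HMM's probability that the constraint can still be satisfied. The function names (`constraint_prob`, `guided_next`, `brute_force`) are hypothetical. The point is that the HMM answers $\Pr(\alpha \mid x_{1:t})$ with a small dynamic program, while the brute-force version must enumerate every completion.

```python
from itertools import product

# Toy HMM; all numbers are invented for illustration.
PI = [0.6, 0.4]                          # initial hidden-state distribution
A = [[0.7, 0.3], [0.2, 0.8]]             # A[z][y] = Pr(next state y | state z)
B = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]   # B[z][v] = Pr(token v | state z)
H, V = 2, 3                              # number of hidden states, vocabulary size


def forward(prefix):
    """Return f[z] = Pr(x_{1:t} = prefix, z_t = z); None for an empty prefix."""
    f = None
    for tok in prefix:
        if f is None:
            f = [PI[z] * B[z][tok] for z in range(H)]
        else:
            f = [sum(f[y] * A[y][z] for y in range(H)) * B[z][tok]
                 for z in range(H)]
    return f


def avoid_table(keyword, n):
    """avoid[r][z] = Pr(the next r tokens all differ from keyword | state z)."""
    avoid = [[1.0] * H]
    for _ in range(n):
        prev = avoid[-1]
        avoid.append([sum(A[z][y] * (1 - B[y][keyword]) * prev[y]
                          for y in range(H)) for z in range(H)])
    return avoid


def constraint_prob(prefix, keyword, n):
    """Pr_HMM(keyword occurs somewhere in x_{1:n} | x_{1:t} = prefix), n >= 1."""
    if keyword in prefix:
        return 1.0
    t = len(prefix)
    avoid = avoid_table(keyword, n)
    if t == 0:
        miss = sum(PI[z] * (1 - B[z][keyword]) * avoid[n - 1][z]
                   for z in range(H))
        return 1.0 - miss
    f = forward(prefix)
    total = sum(f)
    # Posterior over the current hidden state, then chance of never emitting keyword.
    miss = sum(f[z] / total * avoid[n - t][z] for z in range(H))
    return 1.0 - miss


def guided_next(lm_probs, prefix, keyword, n):
    """Reweight LM next-token probabilities by Pr_HMM(constraint | prefix + token)."""
    weights = [lm_probs[v] * constraint_prob(list(prefix) + [v], keyword, n)
               for v in range(V)]
    norm = sum(weights)
    return [w / norm for w in weights]


def brute_force(prefix, keyword, n):
    """Same quantity as constraint_prob, by enumerating all V**(n - t) completions."""
    num = den = 0.0
    for rest in product(range(V), repeat=n - len(prefix)):
        seq = list(prefix) + list(rest)
        p = sum(forward(seq))
        den += p
        if keyword in seq:
            num += p
    return num / den


if __name__ == "__main__":
    lm_probs = [0.7, 0.2, 0.1]  # stand-in for a GPT2 next-token distribution
    print(constraint_prob([0], 2, 4), brute_force([0], 2, 4))
    print(guided_next(lm_probs, [0, 1, 0], 2, 4))
```

In the last call only one position is left and the keyword has not appeared yet, so the reweighted distribution puts essentially all of its mass on the keyword even though the stand-in language model prefers token 0. The dynamic program costs time linear in `n`, whereas `brute_force` grows exponentially, which is the gap that motivates using a tractable model for the constraint term.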
