Lingxi: A Diversity-aware Chinese Modern Poetry Generation System - Zhuanzhi Papers


August 27, 2021


Xinran Zhang, Maosong Sun, Jiafeng Liu, Xiaobing Li

Poetry generation has long been a difficult task in natural language processing. Unlike plain neural text generation, poetry places a high demand on novelty: an easily understood sentence with too many high-frequency words may not be considered poetic, while a suitably ambiguous sentence with low-frequency words can be novel and creative. Inspired by this, we present Lingxi, a diversity-aware Chinese modern poetry generation system. We propose the nucleus sampling with randomized head (NS-RH) algorithm, which randomizes the high-frequency part (the "head") of the predicted distribution in order to emphasize "comparatively low-frequency" words. The proposed algorithm significantly increases the novelty of generated poetry compared with traditional sampling methods. The permutation of the distribution is controllable by tuning the filtering parameter that determines which "head" to permute, achieving diversity-aware sampling. We find that even when a large portion of the filtered vocabulary is randomized, the system can still generate fluent poetry, but with notably higher novelty. We also propose a semantic-similarity-based rejection sampling algorithm, which creates a longer and more informative context from the short input poetry title while maintaining high semantic similarity to the title, alleviating the off-topic issue.
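The two sampling ideas in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact definitions, so everything below is an assumption. In particular, the function names, the parameters `p` (nucleus mass), `q` (head mass) and `threshold`, the choice to define the head as the smallest top-ranked prefix reaching mass `q`, the choice to shuffle probabilities uniformly among head tokens, and the use of cosine similarity with a fixed acceptance threshold are all hypothetical. The `embed` callable stands in for whatever sentence encoder a real system would use.

```python
import numpy as np


def ns_rh_distribution(probs, p=0.9, q=0.5, rng=None):
    """Return a next-token distribution after nucleus filtering and head shuffling.

    probs: 1-D array of next-token probabilities.
    p:     nucleus mass; tokens outside the smallest top-p prefix get probability 0.
    q:     head mass (assumed q <= p); probabilities of tokens in the smallest
           top-q prefix are randomly permuted among those tokens.
    """
    rng = np.random.default_rng() if rng is None else rng
    probs = np.asarray(probs, dtype=float)
    order = np.argsort(-probs)                # token ids, most probable first
    cum = np.cumsum(probs[order])

    # Smallest prefix lengths whose cumulative mass reaches p and q.
    n_nucleus = min(int(np.searchsorted(cum, p)) + 1, probs.size)
    n_head = min(int(np.searchsorted(cum, q)) + 1, n_nucleus)

    kept = probs[order[:n_nucleus]]           # fancy indexing returns a copy
    kept[:n_head] = rng.permutation(kept[:n_head])  # randomize the head

    out = np.zeros_like(probs)
    out[order[:n_nucleus]] = kept / kept.sum()      # renormalize over the nucleus
    return out


def ns_rh_sample(probs, p=0.9, q=0.5, rng=None):
    """Draw one token id from the randomized-head nucleus distribution."""
    rng = np.random.default_rng() if rng is None else rng
    dist = ns_rh_distribution(probs, p, q, rng)
    return int(rng.choice(dist.size, p=dist))


def reject_by_similarity(title_vec, candidates, embed, threshold=0.7):
    """Return the first candidate text whose embedding has cosine similarity
    >= threshold with the title embedding, or None if all are rejected."""
    title_vec = np.asarray(title_vec, dtype=float)
    for text in candidates:
        vec = np.asarray(embed(text), dtype=float)
        sim = float(title_vec @ vec / (np.linalg.norm(title_vec) * np.linalg.norm(vec)))
        if sim >= threshold:
            return text
    return None
```

Under these assumptions, raising `q` toward `p` shuffles more of the nucleus and so gives low-ranked tokens a better chance of inheriting a large probability, while a very small `q` leaves a head of one token and reduces to ordinary nucleus sampling.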

