State-of-the-art poetry generation systems are often complex. They either consist of task-specific model pipelines, incorporate prior knowledge in the form of manually created constraints, or both. In contrast, end-to-end models would not suffer from the overhead of having to model prior knowledge and could learn the nuances of poetry from data alone, reducing the degree of human supervision required. In this work, we investigate end-to-end poetry generation conditioned on styles such as rhyme, meter, and alliteration. We identify and address a lack of training data and mismatching tokenization algorithms as possible limitations of past attempts. In particular, we successfully pre-train and release ByGPT5, a new token-free decoder-only language model, and fine-tune it on a large custom corpus of English and German quatrains annotated with our styles. We show that ByGPT5 outperforms other models such as mT5, ByT5, GPT-2, and ChatGPT, while also being more parameter-efficient and performing favorably compared to humans. In addition, we analyze its runtime performance and introspect the model's understanding of style conditions. We make our code, models, and datasets publicly available.