Modern NLP defines the task of style transfer as modifying the style of a given sentence without appreciably changing its semantics, which implies that the outputs of style transfer systems should be paraphrases of their inputs. However, many existing systems purportedly designed for style transfer inherently warp the input's meaning through attribute transfer, which changes semantic properties such as sentiment. In this paper, we reformulate unsupervised style transfer as a paraphrase generation problem, and present a simple methodology based on fine-tuning pretrained language models on automatically generated paraphrase data. Despite its simplicity, our method significantly outperforms state-of-the-art style transfer systems on both human and automatic evaluations. We also survey 23 style transfer papers, discover that existing automatic metrics can be easily gamed, and propose fixed variants. Finally, we pivot to a more real-world style transfer setting by collecting a large dataset of 15M sentences in 11 diverse styles, which we use for an in-depth analysis of our system.