Detoxification is the task of rewriting toxic text in a polite style while preserving the meaning and fluency of the original. Existing detoxification methods are designed to work in a single, specific language. This work investigates multilingual and cross-lingual detoxification and the behavior of large multilingual models in this setting. Unlike previous works, we aim to enable large language models to perform detoxification in a given language without direct fine-tuning on that language. Experiments show that multilingual models are capable of performing multilingual style transfer. However, the models are not able to perform cross-lingual detoxification, and direct fine-tuning on the target language remains necessary.