Large pre-trained neural language models have underpinned the effectiveness of many NLP tasks, yet they remain prone to generating toxic language, which hinders the safety of their use. Using empathetic data, we improve over recent work on controllable text generation that aims to reduce the toxicity of generated text. We find that we can dramatically shrink the fine-tuning data to 7.5k-30k samples while still improving over state-of-the-art toxicity mitigation, achieving up to a 3.4% absolute reduction (26% relative) over the original work trained on 2.3M samples, by strategically sampling data based on empathy scores. We observe that the degree of improvement depends on the specific communication components of empathy. In particular, the cognitive components of empathy significantly outperform the original dataset in almost all experiments, while emotional empathy yields smaller gains and even underperforms random samples of the original data. This insight carries particular weight for NLP work on empathy, since until recently the research and resources built for it have treated empathy exclusively as an emotional concept.
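To make the sampling step concrete, the following minimal sketch shows one way empathy-based data selection could be implemented; it is an illustration under stated assumptions, not the authors' released method. The scorer `score_cognitive_empathy` is a hypothetical placeholder for any model that rates a text's cognitive empathy (e.g., a regressor fine-tuned on empathy-annotated dialogues).

```python
import heapq
from typing import Iterable, List


def score_cognitive_empathy(text: str) -> float:
    """Hypothetical placeholder: return a cognitive-empathy score for `text`.

    In practice this would be a trained empathy-scoring model; it is not
    part of the paper's described artifacts.
    """
    raise NotImplementedError("plug in an empathy scoring model here")


def select_fine_tuning_samples(corpus: Iterable[str], k: int = 30_000) -> List[str]:
    """Keep the k highest-scoring samples (e.g., 7.5k-30k drawn from ~2.3M)
    as the reduced fine-tuning set."""
    return heapq.nlargest(k, corpus, key=score_cognitive_empathy)
```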