Style plays a significant role in how humans express themselves and communicate with others. Large pre-trained language models produce impressive results on various style classification tasks. However, they often learn spurious domain-specific words to make predictions. The incorrect word importance learned by the model often leads to ambiguous token-level explanations that do not align with human perception of linguistic styles. To tackle this challenge, we introduce StyLEx, a model that learns annotated human perceptions of stylistic lexica and uses these stylistic words as additional information for predicting the style of a sentence. Our experiments show that StyLEx can provide human-like stylistic lexical explanations without sacrificing the performance of sentence-level style prediction on both original and out-of-domain datasets. Explanations from StyLEx show higher sufficiency and plausibility when compared to human annotations, and human judges find them more understandable than those of the existing, widely used saliency baseline.