Recent studies have used advanced generative language models to produce Natural Language Explanations (NLEs) of why a given text could be hateful. We propose the Chain of Explanation (CoE) prompting method, which uses the target group and retrieved social norms to generate high-quality NLEs for implicit hate speech. By providing accurate target information and high-quality related social norms, we improve the BLEU score for NLE generation from 44.0 to 62.3. We then evaluate the quality of the generated NLEs using various automatic metrics and human annotations of informativeness and clarity. A correlation analysis between the automatic metrics and human perceptions offers insights into how to select suitable automatic metrics for Natural Language Generation tasks. To showcase a potential application of the proposed CoE method, we demonstrate an F1 score improvement from 0.635 to 0.655 on the implicit hate speech classification task.