Recent studies have used advanced generative language models to generate Natural Language Explanations (NLE) of why a given text could be hateful. We propose Chain of Explanation (CoE) prompting, which uses heuristic words and the target group to generate high-quality NLE for implicit hate speech. Providing accurate target information improves the BLEU score for NLE generation from 44.0 to 62.3. We then evaluate the quality of the generated NLE using various automatic metrics and human annotations of informativeness and clarity.
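As a rough illustration of how a prompt might combine heuristic words and target-group information, consider the minimal sketch below. The function name `build_coe_prompt`, the template wording, and the placeholder inputs are all assumptions made for illustration; the abstract does not specify the exact prompt format.

```python
def build_coe_prompt(text, target_group=None,
                     heuristic="Is the text hateful? Yes."):
    """Assemble a chain-style prompt for explanation generation.

    Hypothetical template: the input text, then the heuristic words,
    then (optionally) the target group, then a cue that a generative
    language model would be asked to continue with an explanation.
    """
    parts = [f"Given text: {text}", heuristic]
    if target_group:
        # Target information is optional so that prompts with and
        # without it can be compared.
        parts.append(f"The target group is: {target_group}.")
    parts.append("It is hateful because:")
    return "\n".join(parts)


# Placeholder inputs; a real setup would pass a dataset example.
prompt = build_coe_prompt("<post text>", target_group="<target group>")
print(prompt)
```

Calling the function with `target_group=None` yields the same prompt without the target line, which is one way the effect of target information on generation quality could be isolated.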