Saliency maps can explain a neural model's prediction by identifying important input features. While they excel at being faithful to the explained model, saliency maps in their entirety are difficult for humans to interpret, especially for instances with many input features. In contrast, natural language explanations (NLEs) are flexible and can be tuned to a recipient's expectations, but are costly to generate: rationalization models are usually trained on specific tasks and require high-quality, diverse datasets of human annotations. We combine the advantages of both explainability methods by verbalizing saliency maps. We formalize this underexplored task and propose a novel methodology that addresses its two key challenges -- what and how to verbalize. Our approach uses efficient search methods that are task- and model-agnostic and do not require another black-box model, together with hand-crafted templates that preserve faithfulness. We conduct a human evaluation of explanation representations across two natural language processing (NLP) tasks: news topic classification and sentiment analysis. Our results suggest that saliency map verbalization makes explanations more understandable and less cognitively challenging to humans than conventional heatmap visualization.