Recently, Logic Explained Networks (LENs) have been proposed as explainable-by-design neural models providing logic explanations for their predictions. However, these models have only been applied to vision and tabular data, and they mostly favour the generation of global explanations, while local ones tend to be noisy and verbose. For these reasons, we propose LENp, which improves local explanations by perturbing input words, and we test it on text classification. Our results show that (i) LENp provides better local explanations than LIME in terms of sensitivity and faithfulness, and (ii) logic explanations are more useful and user-friendly than the feature scoring provided by LIME, as attested by a human survey.