The evaluation of explanation methods is a research topic that has not yet been explored deeply. However, since explainability is supposed to strengthen trust in artificial intelligence, it is necessary to systematically review and compare explanation methods in order to confirm their correctness. Until now, no tool focused on XAI evaluation exists that allows researchers to evaluate the performance of explanations of neural network predictions both exhaustively and quickly. To increase transparency and reproducibility in the field, we therefore built Quantus -- a comprehensive evaluation toolkit in Python that includes a growing, well-organised collection of evaluation metrics and tutorials for evaluating explanation methods. The toolkit has been thoroughly tested and is available under an open-source license on PyPI and on GitHub (https://github.com/understandable-machine-intelligence-lab/Quantus/).
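As a purely illustrative sketch of how such evaluation metrics can be applied (the toy model, random data, and `saliency` function below are hypothetical stand-ins, and exact parameter names may differ between Quantus versions), a robustness metric such as `quantus.MaxSensitivity` can be run on a batch of explanations roughly as follows:

```python
import numpy as np
import torch
import torch.nn as nn
import quantus

# Toy classifier and a small batch of random inputs (stand-ins for a
# real model and dataset).
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
x_batch = np.random.rand(8, 1, 28, 28).astype(np.float32)
y_batch = np.random.randint(0, 10, size=8)

def saliency(model, inputs, targets, **kwargs):
    # Plain gradient attributions, used here as the explanation under test.
    x = torch.as_tensor(inputs).clone().requires_grad_(True)
    t = torch.as_tensor(targets, dtype=torch.long)
    logits = model(x)
    logits[torch.arange(len(t)), t].sum().backward()
    return x.grad.abs().numpy()

a_batch = saliency(model, x_batch, y_batch)

# Max-Sensitivity measures how strongly the explanation changes under
# small input perturbations; lower scores indicate more robust explanations.
metric = quantus.MaxSensitivity(nr_samples=10)
scores = metric(
    model=model,
    x_batch=x_batch,
    y_batch=y_batch,
    a_batch=a_batch,
    explain_func=saliency,
    device="cpu",
)
print(scores)  # one robustness score per sample in the batch
```

The metric-as-callable pattern shown here, where a metric object is instantiated once and then applied to a model, inputs, targets, and precomputed attributions, is the interface documented for Quantus metrics; swapping in a different metric class evaluates the same explanations along a different quality dimension.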