ferret: a Framework for Benchmarking Explainers on Transformers

As Transformers are increasingly relied upon to solve complex NLP problems, there is a growing need for their decisions to be interpretable by humans. While several explainable AI (XAI) techniques for interpreting the outputs of transformer-based models have been proposed, easy ways to use and compare them are still lacking. We introduce ferret, a Python library that simplifies the use and comparison of XAI methods on transformer-based classifiers. With ferret, users can visualize and compare explanations of transformer-based model outputs, produced by state-of-the-art XAI methods, on any free text or on existing XAI corpora. Users can also compute ad-hoc XAI metrics to select the most faithful and plausible explanations. To align with the recently consolidated practice of sharing and using transformer-based models through Hugging Face, ferret interfaces directly with its Python library. In this paper, we use ferret to benchmark XAI methods applied to transformers for sentiment analysis and hate speech detection. We show that specific methods provide consistently better explanations and are preferable in the context of transformer models.
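To make the idea of selecting the "most faithful" explanation concrete, the following minimal sketch scores two token-attribution explanations with comprehensiveness, a faithfulness measure from the XAI literature: the drop in predicted probability after the highest-attributed tokens are removed. Everything here is an assumption for illustration. The toy bag-of-words classifier, its weights, and the function names are hypothetical, and the code does not use or reproduce ferret's API.

```python
import math

# Hypothetical toy sentiment model: a fixed bag-of-words logistic scorer.
# It stands in for a transformer classifier purely for illustration.
WEIGHTS = {"great": 2.0, "love": 1.5, "boring": -2.0, "awful": -2.5}


def predict_positive(tokens):
    """Return P(positive) for a list of tokens under the toy model."""
    score = sum(WEIGHTS.get(t, 0.0) for t in tokens)
    return 1.0 / (1.0 + math.exp(-score))


def comprehensiveness(tokens, attributions, k=1):
    """Drop in P(positive) after removing the k highest-attributed tokens.

    A larger drop suggests the explanation points at tokens the model
    actually relies on, i.e. it is more faithful to the model.
    """
    ranked = sorted(range(len(tokens)), key=lambda i: attributions[i], reverse=True)
    top = set(ranked[:k])
    reduced = [t for i, t in enumerate(tokens) if i not in top]
    return predict_positive(tokens) - predict_positive(reduced)


tokens = ["i", "love", "this", "great", "movie"]
good_explanation = [0.0, 0.4, 0.0, 0.6, 0.0]  # highlights the sentiment words
poor_explanation = [0.5, 0.0, 0.3, 0.0, 0.2]  # highlights filler words

good_score = comprehensiveness(tokens, good_explanation, k=2)
poor_score = comprehensiveness(tokens, poor_explanation, k=2)
print(f"good: {good_score:.3f}, poor: {poor_score:.3f}")
```

Under this toy model the explanation that highlights the sentiment-bearing words receives the higher score, which is the kind of comparison a benchmarking framework would automate across explainers and datasets.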
