While there has been a recent explosion of work on Explainable AI (ExAI) for deep models that operate on imagery and tabular data, textual datasets present new challenges to the ExAI community. These challenges can be attributed to the lack of input structure in textual data, the use of word embeddings that add to the opacity of the models, and the difficulty of visualizing the inner workings of deep models trained on textual data. Recently, methods have been developed to address these challenges and provide satisfactory explanations of Natural Language Processing (NLP) models. However, such methods have yet to be studied in a comprehensive framework in which common challenges are properly stated and rigorous evaluation practices and metrics are proposed. Motivated to democratize ExAI methods in the NLP field, we present in this work a survey of model-agnostic as well as model-specific explainability methods for NLP models. Such methods can either develop inherently interpretable NLP models or operate on pre-trained models in a post-hoc manner. We make this distinction and further decompose the methods into three categories according to what they explain: (1) word embeddings (input-level), (2) inner workings of NLP models (processing-level), and (3) models' decisions (output-level). We also detail the different approaches to evaluating interpretability methods in the NLP field. Finally, we present a case study on the well-known task of neural machine translation in an appendix, and we propose promising future research directions for ExAI in the NLP field.