Deep neural networks are powerful statistical learners. However, their predictions do not come with an explanation of how they were reached. Explanation methods are being developed to analyze these models. We present a novel explanation method, called OLM, for natural language processing classifiers. The method combines occlusion and language modeling, techniques central to explainability and NLP, respectively. OLM gives explanations that are theoretically sound and easy to understand. We make several contributions to the theory of explanation methods. Axioms for explanation methods are a useful theoretical tool for examining their foundations and for deriving new methods. We introduce a new axiom, explain its intuition, and show that it contradicts an existing axiom. Additionally, we point out theoretical difficulties of existing gradient-based and some occlusion-based explanation methods in natural language processing. We argue in detail why evaluating explanation methods is difficult. We compare OLM to other explanation methods and show experimentally that its explanations are distinct. Finally, we investigate corner cases of OLM and discuss its validity and possible improvements.
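To make the combination of occlusion and language modeling concrete, the following is a minimal sketch of how such a relevance score might be computed: a language model proposes replacements for an occluded token, and the relevance of that token is the difference between the classifier's original prediction and the expected prediction over the proposed replacements. The pipeline setup, model names, and the helper functions `prob_positive` and `olm_style_relevance` are illustrative assumptions, not the paper's exact implementation.

```python
# Sketch only: occlusion + language modeling relevance, assuming
# off-the-shelf Hugging Face pipelines (model choices are illustrative).
from transformers import pipeline

classifier = pipeline("sentiment-analysis",
                      model="distilbert-base-uncased-finetuned-sst-2-english")
lm = pipeline("fill-mask", model="bert-base-uncased")  # proposes replacement tokens


def prob_positive(text: str) -> float:
    """Probability the classifier assigns to the POSITIVE class."""
    out = classifier(text)[0]
    return out["score"] if out["label"] == "POSITIVE" else 1.0 - out["score"]


def olm_style_relevance(tokens: list, position: int, top_k: int = 5) -> float:
    """Relevance of tokens[position]: original prediction minus the
    language-model-weighted expected prediction after replacing that token."""
    original = prob_positive(" ".join(tokens))

    # Occlude the token and let the language model propose replacements.
    masked = list(tokens)
    masked[position] = lm.tokenizer.mask_token
    candidates = lm(" ".join(masked), top_k=top_k)

    # Expected classifier output under the language model's proposals.
    total_weight = sum(c["score"] for c in candidates)
    expected = sum(
        c["score"] * prob_positive(" ".join(
            tokens[:position] + [c["token_str"].strip()] + tokens[position + 1:]))
        for c in candidates
    ) / total_weight

    return original - expected


tokens = "the movie was surprisingly good".split()
for i, tok in enumerate(tokens):
    print(f"{tok:>12}: {olm_style_relevance(tokens, i):+.3f}")
```

In this sketch, a large positive score for a token means the classifier's confidence drops sharply when the token is replaced by words the language model considers plausible in its context.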