Providing explanations for deep neural networks (DNNs) is essential for their use in domains where the interpretability of decisions is a critical prerequisite. Despite the plethora of work on interpreting DNNs, most existing solutions offer interpretability in an ad hoc, one-shot, and static manner, without accounting for the perception, understanding, or response of end-users, resulting in their poor usability in practice. In this paper, we argue that DNN interpretability should be implemented as interactions between users and models. We present i-Algebra, a first-of-its-kind interactive framework for interpreting DNNs. At its core is a library of atomic, composable operators that explain model behaviors at varying input granularities, during different inference stages, and from distinct interpretation perspectives. Leveraging a declarative query language, users can flexibly compose such operators to build a variety of analysis tools (e.g., "drill-down", "comparative", and "what-if" analysis). We prototype i-Algebra and conduct user studies on a set of representative analysis tasks, including inspecting adversarial inputs, resolving model inconsistency, and cleansing contaminated data, all demonstrating its promising usability.
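To make the idea of atomic, composable interpretation operators concrete, the following is a minimal sketch in Python. It is not i-Algebra's actual API: the operator names (attribute, select, compare), their signatures, the finite-difference attribution, and the toy linear model are all illustrative assumptions; the sketch only mirrors the abstract's description of composing operators into "drill-down" and "comparative" analyses.

```python
# Hypothetical sketch of composable interpretation operators, assuming a
# model that maps an input array to class scores and interpreters that map
# an input to a saliency map of the same shape. Not the paper's actual API.
import numpy as np
from typing import Callable

Model = Callable[[np.ndarray], np.ndarray]        # input -> class scores
Interpreter = Callable[[np.ndarray], np.ndarray]  # input -> saliency map


def attribute(model: Model, target: int, eps: float = 1e-3) -> Interpreter:
    """Atomic operator: per-feature attribution via finite differences
    (a stand-in for any gradient- or perturbation-based saliency method)."""
    def run(x: np.ndarray) -> np.ndarray:
        base = model(x)[target]
        saliency = np.zeros_like(x)
        for idx in np.ndindex(x.shape):
            perturbed = x.copy()
            perturbed[idx] += eps
            saliency[idx] = (model(perturbed)[target] - base) / eps
        return saliency
    return run


def select(interpreter: Interpreter, mask: np.ndarray) -> Interpreter:
    """Composable operator: restrict an interpretation to a user-chosen
    input region ("drill-down" into part of the input)."""
    return lambda x: interpreter(x) * mask


def compare(interpreter: Interpreter, x_ref: np.ndarray) -> Interpreter:
    """Composable operator: contrast an input's interpretation against a
    reference input ("comparative" analysis, e.g., benign vs. adversarial)."""
    return lambda x: interpreter(x) - interpreter(x_ref)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=(3, 8))
    toy_model: Model = lambda x: w @ x            # toy linear "DNN"

    x_benign = rng.normal(size=8)
    x_suspect = x_benign + 0.5 * rng.normal(size=8)
    region = np.zeros(8)
    region[:4] = 1.0                              # user-selected input region

    # Query-style composition: drill into a region of the suspect input and
    # compare its attribution against the benign reference.
    query = compare(select(attribute(toy_model, target=1), region), x_benign)
    print(query(x_suspect))
```

In this sketch, each operator returns another interpreter, so a declarative query reduces to nesting operator applications; a real query language would parse such compositions rather than require users to write them by hand.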