The proliferation of deep neural networks across various domains has led to an increased need for interpretability of these models, especially in scenarios where fairness and trust matter as much as model performance. A large body of independent work addresses this need by: i) analyzing what linguistic and non-linguistic knowledge these models learn, and ii) highlighting the salient parts of the input. We present NxPlain, a web application that explains a model's predictions using latent concepts. NxPlain discovers the latent concepts learned in a deep NLP model, provides an interpretation of the knowledge the model has learned, and explains its predictions in terms of these concepts. The application lets users browse the latent concepts in an intuitive order, efficiently scanning the most salient concepts through a global corpus-level view and a local sentence-level view. Our tool is useful for debugging, unraveling model bias, and highlighting spurious correlations in a model. A hosted demo is available at https://nxplain.qcri.org.
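To make the idea of latent concepts concrete, the sketch below shows one common way such concepts can be discovered: clustering contextualized token representations from a pretrained model so that each cluster forms a candidate concept. This is a minimal illustration, not NxPlain's actual pipeline; the model name, layer index, and number of clusters are illustrative assumptions.

```python
# Minimal sketch of latent-concept discovery by clustering contextualized
# token representations. Model, layer, and cluster count are assumptions
# for illustration only.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.cluster import AgglomerativeClustering

sentences = [
    "The movie was surprisingly good.",
    "Stock prices fell sharply on Monday.",
    "She booked a flight to Doha.",
]

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

tokens, vectors = [], []
with torch.no_grad():
    for sent in sentences:
        enc = tokenizer(sent, return_tensors="pt")
        out = model(**enc)
        layer = out.hidden_states[9][0]  # one intermediate layer (assumption)
        for tok, vec in zip(tokenizer.convert_ids_to_tokens(enc["input_ids"][0]), layer):
            if tok not in ("[CLS]", "[SEP]"):
                tokens.append(tok)
                vectors.append(vec.numpy())

# Group token representations; each cluster is a candidate latent concept
# that a tool like NxPlain could then label and relate to model predictions.
n_concepts = 5
clustering = AgglomerativeClustering(n_clusters=n_concepts).fit(vectors)
for cid in range(n_concepts):
    members = [t for t, c in zip(tokens, clustering.labels_) if c == cid]
    print(f"concept {cid}: {members}")
```

On a realistic corpus, clusters of this kind tend to group tokens by shared lexical, syntactic, or semantic properties, which is the kind of grouping a corpus-level concept view can then surface for inspection.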