Most explanation methods in deep learning map importance estimates for a model's prediction back to the original input space. These "visual" explanations are often insufficient, as the model's actual concept remains elusive. Moreover, without insight into the model's semantic concepts, it is difficult -- if not impossible -- to intervene on the model's behavior via its explanations, a process called Explanatory Interactive Learning. Consequently, we propose to intervene on a Neuro-Symbolic scene representation, which allows one to revise the model on the semantic level, e.g., "never focus on the color to make your decision". We compiled a novel confounded visual scene data set, the CLEVR-Hans data set, capturing complex compositions of different objects. The results of our experiments on CLEVR-Hans demonstrate that our semantic explanations, i.e., compositional explanations at a per-object level, can identify confounders that are not identifiable using "visual" explanations alone. More importantly, feedback on this semantic level makes it possible to revise the model so that it no longer focuses on these confounding factors.
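To make the idea of semantic feedback concrete, the following is a minimal sketch (not the authors' implementation) of how a user rule such as "never focus on the color" could be turned into a training penalty on a neuro-symbolic scene representation. It follows a right-for-the-right-reasons style loss: the input-gradient explanation of the prediction with respect to the symbolic scene tensor is penalized on user-forbidden concept dimensions. The names `concept_module`, `set_classifier`, and `color_mask` in the usage comment are hypothetical placeholders.

```python
# Minimal sketch of semantic explanatory interactive learning, assuming a
# symbolic scene tensor of shape (batch, objects, concepts) that is part of
# the computation graph producing the class logits.
import torch
import torch.nn.functional as F

def semantic_xil_loss(symbolic_scene, logits, targets, forbidden_mask,
                      lambda_expl=10.0):
    """symbolic_scene: (batch, objects, concepts), in the graph of `logits`.
    forbidden_mask:    same shape; 1.0 on concept dims the user forbids
                       (e.g. all color dimensions), 0.0 elsewhere."""
    # Standard prediction loss on the downstream set classifier output.
    ce = F.cross_entropy(logits, targets)

    # Input-gradient explanation of the true-class score w.r.t. the
    # symbolic scene representation (kept in the graph for backprop).
    batch_idx = torch.arange(targets.size(0), device=logits.device)
    grads = torch.autograd.grad(
        logits[batch_idx, targets].sum(),
        symbolic_scene, create_graph=True)[0]

    # Penalize explanation mass on the forbidden (confounding) concept dims.
    expl_penalty = (forbidden_mask * grads).pow(2).sum()
    return ce + lambda_expl * expl_penalty

# Usage sketch (hypothetical modules):
#   symbolic_scene = concept_module(images)        # per-object symbolic scene
#   logits = set_classifier(symbolic_scene)        # reasoning over the scene
#   loss = semantic_xil_loss(symbolic_scene, logits, targets, color_mask)
#   loss.backward()
```

Because the penalty is defined on symbolic concept dimensions rather than on pixels, a single mask expresses rules like "ignore color" for every object in the scene, which is the kind of per-object, compositional feedback the abstract refers to.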