Artificial neural networks have proven to be extremely useful models, enabling multiple recent breakthroughs in Artificial Intelligence and many other fields. However, they are typically regarded as black boxes, given how difficult it is for humans to interpret how these models reach their results. In this work, we propose a method that allows one to modify what an artificial neural network perceives regarding specific human-defined concepts, enabling the generation of hypothetical scenarios that could help in understanding and even debugging the model. Through empirical evaluation on a synthetic dataset and on the ImageNet dataset, we test the proposed method on different models, assessing whether the performed manipulations are well interpreted by the models and analyzing how they react to them.
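The abstract does not specify how the manipulation is realized. One common way to intervene on a human-defined concept is to shift an intermediate activation along a learned concept direction before the forward pass continues; the sketch below illustrates that general idea only, under this assumption. The chosen layer (the avgpool of a ResNet-18), the randomly initialized concept_vector, and the scaling factor alpha are illustrative placeholders, not the paper's actual procedure.

```python
# Minimal sketch: shifting an intermediate activation along a concept direction.
# All specifics (layer, concept_vector, alpha) are illustrative assumptions,
# not the method proposed in the paper.
import torch
import torchvision.models as models

model = models.resnet18().eval()  # in practice, a trained model would be used

# Hypothetical concept direction in the 512-dim feature space after avgpool.
# In practice such a direction would be learned from examples of the concept.
concept_vector = torch.randn(512)
concept_vector = concept_vector / concept_vector.norm()
alpha = 3.0  # strength of the manipulation (illustrative)

def shift_activation(module, inputs, output):
    # avgpool output has shape (N, 512, 1, 1); add the concept direction,
    # broadcasting over the batch dimension.
    return output + alpha * concept_vector.view(1, -1, 1, 1)

handle = model.avgpool.register_forward_hook(shift_activation)

x = torch.randn(1, 3, 224, 224)  # placeholder image batch
with torch.no_grad():
    logits_manipulated = model(x)  # forward pass with the concept shifted
handle.remove()
with torch.no_grad():
    logits_original = model(x)     # unmodified forward pass for comparison
```

Comparing logits_manipulated against logits_original gives a first indication of how a model reacts to the injected concept, mirroring the kind of analysis described in the abstract.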