By highlighting the regions of the input image that contribute the most to the decision, saliency maps have become a popular method to make neural networks interpretable. In medical imaging, they are particularly well-suited to explaining neural networks in the context of abnormality localization. However, our experiments indicate that they are less suited to classification problems in which the features that distinguish the classes are spatially correlated, scattered, and non-trivial. In this paper we therefore propose a new paradigm for better interpretability. To this end, we provide users with relevant and easily interpretable information so that they can form their own opinion. We use Disentangled Variational Auto-Encoders whose latent representation is divided into two components: a non-interpretable part and a disentangled part. The latter consists of categorical variables that explicitly represent the classes of interest. In addition to predicting the class of a given input sample, such a model makes it possible to transform a sample from one class into a sample of another class by modifying the value of the categorical variables in the latent representation. This paves the way to easier interpretation of class differences. We illustrate the relevance of this approach in the context of automatic sex determination from hip bones in forensic medicine. The features encoded by the model that distinguish the classes were found to be consistent with expert knowledge.
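The class-transformation idea described above can be illustrated with a minimal sketch. It is not the authors' model: the linear encoder and decoder, the random untrained weights, the sizes `N_PIXELS`, `N_CONT`, `N_CLASSES`, and the helper names `encode`, `decode`, and `transform_to_class` are all hypothetical, chosen only to show how a latent code split into a non-interpretable part and a categorical part allows a sample to be decoded under a different class.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen for illustration only.
N_PIXELS = 16   # flattened input size
N_CONT = 4      # size of the non-interpretable latent part
N_CLASSES = 2   # number of classes in the categorical latent part

# Random, untrained weights standing in for a trained encoder and decoder.
W_cont = rng.normal(size=(N_PIXELS, N_CONT))
W_cat = rng.normal(size=(N_PIXELS, N_CLASSES))
W_dec = rng.normal(size=(N_CONT + N_CLASSES, N_PIXELS))


def one_hot(k, n=N_CLASSES):
    """Return a one-hot vector of length n with a 1 at index k."""
    v = np.zeros(n)
    v[k] = 1.0
    return v


def encode(x):
    """Map an input to (non-interpretable code, class probabilities)."""
    z = x @ W_cont
    logits = x @ W_cat
    probs = np.exp(logits - logits.max())  # numerically stable softmax
    probs /= probs.sum()
    return z, probs


def decode(z, class_vec):
    """Reconstruct an input from both latent parts."""
    return np.concatenate([z, class_vec]) @ W_dec


def transform_to_class(x, target_class):
    """Decode x under its predicted class and under target_class.

    The non-interpretable code z is kept fixed; only the categorical
    part of the latent representation is changed.
    """
    z, probs = encode(x)
    predicted = int(np.argmax(probs))
    reconstruction = decode(z, one_hot(predicted))
    transformed = decode(z, one_hot(target_class))
    return predicted, reconstruction, transformed


x = rng.normal(size=N_PIXELS)
predicted, _, _ = transform_to_class(x, target_class=0)
other_class = 1 - predicted
_, reconstruction, transformed = transform_to_class(x, other_class)

# Where the two decodings differ is what the model attributes to class.
difference = transformed - reconstruction
```

With a trained model, inspecting `difference` (or viewing `reconstruction` and `transformed` side by side) would show the user which regions change when only the class variable is altered, which is the kind of information the approach aims to provide.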