Interpretability has become a necessary feature for machine learning models deployed in critical scenarios such as the legal system and healthcare. In these settings, algorithmic decisions may have long-lasting and potentially negative effects on the end users they concern. In many cases the representational power of deep learning models is not needed, so simple and interpretable models (e.g. linear models) should be preferred. However, in high-dimensional and/or complex domains (e.g. computer vision), the universal approximation capabilities of neural networks are required. Inspired by linear models and the Kolmogorov-Arnold representation theorem, we propose a novel class of structurally constrained neural networks, which we call FLANs (Feature-wise Latent Additive Networks). Crucially, FLANs process each input feature separately, computing for each of them a representation in a common latent space. These feature-wise latent representations are then simply summed, and the aggregated representation is used for prediction. These constraints, which are at the core of the interpretability of linear models, allow a user to estimate the effect of each individual feature independently of the others, enhancing interpretability. In a set of experiments across different domains, we show that, without excessively compromising test performance, the structural constraints proposed in FLANs indeed facilitate the interpretability of deep learning models. We quantitatively compare the interpretability of FLANs to that of post-hoc methods using recently introduced metrics, and discuss the advantages of natively interpretable models over post-hoc analysis.
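To make the structural constraint concrete, the following is a minimal, dependency-free sketch of the forward pass described above: one small network per input feature maps that feature into a shared latent space, the latent vectors are summed, and a predictor network acts on the sum. It is an illustration under stated assumptions, not the authors' implementation. The helper names (`init_mlp`, `mlp`, `init_flan`, `feature_latents`, `flan_forward`), the layer sizes, the tanh activation, and the use of a separate network per feature are all hypothetical choices made for this example, and the weights are random rather than trained.

```python
import math
import random


def init_mlp(sizes, rng):
    """Random weights for a small MLP; sizes = [in, hidden, ..., out]."""
    layers = []
    for m, n in zip(sizes[:-1], sizes[1:]):
        W = [[rng.gauss(0.0, 1.0 / math.sqrt(m)) for _ in range(n)] for _ in range(m)]
        layers.append((W, [0.0] * n))
    return layers


def mlp(layers, v):
    """Apply the MLP to one vector: tanh on hidden layers, linear output layer."""
    for i, (W, b) in enumerate(layers):
        v = [sum(v[j] * W[j][k] for j in range(len(v))) + b[k] for k in range(len(b))]
        if i < len(layers) - 1:
            v = [math.tanh(a) for a in v]
    return v


def init_flan(n_features, latent_dim, n_outputs, rng, hidden=16):
    # Assumption: one independent network per feature, all mapping into
    # the same latent_dim-dimensional space.
    feature_nets = [init_mlp([1, hidden, latent_dim], rng) for _ in range(n_features)]
    predictor = init_mlp([latent_dim, hidden, n_outputs], rng)
    return feature_nets, predictor


def feature_latents(feature_nets, x):
    """Latent vector of each feature, computed without looking at other features."""
    return [mlp(net, [x_i]) for net, x_i in zip(feature_nets, x)]


def flan_forward(model, x):
    feature_nets, predictor = model
    latents = feature_latents(feature_nets, x)
    # Aggregate by plain summation across features, coordinate by coordinate.
    z = [sum(coords) for coords in zip(*latents)]
    return mlp(predictor, z)


model = init_flan(n_features=4, latent_dim=8, n_outputs=3, rng=random.Random(0))
x = [0.5, -1.2, 0.3, 2.0]
y = flan_forward(model, x)  # list of 3 untrained output scores
```

Because each latent vector depends on a single feature, perturbing one input changes only that feature's term in the sum. In this sketch, that separability is what would let a user inspect a feature's contribution in isolation, for example by comparing the entries of `feature_latents(model[0], x)` before and after changing one coordinate of `x`.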