Interpretability has become a necessary feature for machine learning models deployed in critical scenarios such as legal systems and healthcare. In these situations, algorithmic decisions may have (potentially negative) long-lasting effects on the end-user affected by the decision. In many cases, the representational power of deep learning models is not needed, so simple and interpretable models (e.g. linear models) should be preferred. However, in high-dimensional and/or complex domains (e.g. computer vision), the universal approximation capabilities of neural networks are required. Inspired by linear models and the Kolmogorov-Arnold representation theorem, we propose a novel class of structurally constrained neural networks, which we call FLANs (Feature-wise Latent Additive Networks). Crucially, FLANs process each input feature separately, computing for each of them a representation in a common latent space. These feature-wise latent representations are then simply summed, and the aggregated representation is used for prediction. These constraints (which are at the core of the interpretability of linear models) allow a user to estimate the effect of each individual feature independently of the others, enhancing interpretability. In a set of experiments across different domains, we show that, without excessively compromising test performance, the structural constraints proposed in FLANs indeed increase the interpretability of deep learning models.
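The structure described above (one map per feature into a shared latent space, a sum over features, then a prediction from the aggregate) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the names `make_feature_net`, `make_linear_head`, and `flan_forward` are hypothetical, each feature map is a single random tanh layer, the weights are untrained, and the prediction head is linear. The linear head is the strongest simplification: it makes the output decompose exactly into per-feature terms, whereas the abstract only states that the summed representation is used for prediction.

```python
import math
import random


def make_feature_net(latent_dim, rng):
    """Hypothetical per-feature map: scalar feature -> vector in the shared latent space."""
    w = [rng.gauss(0, 1) for _ in range(latent_dim)]
    b = [rng.gauss(0, 1) for _ in range(latent_dim)]

    def phi(x_i):
        # One tanh layer; a real model would use a deeper, trained network.
        return [math.tanh(w_k * x_i + b_k) for w_k, b_k in zip(w, b)]

    return phi


def make_linear_head(latent_dim, rng):
    """Hypothetical prediction head. Linear here purely for simplicity (an assumption)."""
    v = [rng.gauss(0, 1) for _ in range(latent_dim)]

    def head(z):
        return sum(v_k * z_k for v_k, z_k in zip(v, z))

    return head


def flan_forward(x, feature_nets, head):
    """Process each feature separately, sum the latents, predict from the sum."""
    latents = [phi(x_i) for phi, x_i in zip(feature_nets, x)]
    aggregated = [sum(column) for column in zip(*latents)]
    return head(aggregated), latents


# Usage: three input features, a four-dimensional shared latent space.
rng = random.Random(0)
feature_nets = [make_feature_net(4, rng) for _ in range(3)]
head = make_linear_head(4, rng)
prediction, latents = flan_forward([0.5, -1.0, 2.0], feature_nets, head)

# With the linear head assumed here, each feature's contribution can be read off directly.
contributions = [head(z_i) for z_i in latents]
```

Because each latent depends on a single feature, changing one input leaves the other features' latents untouched, which is the property that lets a user inspect features one at a time. With a nonlinear head, the contributions would no longer add up exactly to the prediction, and attribution would have to be done in the latent space instead.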