Linear Iterative Feature Embedding (LIFE), a new ensemble framework for interpretable models, has been developed to achieve high prediction accuracy, easy interpretation, and efficient computation simultaneously. The LIFE algorithm fits a wide single-hidden-layer neural network (NN) accurately in three steps: (1) defining subsets of the dataset by the linear projections of neural nodes, (2) creating features from multiple narrow single-hidden-layer NNs trained on the different subsets of the data, and (3) combining the features with a linear model. The theoretical rationale behind LIFE is provided through its connection to the loss ambiguity decomposition of stacking ensemble methods. Both simulation and empirical experiments confirm that LIFE consistently outperforms directly trained single-hidden-layer NNs and, in many experiments, also outperforms other benchmark models, including multi-layer feed-forward neural networks (FFNN), XGBoost, and Random Forest (RF). Because the fitted model is a wide single-hidden-layer NN, LIFE is intrinsically interpretable: variable importance and global main and interaction effects can be easily computed and visualized. In addition, the base learners are built independently of one another, so LIFE is computationally efficient when run with parallel computing.
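The three steps named in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how a node's linear projection defines a subset, so the sketch assumes a subset is the set of samples whose projection is positive. The plain gradient-descent trainer and the least-squares combiner are stand-ins, and the names `fit_narrow_nn`, `life_sketch`, and `life_predict` are hypothetical.

```python
import numpy as np

def fit_narrow_nn(X, y, n_hidden, rng, epochs=500, lr=0.02):
    """Train a narrow single-hidden-layer ReLU network by full-batch
    gradient descent on squared error; return hidden weights and biases."""
    n, d = X.shape
    W = rng.normal(scale=1.0 / np.sqrt(d), size=(d, n_hidden))
    b = np.zeros(n_hidden)
    v = rng.normal(scale=1.0 / np.sqrt(n_hidden), size=n_hidden)
    c = 0.0
    for _ in range(epochs):
        Z = X @ W + b                    # linear projections of the nodes
        H = np.maximum(Z, 0.0)           # ReLU activations
        err = H @ v + c - y              # residuals
        dZ = np.outer(err, v) * (Z > 0)  # backpropagated error per node
        W -= lr * (X.T @ dZ) / n
        b -= lr * dZ.mean(axis=0)
        v -= lr * (H.T @ err) / n
        c -= lr * err.mean()
    return W, b

def life_sketch(X, y, n_subsets=4, n_hidden=3, min_size=10, seed=0):
    rng = np.random.default_rng(seed)
    # Step 1 (assumed rule): an initial narrow NN; node j defines the
    # subset of samples whose linear projection is positive.
    W0, b0 = fit_narrow_nn(X, y, n_subsets, rng)
    proj = X @ W0 + b0
    # Step 2: one narrow NN per subset. The fits are independent of each
    # other, so this loop could be run in parallel.
    blocks = []
    for j in range(n_subsets):
        mask = proj[:, j] > 0
        if mask.sum() < min_size:        # assumption: fall back to all data
            mask[:] = True
        blocks.append(fit_narrow_nn(X[mask], y[mask], n_hidden, rng))
    W = np.hstack([w for w, _ in blocks])       # stacked nodes: a wide layer
    b = np.concatenate([bb for _, bb in blocks])
    # Step 3: combine the hidden features with a linear model (least squares).
    A = np.column_stack([np.maximum(X @ W + b, 0.0), np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return W, b, coef

def life_predict(X, W, b, coef):
    """Forward pass of the resulting wide single-hidden-layer network."""
    return np.maximum(X @ W + b, 0.0) @ coef[:-1] + coef[-1]

# Usage on synthetic data (illustrative only)
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(200, 3))
y_demo = np.abs(X_demo[:, 0]) - 0.5 * X_demo[:, 1]
W, b, coef = life_sketch(X_demo, y_demo)
print("training MSE:", np.mean((life_predict(X_demo, W, b, coef) - y_demo) ** 2))
```

In this sketch the returned `W`, `b`, and `coef` together form a single wide hidden layer with a linear output, which is the structure the abstract points to when it calls LIFE intrinsically interpretable.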