The lack of interpretability and transparency prevents economists from using advanced tools like neural networks in their empirical research. In this paper, we propose a class of interpretable neural network models that can achieve both high prediction accuracy and interpretability. The model can be written as a simple function of a regularized number of interpretable features, which are the outcomes of interpretable functions encoded in the neural network. Researchers can design different forms of interpretable functions based on the nature of their tasks. In particular, we encode a class of interpretable functions named persistent change filters in the neural network to study time-series cross-sectional data. We apply the model to predicting individuals' monthly employment status using high-dimensional administrative data, and achieve an accuracy of 94.5% on the test set, comparable to the best-performing conventional machine learning methods. Furthermore, the interpretability of the model allows us to understand the mechanism underlying the predictions: an individual's employment status is closely related to whether she pays different types of insurance. Our work is a useful step towards overcoming the black-box problem of neural networks, and provides a new tool for economists to study administrative and proprietary big data.
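To make the idea concrete, the minimal sketch below shows one way a persistent change filter could map a raw covariate series (e.g., monthly insurance payments) into a single interpretable feature, with the final prediction written as a simple logistic function of a few such features. The functional form here (a soft threshold averaged over a trailing window), the parameter names (b, k, window), and the toy data are illustrative assumptions, not the exact specification of the model described above.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def persistent_change_filter(x, b, k, window):
    """Illustrative persistent change filter (assumed form, not the paper's).

    Soft-thresholds each observation of the series x at level b with
    sharpness k, then averages over the trailing `window` periods, so the
    output is close to 1 only if x has stayed above b persistently.
    """
    soft = sigmoid(k * (np.asarray(x, dtype=float) - b))  # per-period soft indicator
    return soft[-window:].mean()                          # persistence over the window

def predict_employment(series_list, params, w0, weights):
    """Simple logistic function of a small number of filter outputs."""
    feats = np.array([
        persistent_change_filter(x, b, k, win)
        for x, (b, k, win) in zip(series_list, params)
    ])
    return sigmoid(w0 + weights @ feats)

# Toy usage: two covariate series for one individual.
x_pension = [0.0, 0.0, 1.2, 1.1, 1.3, 1.2]   # persistent upward shift
x_medical = [0.9, 1.0, 0.2, 1.1, 0.1, 0.9]   # noisy, no persistent change
p = predict_employment(
    [x_pension, x_medical],
    params=[(0.5, 10.0, 4), (0.5, 10.0, 4)],  # (threshold b, sharpness k, window)
    w0=-1.0,
    weights=np.array([2.0, 2.0]),
)
print(f"P(employed) = {p:.3f}")
```

In a trained model of this kind, the filter parameters would be learned jointly with the output layer, and the regularization mentioned above would keep the number of active features small, which is what makes the fitted model readable.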