Neural networks have been highly successful across a wide variety of applications. However, recent work has observed that their performance often fails to generalize under distributional shift. Several efforts have been made to identify potentially out-of-distribution inputs. Although the existing literature has made significant progress on image and text data, finance has been overlooked. This paper investigates distribution shift in credit scoring, one of the most important applications in finance. To address potential distribution shift, we propose a novel two-stage model. In the first stage, an out-of-distribution detection method separates the data into confident and unconfident sets. In the second stage, we combine domain knowledge with mean-variance optimization to provide reliable bounds for the unconfident samples. Empirical results demonstrate that our model offers reliable predictions for the vast majority of samples. Only a small portion of the dataset is inherently difficult to judge, and we leave those samples to human judgment. The two-stage model thus yields highly confident predictions and significantly reduces the potential risks associated with the model.
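As a rough illustration of the first stage only, the sketch below splits samples into confident and unconfident sets. It assumes a maximum-softmax-probability score as the out-of-distribution signal and a threshold of 0.9; both are common illustrative choices and are not stated in the abstract, and the function names are hypothetical.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def split_by_confidence(logits, threshold=0.9):
    """Return a boolean mask of confident samples and the per-sample score.

    Assumption: the score is the maximum softmax probability, a standard
    baseline for out-of-distribution detection. The paper's actual detector
    and threshold may differ.
    """
    probs = softmax(np.asarray(logits, dtype=float))
    confidence = probs.max(axis=1)
    return confidence >= threshold, confidence


# Toy classifier outputs for three applicants (two classes: repay / default).
logits = np.array([[4.0, 0.0], [0.2, 0.1], [-3.0, 3.0]])
mask, conf = split_by_confidence(logits)

confident_idx = np.flatnonzero(mask)     # predicted directly by the model
unconfident_idx = np.flatnonzero(~mask)  # passed on to the second stage
```

Samples in the unconfident set would then go to the second stage, where bounds are derived from domain knowledge and mean-variance optimization.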