One of the most common problems preventing the application of prediction models in the real world is a lack of generalization: the accuracy of a model, as measured on a benchmark, often does not carry over to future data, e.g. in real business settings. Relatively few methods exist that estimate the confidence of prediction models. In this paper, we propose novel methods that, given a neural network classification model, estimate the uncertainty of particular predictions generated by this model. Furthermore, we propose a method that, given a model and a confidence level, calculates a threshold separating the predictions generated by this model into two subsets, one of which meets the given confidence level. In contrast to other methods, the proposed methods do not require any changes to existing neural networks, because they simply build on the output logit layer of a common neural network. In particular, the methods infer the confidence of a particular prediction from the distribution of the logit values corresponding to that prediction. The proposed methods constitute a tool recommended for filtering predictions in knowledge-extraction processes, e.g. those based on web scraping, where subsets of predictions are identified that maximize precision at the cost of recall, recall being less important due to the abundance of available data. The methods have been tested on different tasks, including relation extraction, named entity recognition, and image classification, demonstrating a significant increase in the accuracy of the filtered predictions.
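As an illustrative aside, the following minimal Python/NumPy sketch shows how a logit-based confidence filter of the kind described above might be calibrated. It assumes the margin between the top two logits as the confidence score and a sweep over a held-out validation set to find the smallest threshold meeting a target accuracy; the function names and the margin score are hypothetical stand-ins, not the paper's actual method.

    import numpy as np

    def confidence_scores(logits):
        # Score each prediction by the margin between the two largest
        # logits (a hypothetical proxy for a logit-distribution-based
        # confidence score).
        sorted_logits = np.sort(logits, axis=1)
        return sorted_logits[:, -1] - sorted_logits[:, -2]

    def calibrate_threshold(logits, labels, target_accuracy):
        # Find the smallest score threshold such that the subset of
        # predictions whose score meets it reaches the target accuracy
        # on held-out data; this maximizes the kept fraction (recall)
        # at the requested precision level.
        scores = confidence_scores(logits)
        correct = logits.argmax(axis=1) == labels
        for t in np.sort(scores):
            mask = scores >= t
            if correct[mask].mean() >= target_accuracy:
                return t
        return None  # target accuracy unreachable on this set

    # Usage sketch: calibrate on a validation set, then keep only test
    # predictions scoring above the threshold (assumes i.i.d. data).
    rng = np.random.default_rng(0)
    val_logits = rng.normal(size=(1000, 5))
    val_labels = rng.integers(0, 5, size=1000)
    t = calibrate_threshold(val_logits, val_labels, target_accuracy=0.9)

Because the threshold is chosen as the smallest value meeting the target, all predictions with higher scores are retained, trading recall for precision exactly as the filtering use case requires.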