Deep neural networks are powerful tools to detect hidden patterns in data and leverage them to make predictions, but they are not designed to understand uncertainty and estimate reliable probabilities. In particular, they tend to be overconfident. We begin to address this problem in the context of multi-class classification by developing a novel training algorithm producing models with more dependable uncertainty estimates, without sacrificing predictive power. The idea is to mitigate overconfidence by minimizing a loss function, inspired by advances in conformal inference, that quantifies model uncertainty by carefully leveraging hold-out data. Experiments with synthetic and real data demonstrate this method can lead to smaller conformal prediction sets with higher conditional coverage, after exact calibration with hold-out data, compared to state-of-the-art alternatives.
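The calibration step mentioned above can be illustrated with standard split-conformal classification: hold-out scores yield a quantile threshold, and test-time prediction sets collect every class below it. This is a minimal sketch with simulated softmax outputs and a simple `1 - probability` score; the data, class count, and score choice are illustrative assumptions, not the paper's specific loss or method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: simulated softmax outputs of a 3-class classifier.
n_cal, n_test, n_classes = 500, 200, 3

def simulate(n):
    y = rng.integers(0, n_classes, size=n)
    logits = rng.normal(size=(n, n_classes))
    logits[np.arange(n), y] += 2.0  # make the true class more likely
    p = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    return p, y

p_cal, y_cal = simulate(n_cal)    # hold-out (calibration) data
p_test, y_test = simulate(n_test)

alpha = 0.1  # target miscoverage level (90% coverage)

# Conformity score on hold-out data: 1 - softmax probability of the true class.
scores = 1.0 - p_cal[np.arange(n_cal), y_cal]

# Finite-sample-corrected quantile of the calibration scores.
level = np.ceil((n_cal + 1) * (1 - alpha)) / n_cal
q = np.quantile(scores, level, method="higher")

# Prediction set for each test point: all classes whose score is below the threshold.
pred_sets = (1.0 - p_test) <= q

coverage = pred_sets[np.arange(n_test), y_test].mean()
avg_size = pred_sets.sum(axis=1).mean()
print(f"empirical coverage: {coverage:.3f}, average set size: {avg_size:.2f}")
```

An overconfident model tends to produce sets that are too small and under-cover; the training objective described in the abstract aims to make these hold-out-calibrated sets both small and valid.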