Early stopping based on hold-out data is a popular regularization technique designed to mitigate overfitting and increase the predictive accuracy of neural networks. Models trained with early stopping often provide relatively accurate predictions, but they generally still lack precise statistical guarantees unless they are further calibrated using independent hold-out data. This paper addresses the above limitation with conformalized early stopping: a novel method that combines early stopping with conformal calibration while efficiently recycling the same hold-out data. This leads to models that are both accurate and able to provide exact predictive inferences without multiple data splits or overly conservative adjustments. Practical implementations are developed for different learning tasks -- outlier detection, multi-class classification, regression -- and their competitive performance is demonstrated on real data.
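As background for the calibration step mentioned above, the standard split-conformal procedure that uses independent hold-out data can be sketched as follows. This is a toy illustration of conventional split-conformal regression with assumed variable names, not the paper's conformalized early stopping algorithm:

```python
import numpy as np

def conformal_interval(residuals_cal, y_pred_test, alpha=0.1):
    """Split-conformal prediction interval for regression.

    residuals_cal: absolute residuals |y - yhat| computed on an
                   independent hold-out (calibration) set.
    y_pred_test:   point predictions for the test inputs.
    alpha:         target miscoverage level (e.g. 0.1 for 90% coverage).
    """
    n = len(residuals_cal)
    # Conformal quantile level with the finite-sample correction
    # that yields the exact marginal coverage guarantee.
    q_level = np.ceil((n + 1) * (1 - alpha)) / n
    qhat = np.quantile(residuals_cal, min(q_level, 1.0), method="higher")
    # Symmetric interval around each point prediction.
    return y_pred_test - qhat, y_pred_test + qhat
```

Conventionally, the hold-out data used here must be disjoint from the data used for early stopping; the abstract's contribution is to recycle a single hold-out set for both purposes without losing exactness.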