In this paper we develop a novel regularization method for deep neural networks that penalizes the trace of the Hessian of the loss. The regularizer is motivated by a recently established generalization error bound. Hutchinson's method is a classical unbiased estimator of the trace of a matrix, but it is very time-consuming on deep learning models; we therefore propose a dropout scheme that implements Hutchinson's method efficiently. We then discuss connections to the linear stability of nonlinear dynamical systems and to flat/sharp minima. Experiments demonstrate that our method outperforms existing regularizers and data augmentation methods, including the Jacobian regularizer, confidence penalty, label smoothing, cutout, and mixup.
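To make the estimator concrete, below is a minimal PyTorch sketch of Hutchinson's trace estimator applied to the Hessian of the training loss, using Hessian-vector products computed by double backpropagation. The function name hessian_trace_penalty and the arguments n_samples and lam are illustrative placeholders, not the authors' implementation, and the dropout-based acceleration proposed in the paper is not shown.

```python
import torch

def hessian_trace_penalty(loss, params, n_samples=1):
    # Hutchinson estimator: Tr(H) = E_z[z^T H z] with Rademacher z,
    # where H is the Hessian of `loss` w.r.t. `params`.
    grads = torch.autograd.grad(loss, params, create_graph=True)
    estimate = loss.new_zeros(())
    for _ in range(n_samples):
        # Rademacher probe vectors (entries +1 or -1), one per parameter tensor.
        zs = [torch.randint_like(p, high=2) * 2 - 1 for p in params]
        # Hessian-vector product Hz, obtained as the gradient of <grad, z>.
        grad_dot_z = sum((g * z).sum() for g, z in zip(grads, zs))
        hzs = torch.autograd.grad(grad_dot_z, params, create_graph=True)
        # Accumulate z^T H z, summed over all parameter tensors.
        estimate = estimate + sum((z * hz).sum() for z, hz in zip(zs, hzs))
    return estimate / n_samples

# Usage sketch (model, criterion, x, y, lam are hypothetical):
# loss = criterion(model(x), y)
# total = loss + lam * hessian_trace_penalty(loss, list(model.parameters()))
# total.backward()
```

Note that create_graph=True on the second autograd call keeps the penalty differentiable so it can be optimized alongside the loss, at the cost of a higher-order backward pass; this expense is the motivation for the more efficient scheme the abstract describes.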