Effective regularization techniques are highly desired in deep learning for alleviating overfitting and improving generalization. This work proposes a new regularization scheme, based on the understanding that flat local minima of the empirical risk cause the model to generalize better. This scheme is referred to as adversarial model perturbation (AMP), where instead of directly minimizing the empirical risk, an alternative "AMP loss" is minimized via SGD. Specifically, the AMP loss is obtained from the empirical risk by applying the "worst" norm-bounded perturbation at each point in the parameter space. Compared with most existing regularization schemes, AMP has strong theoretical justification, in that minimizing the AMP loss can be shown theoretically to favor flat local minima of the empirical risk. Extensive experiments on various modern deep architectures establish AMP as a new state of the art among regularization schemes. Our code is available at https://github.com/hiyouga/AMP-Regularizer.
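To make the scheme concrete: where standard training minimizes the empirical risk L(θ), AMP minimizes the perturbed objective max_{||δ|| ≤ ε} L(θ + δ). The sketch below is a minimal, illustrative PyTorch training step (not the repository's API); the function name `amp_step` and the hyperparameter `epsilon` are placeholders, and the inner maximization is approximated with a single first-order ascent step along the normalized gradient.

```python
import torch

def amp_step(model, loss_fn, inputs, targets, optimizer, epsilon=0.5):
    # 1) Gradient of the empirical risk at the current parameters theta.
    optimizer.zero_grad()
    loss_fn(model(inputs), targets).backward()

    # Snapshot (parameter, gradient) pairs; the "worst" norm-bounded
    # perturbation is approximated to first order by one ascent step
    # of length epsilon along this gradient direction.
    pairs = [(p, p.grad.detach().clone())
             for p in model.parameters() if p.grad is not None]
    norm = torch.sqrt(sum(g.pow(2).sum() for _, g in pairs)).item()
    scale = epsilon / (norm + 1e-12)

    # 2) Move to the perturbed point theta + epsilon * grad / ||grad||.
    with torch.no_grad():
        for p, g in pairs:
            p.add_(g, alpha=scale)

    # 3) The AMP loss is the empirical risk at the perturbed point;
    #    backpropagate it to get the gradient used for the update.
    optimizer.zero_grad()
    amp_loss = loss_fn(model(inputs), targets)
    amp_loss.backward()

    # 4) Restore theta, then update it with the AMP-loss gradient.
    with torch.no_grad():
        for p, g in pairs:
            p.sub_(g, alpha=scale)
    optimizer.step()
    return amp_loss.item()
```

The key design point this sketch illustrates is that the perturbation is applied to the model parameters rather than the inputs, so each update follows the gradient of the worst-case nearby risk, which steers SGD toward flat minima.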