Most complex machine learning and modelling techniques are prone to over-fitting and may consequently generalise poorly to future data. Artificial neural networks are no different in this regard and, despite having a level of implicit regularisation when trained with gradient descent, often require the aid of explicit regularisers. We introduce a new framework, Model Gradient Similarity (MGS), that (1) serves as a metric of regularisation, which can be used to monitor neural network training, (2) adds insight into how explicit regularisers, despite being derived from widely different principles, operate via the same underlying mechanism of increasing MGS, and (3) provides the basis for a new regularisation scheme which exhibits excellent performance, especially in challenging settings such as high levels of label noise or limited sample sizes.
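The abstract does not spell out how MGS is computed, so the following is only a hedged illustration of point (1), monitoring gradient similarity during training. It assumes MGS can be summarised as the mean pairwise cosine similarity of per-sample loss gradients with respect to the model parameters; the helper name `model_gradient_similarity` and this cosine-similarity summary are assumptions for illustration, not the paper's definition.

```python
import torch
import torch.nn.functional as F

def model_gradient_similarity(model, loss_fn, xs, ys):
    """Hypothetical sketch: mean pairwise cosine similarity of
    per-sample gradients (one flattened gradient vector per sample).
    A high value would indicate that samples pull the parameters in
    similar directions, which is the intuition the abstract gestures at."""
    params = [p for p in model.parameters() if p.requires_grad]
    grads = []
    for x, y in zip(xs, ys):
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        g = torch.autograd.grad(loss, params)
        grads.append(torch.cat([gi.reshape(-1) for gi in g]))
    G = torch.stack(grads)          # shape: (n_samples, n_params)
    G = F.normalize(G, dim=1)       # unit-norm rows
    S = G @ G.T                     # pairwise cosine similarities
    n = S.shape[0]
    off_diag = S[~torch.eye(n, dtype=torch.bool)]
    return off_diag.mean()          # scalar summary to log per step/epoch
```

Logged once per epoch alongside training and validation loss, such a quantity could serve as the kind of regularisation monitor described in point (1); the paper's actual metric may differ in both the gradients used and the similarity measure.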