Jitter: Random Jitling Loss 函数 (Jitter: Random Jittering Loss Function)

Regularization plays a vital role in machine learning optimization. One novel regularization method called flooding makes the training loss fluctuate around the flooding level. It intends to make the model continue to random walk until it comes to a flat loss landscape to enhance generalization. However, the hyper-parameter flooding level of the flooding method fails to be selected properly and uniformly. We propose a novel method called Jitter to improve it. Jitter is essentially a kind of random loss function. Before training, we randomly sample the Jitter Point from a specific probability distribution. The flooding level should be replaced by Jitter point to obtain a new target function and train the model accordingly. As Jitter point acting as a random factor, we actually add some randomness to the loss function, which is consistent with the fact that there exists innumerable random behaviors in the learning process of the machine learning model and is supposed to make the model more robust. In addition, Jitter performs random walk randomly which divides the loss curve into small intervals and then flipping them over, ideally making the loss curve much flatter and enhancing generalization ability. Moreover, Jitter can be a domain-, task-, and model-independent regularization method and train the model effectively after the training error reduces to zero. Our experimental results show that Jitter method can improve model performance more significantly than the previous flooding method and make the test loss curve descend twice.

翻译：在机器学习优化中, 常规化具有关键作用。一种叫洪水的新颖的正规化方法, 叫做“ 洪水”, 使培训损失在洪水水平上波动。它打算让模型继续随机行走, 直至形成一个平坦的损失景观, 以强化概括化。但是, 洪涝方法的超参数洪涝水平没有被正确和统一地选择。我们提议了一个叫作 Jitter 的新型方法来改进它。 Jitter 基本上是一种随机丢失功能。在培训之前, 我们随机地从特定的概率分布中抽取Jitter Point 。洪水水平应该由 Jitter 点取代, 以获得一个新的目标函数并相应培训模型。作为随机因素, 我们实际上会给损失函数添加一些随机性, 以强化总体化能力。与机器学习模型中存在无数随机行为的事实一致, 我们提议了一个叫Jitter 随机行走,, 将损失曲线分成一个小间隔, 然后翻转, 理想的是让损失曲线曲线变得非常贴, 并增强一般化能力。此外, Jitter 可以成为一个域、任务、和模型化的模型化方法, 能够有效地降低我们之前的模型, 的模型和模型化方法。

相关内容

损失函数（机器学习）

关注 10

损失函数，在AI中亦称呼距离函数，度量函数。此处的距离代表的是抽象性的，代表真实数据与预测数据之间的误差。损失函数（loss function）是用来估量你模型的预测值f(x)与真实值Y的不一致程度，它是一个非负实值函数,通常使用L(Y, f(x))来表示，损失函数越小，模型的鲁棒性就越好。损失函数是经验风险函数的核心部分，也是结构风险函数重要组成部分。

INRIA最新「机器学习理论」新书，229页pdf原理性阐述机器学习

专知会员服务

69+阅读 · 2021年3月27日

【万字长文】注意力机制可解释大论述

专知会员服务

55+阅读 · 2020年11月17日

【Google】平滑对抗训练，Smooth Adversarial Training

专知会员服务

49+阅读 · 2020年7月4日

【SIGIR2020】一个统一的双视图模型，用于具有不一致性损失的评论总结和情绪分类，A Unified Dual-view Model for Review Summarization and Sentiment Classification with Inconsistency Loss

专知会员服务

22+阅读 · 2020年6月3日