Training for the Future: A Simple Gradient Interpolation Loss to Generalize Along Time

In several real-world applications, machine learning models are deployed to make predictions on data whose distribution changes gradually over time, leading to a drift between the train and test distributions. Such models are often retrained periodically on new data, so they need to generalize to data not too far into the future. There is much prior work on enhancing temporal generalization in this setting, e.g., continuous transportation of past data, kernel-smoothed time-sensitive parameters, and, more recently, adversarial learning of time-invariant features. However, these methods share several limitations, e.g., poor scalability, training instability, and dependence on unlabeled data from the future. To address these limitations, we propose a simple method that starts with a model with time-sensitive parameters but regularizes its temporal complexity using a Gradient Interpolation (GI) loss. GI allows the decision boundary to change along time while still preventing overfitting to the limited training time snapshots, by allowing task-specific control over changes along time. We compare our method to existing baselines on multiple real-world datasets; the results show that GI outperforms more complicated generative and adversarial approaches on the one hand, and simpler gradient regularization methods on the other.
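The abstract does not spell out the form of the GI loss, so the following is only an illustrative sketch of one plausible reading: a time-sensitive model is asked to also predict correctly when its output at a nearby time `t - delta` is extrapolated back to `t` with a first-order Taylor step along time. The toy model, the squared-error loss, the names `model`, `dmodel_dt`, `interpolated_prediction`, `gi_loss`, `total_loss`, and the weight `lam` are all assumptions made for illustration, not details taken from the paper.

```python
import math

def model(params, x, t):
    """Hypothetical toy model whose slope depends on time t."""
    w0, w1, b = params
    return (w0 + w1 * math.tanh(t)) * x + b

def dmodel_dt(params, x, t, eps=1e-5):
    """Central finite-difference estimate of the model's derivative w.r.t. time."""
    return (model(params, x, t + eps) - model(params, x, t - eps)) / (2 * eps)

def interpolated_prediction(params, x, t, delta):
    """First-order Taylor extrapolation from time t - delta back to time t."""
    t0 = t - delta
    return model(params, x, t0) + delta * dmodel_dt(params, x, t0)

def gi_loss(params, x, y, t, delta):
    """Squared error of the extrapolated prediction (assumed loss form)."""
    return (interpolated_prediction(params, x, t, delta) - y) ** 2

def total_loss(params, x, y, t, delta, lam=0.5):
    """Plain loss at time t plus a weighted interpolation term (lam is assumed)."""
    plain = (model(params, x, t) - y) ** 2
    return plain + lam * gi_loss(params, x, y, t, delta)
```

Under these assumptions, the interpolation term penalizes models whose behavior along time is poorly described by a local linear approximation, while a model with no time dependence (`w1 = 0` here) is unaffected. How `delta` is chosen (fixed, sampled, or optimized adversarially) is left open in this sketch.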
