One of the key challenges of learning an online recommendation model is temporal domain shift, which causes a mismatch between the training and testing data distributions and hence temporal domain generalization error. To overcome this, we propose to learn a meta future gradient generator that forecasts the gradient information of the future data distribution for training, so that the recommendation model can be trained as if we were able to look ahead at the future of its deployment. Compared with Batch Update, a widely used paradigm, our theory suggests that the proposed algorithm achieves smaller temporal domain generalization error, measured by a gradient variation term in a local regret. We demonstrate the empirical advantage by comparing with various representative baselines.
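As a minimal sketch of the idea (not the paper's actual meta generator, which is learned), consider a toy setting where the loss optimum drifts over time. A plain batch update always follows the gradient of the current data, so it lags behind the drift; a "future gradient" variant instead estimates the drift component of the gradient from consecutive steps and uses it to forecast the next step's gradient at the current parameters. All names and the linear-drift loss below are illustrative assumptions.

```python
import numpy as np

def grad(theta, t):
    # Gradient of the toy loss 0.5 * ||theta - target(t)||^2,
    # where the optimum target(t) drifts linearly over time
    # (a stand-in for temporal domain shift).
    target = np.array([0.1 * t, -0.05 * t])
    return theta - target

def run(lookahead, steps=50, lr=0.5):
    theta = np.zeros(2)
    prev_theta, shift = None, np.zeros(2)
    for t in range(steps):
        g = grad(theta, t)
        if prev_theta is not None:
            # Evaluating today's and yesterday's gradients at the SAME
            # (previous) parameters isolates the pure distribution-shift
            # component of the gradient change.
            shift = grad(prev_theta, t) - grad(prev_theta, t - 1)
        prev_theta = theta
        if lookahead:
            # Forecast the next step's gradient at the current parameters
            # (a crude proxy for the learned meta future gradient generator).
            g = g + shift
        theta = theta - lr * g
    # Evaluate on the *next* (future) distribution, as at deployment time.
    return 0.5 * np.sum(grad(theta, steps) ** 2)

baseline = run(lookahead=False)   # Batch Update: train on current data only
forecast = run(lookahead=True)    # train with the forecasted future gradient
print(baseline, forecast)
```

In this toy model the forecasted-gradient variant tracks the drifting optimum with a smaller lag, so its loss on the future distribution is lower than the batch-update baseline, mirroring the smaller temporal domain generalization error claimed in the abstract.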