Deploying machine learning models in the real world is challenging because data evolves over time. No model can cope with data that evolves in an arbitrary fashion, but if the changes follow some pattern, we may be able to design methods that address them. This paper addresses situations in which data evolves gradually. We introduce a novel time-varying importance weight estimator that can detect gradual shifts in the distribution of data. Such an estimator allows the training method to selectively sample past data: not only similar data from the past, as a standard importance weight estimator would select, but also data that evolved in a similar fashion in the past. Our time-varying importance weight is quite general, and we demonstrate different ways of implementing it that exploit known structure in the evolution of data. We demonstrate and evaluate this approach on a variety of problems, ranging from supervised learning tasks (multiple image classification datasets), where the data undergoes a sequence of gradual shifts of our design, to reinforcement learning tasks (robotic manipulation and continuous control), where the data shifts organically as the policy or the task changes.
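To make the idea of a time-dependent importance weight concrete, the following is a minimal sketch and not the estimator proposed in the paper. It assumes a toy one-dimensional stream whose mean drifts linearly with time, and it uses a generic classifier-based density-ratio trick: a logistic classifier is trained to tell current samples from past samples, with the past sample's time given as an input, so that the classifier odds estimate the ratio of the current density to the density at that time. All function names, the feature map, and the drift model are hypothetical choices made for illustration.

```python
import numpy as np

def make_drifting_data(rng, n_per_step=1000, n_steps=4, drift=2.0):
    """Toy 1-D stream (an assumption): at normalized time s, x ~ N(drift * s, 1)."""
    s_past = np.repeat(np.arange(n_steps) / n_steps, n_per_step)
    x_past = rng.normal(drift * s_past, 1.0)
    x_now = rng.normal(drift, 1.0, size=s_past.size)  # current time, s = 1
    return x_past, s_past, x_now

def features(x, s):
    # Hypothetical feature map; the x * s term lets the weight depend on time.
    return np.stack([np.ones_like(x), x, s, x * s, s ** 2], axis=1)

def fit_time_varying_weights(x_past, s_past, x_now, n_iter=25, ridge=1e-6):
    """Logistic classifier: label 1 for (x_now, s), label 0 for (x_past, s)."""
    phi = np.concatenate([features(x_past, s_past), features(x_now, s_past)])
    y = np.concatenate([np.zeros(x_past.size), np.ones(x_now.size)])
    beta = np.zeros(phi.shape[1])
    for _ in range(n_iter):  # Newton steps on the ridge-regularized logistic loss
        p = 1.0 / (1.0 + np.exp(-np.clip(phi @ beta, -30, 30)))
        grad = phi.T @ (p - y) / y.size + ridge * beta
        hess = (phi * (p * (1 - p))[:, None]).T @ phi / y.size
        hess += ridge * np.eye(beta.size)
        beta -= np.linalg.solve(hess, grad)
    return beta

def weight(beta, x, s):
    # Both classes have the same size, so the odds exp(logit) estimate
    # p_now(x) / p_s(x) for a past sample x observed at time s.
    return np.exp(np.clip(features(x, s) @ beta, -30, 30))

rng = np.random.default_rng(0)
x_past, s_past, x_now = make_drifting_data(rng)
beta = fit_time_varying_weights(x_past, s_past, x_now)
w = weight(beta, x_past, s_past)

# Resample past data in proportion to its estimated relevance to the present.
idx = rng.choice(x_past.size, size=1000, p=w / w.sum())
resampled = x_past[idx]
```

In this sketch the same value of x receives a different weight depending on when it was observed, which is the property that separates a time-varying weight from a single static one. The abstract's further point, that past data which evolved in a similar fashion can also be selected, is not captured by this toy example.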