Replay methods have proven successful at mitigating catastrophic forgetting in continual learning scenarios despite having only limited access to historical data. However, in many real-world applications storing historical data is cheap, yet replaying all of it is prohibited by processing time constraints. In such settings, we propose that a continual learning system should learn the time to learn: we learn replay schedules that decide which tasks to replay at different time steps. To demonstrate the importance of learning the time to learn, we first use Monte Carlo tree search to find proper replay schedules and show that they can outperform fixed scheduling policies in terms of continual learning performance. Moreover, to improve scheduling efficiency itself, we propose using reinforcement learning to learn replay scheduling policies that generalize to new continual learning scenarios without added computational cost. Our experiments demonstrate the advantages of learning the time to learn, bringing current continual learning research closer to real-world needs.
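Since the abstract only sketches the approach, a rough, self-contained illustration may help: the Python below runs a textbook UCT-style Monte Carlo tree search over a toy replay-schedule space, where at each task boundary exactly one earlier task is chosen for replay. This is a minimal sketch, not the paper's implementation; `NUM_TASKS`, `evaluate_schedule` (a synthetic stand-in for training the continual learner under a schedule and measuring average validation accuracy), and the one-task-per-step action space are all assumptions introduced here for illustration.

```python
"""Illustrative sketch (not the authors' code): UCT-style MCTS over a
toy replay-schedule space. At each task boundary t, the scheduler picks
one earlier task to replay; a full path through the tree is one replay
schedule."""

import math
import random

NUM_TASKS = 5  # hypothetical number of sequential tasks


def actions_at_step(t):
    # Before training on task t (0-indexed), any earlier task may be replayed.
    return list(range(t))


def evaluate_schedule(schedule):
    # Placeholder reward: a real system would train the model on tasks
    # 0..NUM_TASKS-1, replaying schedule[t-1] at step t, and return the
    # average validation accuracy over all tasks. Here we fake a noisy
    # score that mildly favors replaying older (more forgotten) tasks.
    score = sum(1.0 / (1 + chosen) for chosen in schedule) / len(schedule)
    return score + random.gauss(0, 0.01)


class Node:
    def __init__(self, step, schedule, parent=None):
        self.step = step          # index of the next replay decision
        self.schedule = schedule  # replay choices made so far
        self.parent = parent
        self.children = {}        # action -> Node
        self.visits = 0
        self.value = 0.0          # sum of rollout rewards

    def is_terminal(self):
        return self.step >= NUM_TASKS  # decisions happen at steps 1..NUM_TASKS-1

    def untried_actions(self):
        return [a for a in actions_at_step(self.step) if a not in self.children]


def uct_select(node, c=1.4):
    # Pick the child maximizing the UCB1 score.
    return max(node.children.values(),
               key=lambda ch: ch.value / ch.visits
               + c * math.sqrt(math.log(node.visits) / ch.visits))


def rollout(step, schedule):
    # Complete the partial schedule with random replay choices, then score it.
    sched = list(schedule)
    for t in range(step, NUM_TASKS):
        sched.append(random.choice(actions_at_step(t)))
    return evaluate_schedule(sched)


def mcts(iterations=500):
    root = Node(step=1, schedule=[])
    for _ in range(iterations):
        node = root
        # Selection: descend while fully expanded and non-terminal.
        while not node.is_terminal() and not node.untried_actions():
            node = uct_select(node)
        # Expansion: add one untried child.
        if not node.is_terminal():
            a = random.choice(node.untried_actions())
            child = Node(node.step + 1, node.schedule + [a], parent=node)
            node.children[a] = child
            node = child
        # Simulation + backpropagation.
        reward = rollout(node.step, node.schedule)
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    # Greedy path through the most-visited children = best schedule found.
    node, best = root, []
    while node.children:
        a, node = max(node.children.items(), key=lambda kv: kv[1].visits)
        best.append(a)
    return best


if __name__ == "__main__":
    print("Best replay schedule found:", mcts())
```

In a real deployment, each call to `evaluate_schedule` would require a full continual-learning run, which makes per-scenario tree search expensive; this cost is what motivates the abstract's second step of training a reinforcement learning policy that amortizes scheduling decisions across new scenarios.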