While deep networks can learn complex functions such as classifiers, detectors, and trackers, many applications require models that continually adapt to changing input distributions, changing tasks, and changing environmental conditions. Indeed, this ability to continuously accrue knowledge and use past experience to learn new tasks quickly in continual settings is one of the key properties of an intelligent system. For complex and high-dimensional problems, simply updating the model continually with standard learning algorithms such as gradient descent may result in slow adaptation. Meta-learning can provide a powerful tool to accelerate adaptation, yet it is conventionally studied in batch settings. In this paper, we study how meta-learning can be applied to tackle online problems of this nature, simultaneously adapting to changing tasks and input distributions and meta-training the model in order to adapt more quickly in the future. Extending meta-learning into the online setting presents its own challenges, and although several prior methods have studied related problems, they generally require a discrete notion of tasks, with known ground-truth task boundaries. Such methods typically adapt to each task in sequence, resetting the model between tasks, rather than adapting continuously across tasks. In many real-world settings, such discrete boundaries are unavailable, and may not even exist. To address these settings, we propose a Fully Online Meta-Learning (FOML) algorithm, which does not require any ground-truth knowledge about the task boundaries and stays fully online without resetting back to pre-trained weights. Our experiments show that FOML learns new tasks faster than state-of-the-art online learning methods on the Rainbow-MNIST, CIFAR100, and CelebA datasets.
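To make the fully online setting concrete, the following is a minimal sketch of the evaluation protocol the abstract describes: data arrives as a single stream, the task changes at points the learner is never told about, and the model is never reset. It is not the FOML algorithm; the learner here is plain online gradient descent (the slow-adapting baseline the abstract mentions), and the toy regression stream, `make_stream`, `run_fully_online`, and all hyperparameters are assumptions made purely for illustration.

```python
import random


def make_stream(task_weights, steps_per_task, seed=0):
    """Yield (x, y) pairs from a sequence of hypothetical 1-D regression tasks.

    Each task is y = w_true * x. The learner only ever sees (x, y); the task
    identity and the switch points are hidden from it.
    """
    rng = random.Random(seed)
    for w_true in task_weights:
        for _ in range(steps_per_task):
            x = rng.uniform(-1.0, 1.0)
            yield x, w_true * x


def run_fully_online(stream, lr=0.5, w_init=0.0):
    """Plain online SGD with no task boundaries and no resets (baseline only)."""
    w = w_init
    losses = []
    for x, y in stream:
        err = w * x - y
        losses.append(0.5 * err * err)  # score the prediction before updating
        w -= lr * err * x               # one gradient step on the squared error
        # Note: w is carried across task switches; it is never reset to w_init.
    return w, losses


# Three toy tasks, 200 steps each; the switch points are unknown to the learner.
final_w, losses = run_fully_online(make_stream([2.0, -1.0, 3.0], 200))
```

In this setup, the per-step loss recorded before each update measures how quickly the learner recovers after a hidden task switch, which is the quantity a meta-learned update rule would aim to reduce.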