Classical machine learning algorithms often assume that the data are drawn i.i.d. from a stationary probability distribution. Recently, continual learning has emerged as a rapidly growing area of machine learning in which this assumption is relaxed, i.e., the data distribution is non-stationary and changes over time. This paper represents the state of the data distribution by a context variable $c$: a drift in $c$ leads to a drift in the data distribution. A context drift may change the target distribution, the input distribution, or both; moreover, distribution drifts may be abrupt or gradual. In continual learning, context drifts may interfere with the learning process and erase previously acquired knowledge; continual learning algorithms must therefore include specialized mechanisms to deal with such drifts. In this paper, we aim to identify and categorize different types of context drift, and the possible assumptions about them, in order to better characterize various continual-learning scenarios. Moreover, we propose to use this distribution-drift framework to provide more precise definitions of several terms commonly used in the continual learning field.
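As a minimal illustration of these drift types (the factorization below is a standard decomposition and an assumption on our part, not notation taken from the paper beyond the context variable $c$), the data distribution at time $t$ can be written as
\[
p_t(x, y) \;=\; p(y \mid x, c_t)\, p(x \mid c_t),
\]
so that a drift in the context $c_t$ may alter the input distribution $p(x \mid c_t)$, the target distribution $p(y \mid x, c_t)$, or both. In this view, an abrupt drift corresponds to a discrete change $c_{t+1} \neq c_t$, while a gradual drift corresponds to $c_t$ evolving slowly, e.g. $c_{t+1} = c_t + \epsilon_t$ for small $\epsilon_t$.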