While deep neural networks are capable of achieving state-of-the-art performance in various domains, their training typically requires iterating for many passes over the dataset. However, due to computational and memory constraints and potential privacy concerns, storing and accessing all the data is impractical in many real-world scenarios where the data arrives in a stream. In this paper, we investigate the problem of one-pass learning, in which a model is trained on sequentially arriving data without retraining on previous datapoints. Motivated by the increasing use of overparameterized models, we develop Orthogonal Recursive Fitting (ORFit), an algorithm for one-pass learning that seeks to perfectly fit every new datapoint while changing the parameters in a direction that causes the least change to the predictions on previous datapoints. In doing so, we bridge two seemingly distinct algorithms in adaptive filtering and machine learning: the recursive least-squares (RLS) algorithm and orthogonal gradient descent (OGD). Our algorithm uses memory efficiently by exploiting the structure of the streaming data via incremental principal component analysis (IPCA). Further, we show that, for overparameterized linear models, the parameter vector obtained by our algorithm is what stochastic gradient descent (SGD) would converge to in the standard multi-pass setting. Finally, we generalize the results to the nonlinear setting for highly overparameterized models, relevant for deep learning. Our experiments show the effectiveness of the proposed method compared to the baselines.
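To make the core idea concrete, below is a minimal NumPy sketch of an ORFit-style update for an overparameterized linear model: each incoming datapoint is interpolated exactly, and the update direction is projected orthogonal to all previously seen inputs so that earlier predictions are left unchanged. The class name `ORFitLinear` and its bookkeeping are illustrative only; in particular, the basis here grows with the number of seen points, and the paper's IPCA-based memory compression is omitted.

```python
import numpy as np

class ORFitLinear:
    """Illustrative ORFit-style one-pass learner for a linear model
    f(x) = w @ x. Each new datapoint is fit exactly, and the update
    direction is projected orthogonal to previous inputs, so earlier
    predictions do not change. (The paper's IPCA compression of the
    stored directions is omitted in this sketch.)"""

    def __init__(self, dim):
        self.w = np.zeros(dim)        # parameter vector
        self.U = np.zeros((dim, 0))   # orthonormal basis of past input directions

    def update(self, x, y):
        # Remove components of x lying along previously seen inputs.
        v = x - self.U @ (self.U.T @ x)
        vx = v @ x
        if np.abs(vx) > 1e-10:  # x not already spanned by past inputs
            # Step along v so the new point is interpolated exactly:
            # (w + a*v) @ x = y  =>  a = (y - w @ x) / (v @ x).
            self.w += ((y - self.w @ x) / vx) * v
            # Extend the orthonormal basis with the new direction.
            self.U = np.column_stack([self.U, v / np.linalg.norm(v)])

    def predict(self, x):
        return self.w @ x

# Hypothetical usage on a small stream:
model = ORFitLinear(dim=5)
rng = np.random.default_rng(0)
for _ in range(3):
    x, y = rng.standard_normal(5), rng.standard_normal()
    model.update(x, y)
    assert np.isclose(model.predict(x), y)  # each new point is fit exactly
```

Because the step direction v is orthogonal to every previously stored input, the linear predictions on those earlier points are provably unaffected by the update, which is the sense in which the fit causes the least change to past predictions.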