Current online learning methods suffer from issues such as slower convergence rates and a limited ability to recover the support of the true features compared with their offline counterparts. In this paper, we present a novel framework for online learning based on running averages and introduce online versions of several popular offline methods, such as Elastic Net, Minimax Concave Penalty, and Feature Selection with Annealing. The framework can handle an arbitrarily large number of observations, with the restriction that the data dimension is not too large, e.g. $p<50,000$. We prove the equivalence between our online methods and their offline counterparts, and give theoretical guarantees of true feature recovery and convergence for some of them. In contrast to existing online methods, the proposed methods can extract models with any desired sparsity level at any time. Numerical experiments indicate that our new methods enjoy high accuracy of true feature recovery and fast convergence rates compared with standard online and offline algorithms. We also show how the running averages framework can be used for model adaptation in the presence of model drift. Finally, we present applications to large datasets where the proposed framework again shows results competitive with popular online and offline algorithms.
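To make the running-averages idea concrete, the following is a minimal Python sketch, not the paper's implementation: it maintains the running averages $\bar{x}$, $\bar{y}$, $\frac{1}{n}\sum_i x_i x_i^\top$ and $\frac{1}{n}\sum_i y_i x_i$ incrementally, so a squared-error model can be fit at any time without storing the raw observations. As an assumed example of a downstream solver it fits ridge regression from these summaries alone; the paper instead applies Elastic Net, MCP, and FSA on top of such summaries. The class name `RunningAverages` and method names are hypothetical.

```python
import numpy as np

class RunningAverages:
    """Sketch of online learning via running averages: accumulate the
    sufficient statistics of the squared-error loss so that models can be
    extracted at any time, equivalent to fitting on the full data."""

    def __init__(self, p):
        self.n = 0
        self.mu_x = np.zeros(p)        # running mean of x (usable for centering)
        self.mu_y = 0.0                # running mean of y
        self.S_xx = np.zeros((p, p))   # running average of x x^T
        self.S_xy = np.zeros(p)        # running average of y x

    def update(self, x, y):
        """Incorporate one observation (x, y) into the running averages."""
        self.n += 1
        w = 1.0 / self.n
        self.mu_x += w * (x - self.mu_x)
        self.mu_y += w * (y - self.mu_y)
        self.S_xx += w * (np.outer(x, x) - self.S_xx)
        self.S_xy += w * (y * x - self.S_xy)

    def ridge_fit(self, lam):
        """Illustrative downstream solver (an assumption, not the paper's
        penalized methods): ridge regression computed from the running
        averages; it matches the offline ridge solution on the same data."""
        p = self.mu_x.shape[0]
        return np.linalg.solve(self.S_xx + lam * np.eye(p), self.S_xy)


# Usage: stream observations one at a time, then extract a model on demand.
rng = np.random.default_rng(0)
beta_true = np.zeros(10)
beta_true[:3] = [2.0, -1.5, 1.0]
ra = RunningAverages(p=10)
for _ in range(5000):
    x = rng.normal(size=10)
    y = x @ beta_true + 0.1 * rng.normal()
    ra.update(x, y)
print(ra.ridge_fit(lam=1e-3))  # close to beta_true
```

Because the summaries have fixed size $O(p^2)$ regardless of the number of observations, this is where the $p<50,000$ restriction comes from, and why a model of any desired sparsity level can be extracted at any point of the stream.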