On Continual Model Refinement in Out-of-Distribution Data Streams

Real-world natural language processing (NLP) models need to be continually updated to fix the prediction errors in out-of-distribution (OOD) data streams while overcoming catastrophic forgetting. However, existing continual learning (CL) problem setups cannot cover such a realistic and complex scenario. In response to this, we propose a new CL problem formulation dubbed continual model refinement (CMR). Compared to prior CL settings, CMR is more practical and introduces unique challenges (boundary-agnostic and non-stationary distribution shift, diverse mixtures of multiple OOD data clusters, error-centric streams, etc.). We extend several existing CL approaches to the CMR setting and evaluate them extensively. For benchmarking and analysis, we propose a general sampling algorithm to obtain dynamic OOD data streams with controllable non-stationarity, as well as a suite of metrics measuring various aspects of online performance. Our experiments and detailed analysis reveal the promise and challenges of the CMR problem, supporting that studying CMR in dynamic OOD streams can benefit the longevity of deployed NLP models in production.
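As a rough illustration of what a data stream with "controllable non-stationarity" and "error-centric" examples might look like, the following is a minimal sketch. It is not the sampling algorithm proposed in the paper, whose details the abstract does not give. The names `sample_stream`, `filter_errors`, `stay_prob`, and `major_weight` are hypothetical, as is the assumption that each OOD cluster is a list of `(input, label)` pairs.

```python
import random


def sample_stream(clusters, num_episodes, batch_size, stay_prob,
                  major_weight=0.7, seed=0):
    """Return a list of (dominant_cluster, batch) episodes.

    Assumption (not from the paper): each episode mixes all clusters,
    with one "dominant" cluster receiving `major_weight` of the mass.
    `stay_prob` is the probability that the dominant cluster carries
    over to the next episode, so lower values give a less stationary stream.
    """
    rng = random.Random(seed)
    names = sorted(clusters)
    # Split the remaining probability mass evenly over the other clusters.
    minor_weight = (1.0 - major_weight) / max(len(names) - 1, 1)
    dominant = rng.choice(names)
    stream = []
    for _ in range(num_episodes):
        # Possibly re-draw the dominant cluster; no boundary is signalled.
        if rng.random() > stay_prob:
            dominant = rng.choice(names)
        weights = [major_weight if n == dominant else minor_weight
                   for n in names]
        picks = rng.choices(names, weights=weights, k=batch_size)
        batch = [(n, rng.choice(clusters[n])) for n in picks]
        stream.append((dominant, batch))
    return stream


def filter_errors(batch, predict):
    """Keep only the examples the current model gets wrong (error-centric)."""
    return [(name, (x, y)) for name, (x, y) in batch if predict(x) != y]
```

In this sketch, setting `stay_prob=1.0` fixes the dominant cluster for the whole stream, while values near zero re-draw it in almost every episode. A deployed model's prediction function would be passed as `predict` so that only its errors are kept for refinement.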

