Operational networks commonly rely on machine learning models for many tasks, including detecting anomalies, inferring application performance, and forecasting demand. Unfortunately, model accuracy can degrade due to concept drift, whereby the relationship between the features and the target prediction changes over time for reasons ranging from software upgrades to seasonality to changes in user behavior. Mitigating concept drift is thus an essential part of operationalizing machine learning models. Yet, despite its importance, concept drift has not been extensively explored in the context of networking -- or for regression models in general. As a result, it is not well understood how to detect or mitigate it for many common network management tasks that currently rely on machine learning models. As we show, concept drift cannot always be mitigated by periodically retraining models on newly available data, and doing so can even degrade model accuracy. In this paper, we characterize concept drift in a large cellular network serving a metropolitan area in the United States. We find that concept drift occurs across key performance indicators (KPIs), regardless of model, training set size, and time interval -- thus necessitating practical approaches to detect, explain, and mitigate it. To this end, we develop Local Error Approximation of Features (LEAF). LEAF detects drift; explains the features and time intervals that contribute most to drift; and mitigates drift using resampling, augmentation, or ensembling. We evaluate LEAF against industry-standard mitigations (i.e., periodic retraining) on more than three years of cellular data from Verizon. LEAF consistently outperforms periodic retraining across a variety of KPIs and models, while reducing costly retrains by an order of magnitude. Due to its effectiveness, a major cellular carrier is now integrating LEAF into its forecasting and provisioning processes.