Edge computing was introduced as a technical enabler for the demanding requirements of new network technologies such as 5G. It aims to overcome the challenges of centralized cloud computing environments by distributing computational resources to the edge of the network, closer to the customers. The complexity of the emerging infrastructures increases significantly, as do the ramifications of outages for critical use cases such as self-driving cars or health care. Artificial Intelligence for IT Operations (AIOps) aims to support human operators in managing complex infrastructures by using machine learning methods. This paper describes the system design of an AIOps platform that is applicable in heterogeneous, distributed environments. The overhead of a high-frequency monitoring solution on edge devices is evaluated, and performance experiments on the applicability of three anomaly detection algorithms on edge devices are conducted. The results show that it is feasible to collect metrics at a high frequency and simultaneously run specific anomaly detection algorithms directly on edge devices with a reasonable overhead in resource utilization.