The genuine supervision of modern IT systems brings new challenges, as it requires higher standards of scalability, reliability and efficiency when analysing and monitoring big data streams. Rule-based inference engines are a key component of maintenance systems for detecting anomalies and automating their resolution. However, they remain confined to simple, general rules and can handle neither the huge amount of data nor the large number of alerts raised by IT systems, a lesson learned from the expert systems era. Artificial Intelligence for Operation Systems (AIOps) proposes to take advantage of advanced analytics and machine learning on big data to improve and automate every step of supervision systems and to aid incident management in detecting outages, identifying root causes and applying appropriate healing actions. Nevertheless, the best AIOps techniques rely on opaque models, which strongly limits their adoption. As part of this PhD thesis, we study how Subgroup Discovery can help AIOps. This promising data mining technique makes it possible to extract interesting hypotheses from data and to understand the underlying process behind predictive models. To ensure the relevance of our propositions, this project involves both data mining researchers and practitioners from Infologic, a French software publisher.