Automatic log file analysis enables early detection of relevant incidents such as system failures. In particular, self-learning anomaly detection techniques capture patterns in log data and subsequently report unexpected log event occurrences to system operators, without the need to provide or manually model anomalous scenarios in advance. Recently, an increasing number of approaches that leverage deep neural networks for this purpose have been presented. These approaches have demonstrated superior detection performance compared to conventional machine learning techniques and simultaneously resolve issues with unstable data formats. However, many different deep learning architectures exist, and encoding raw, unstructured log data so that it can be analyzed by neural networks is non-trivial. We therefore carry out a systematic literature review that provides an overview of deployed models, data pre-processing mechanisms, anomaly detection techniques, and evaluations. The survey does not quantitatively compare existing approaches but instead aims to help readers understand relevant aspects of different model architectures and emphasizes open issues for future work.