Anomaly detection is becoming increasingly important for the dependability and serviceability of IT services. Because log lines record events during the execution of IT services, they are a primary source for diagnostics. Unsupervised methods are particularly beneficial here, since not all anomalies can be known at training time. Existing unsupervised methods, however, still need anomaly examples to obtain a suitable decision boundary for the anomaly detection task, which limits their practical use. We therefore develop A2Log, an unsupervised anomaly detection method consisting of two steps: anomaly scoring and anomaly decision. First, we use a self-attention neural network to score each log message. Second, we set the decision boundary based on data augmentation of the available normal training data. We evaluate the method on three publicly available datasets and one industry dataset and show that it outperforms existing methods. Furthermore, we use available anomaly examples to set optimal decision boundaries for the existing methods, which yields strong baselines. We show that our approach, which determines decision boundaries without using anomaly examples, can reach the scores of these strong baselines.
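The two-step structure described above (score each log message, then derive a decision boundary from augmented normal data only) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from A2Log itself: the `score` function is a toy vocabulary-based stand-in for the self-attention network, the `augment` function is a hypothetical perturbation, and the midpoint rule in `fit_threshold` is just one simple way to place a boundary without anomaly examples.

```python
import random


def build_vocab(messages):
    """Collect every token seen in the normal training messages."""
    return {tok for msg in messages for tok in msg.split()}


def score(message, vocab):
    """Toy anomaly score: fraction of tokens unseen during training.

    Hypothetical stand-in for the self-attention scoring network.
    """
    tokens = message.split()
    if not tokens:
        return 0.0
    return sum(tok not in vocab for tok in tokens) / len(tokens)


def augment(message, rng):
    """Hypothetical augmentation: overwrite one random token with a placeholder."""
    tokens = message.split()
    if not tokens:
        return message
    tokens[rng.randrange(len(tokens))] = "<perturbed>"
    return " ".join(tokens)


def fit_threshold(normal_messages, vocab, seed=0):
    """Place the boundary midway between the mean score of the normal
    messages and the mean score of their augmented copies (assumed rule)."""
    rng = random.Random(seed)
    normal_scores = [score(m, vocab) for m in normal_messages]
    augmented_scores = [score(augment(m, rng), vocab) for m in normal_messages]
    mean_normal = sum(normal_scores) / len(normal_scores)
    mean_augmented = sum(augmented_scores) / len(augmented_scores)
    return (mean_normal + mean_augmented) / 2


def is_anomaly(message, vocab, threshold):
    """Step two: compare the score against the learned boundary."""
    return score(message, vocab) > threshold


# Made-up normal training data; no anomaly examples are used anywhere.
normal = [
    "connection opened from host",
    "connection closed by host",
    "user login succeeded",
]
vocab = build_vocab(normal)
threshold = fit_threshold(normal, vocab)

print(is_anomaly("connection opened from host", vocab, threshold))
print(is_anomaly("kernel panic fatal error", vocab, threshold))
```

The point of the sketch is only the control flow: the threshold is computed from normal data and perturbed copies of it, so no labeled anomalies are needed at training time. A real scorer and augmentation scheme would replace the toy functions.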