Network or physical attacks on industrial equipment or computer systems may cause massive losses. Therefore, quick and accurate anomaly detection (AD) based on monitoring data, especially multivariate time-series (MTS) data, is of great significance. Learning the relations among different variables is a key step of anomaly detection for MTS data and has been explored by many approaches. However, most existing approaches do not consider the heterogeneity between variables; that is, different types of variables (continuous numerical variables, discrete categorical variables, or hybrid variables) may have different and distinctive edge distributions. In this paper, we propose a novel semi-supervised anomaly detection framework for MTS based on a heterogeneous feature network (HFN). The framework learns heterogeneous structure information from a mass of unlabeled time-series data to improve the accuracy of anomaly detection, and uses attention coefficients to explain the detected anomalies. Specifically, we first combine the embedding similarity subgraph generated from sensor embeddings and the feature value similarity subgraph generated from sensor values to construct a time-series heterogeneous graph, which fully utilizes the rich heterogeneous mutual information among variables. Then, a prediction model containing node and channel attention is jointly optimized to obtain better time-series representations. This approach fuses state-of-the-art techniques from heterogeneous graph structure learning (HGSL) and representation learning. Experiments on four sensor datasets from real-world applications demonstrate that our approach detects anomalies more accurately than the baseline approaches, thus providing a basis for rapidly locating anomalies.
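To make the graph-construction step concrete, the following is a minimal sketch of how two similarity subgraphs over the same sensors could be merged into one graph with typed edges. It is not the implementation used in this work: the choice of top-k cosine similarity, the use of a raw window of sensor values as the value feature, and the names `topk_cosine_edges`, `build_heterogeneous_graph`, and the edge labels `"embedding"` and `"value"` are all assumptions made for illustration.

```python
import numpy as np


def topk_cosine_edges(features: np.ndarray, k: int) -> set[tuple[int, int]]:
    """Return directed edges (i, j) where j is one of the k rows most
    cosine-similar to row i. Top-k cosine similarity is an assumed rule."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)  # guard against zero rows
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)  # exclude self-loops
    neighbors = np.argsort(-sim, axis=1)[:, :k]
    return {(i, int(j)) for i in range(features.shape[0]) for j in neighbors[i]}


def build_heterogeneous_graph(
    embeddings: np.ndarray, values: np.ndarray, k: int
) -> dict[tuple[int, int], set[str]]:
    """Merge the two subgraphs; each edge keeps the set of types it came from.

    embeddings: (num_sensors, embed_dim) sensor embeddings (hypothetical input).
    values:     (num_sensors, window_len) window of sensor readings (hypothetical).
    """
    graph: dict[tuple[int, int], set[str]] = {}
    for edge in topk_cosine_edges(embeddings, k):
        graph.setdefault(edge, set()).add("embedding")
    for edge in topk_cosine_edges(values, k):
        graph.setdefault(edge, set()).add("value")
    return graph
```

Keeping the edge type on every edge is what makes the merged graph heterogeneous in this sketch: a downstream attention layer could then weight `"embedding"` and `"value"` edges differently, whereas a plain union of the two edge sets would discard that distinction.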