Graph anomaly detection, as studied in this paper, aims to distinguish abnormal nodes from the benign nodes that make up the majority of instances in graph-structured data. Although the task is receiving increasing attention from both academia and industry, existing research still suffers from two critical issues when learning informative anomalous behavior from graph data. First, anomalies are usually hard to capture because their abnormal behavior is subtle and background knowledge about them is limited, which causes severe scarcity of anomalous samples. Second, the overwhelming majority of objects in real-world graphs are normal, which introduces a class imbalance problem. To bridge these gaps, this paper devises a novel Data Augmentation-based Graph Anomaly Detection (DAGAD) framework for attributed graphs, equipped with three specially designed modules: 1) an information fusion module that employs graph neural network encoders to learn representations, 2) a graph data augmentation module that enriches the training set with generated samples, and 3) an imbalance-tailored learning module that discriminates between the distributions of the minority (anomalous) and majority (normal) classes. A series of experiments on three datasets shows that DAGAD outperforms ten state-of-the-art baseline detectors on various commonly used metrics, and an extensive ablation study validates the strength of the proposed modules.
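To make the class imbalance issue concrete, the following is a minimal sketch of one generic imbalance-aware objective: cross-entropy with inverse-frequency class weights. It is an illustrative assumption, not the loss used by DAGAD, whose details the abstract does not give. The function names `class_balanced_weights` and `weighted_cross_entropy`, the 95/5 label split, and the toy predictions are all hypothetical.

```python
import numpy as np


def class_balanced_weights(labels, num_classes=2):
    """Inverse-frequency weights: rarer classes receive larger weights."""
    counts = np.bincount(labels, minlength=num_classes).astype(float)
    counts = np.maximum(counts, 1.0)  # guard against empty classes
    return labels.size / (num_classes * counts)


def weighted_cross_entropy(probs, labels, weights):
    """Cross-entropy in which each node's term is scaled by its class weight.

    probs   : (n_nodes, n_classes) predicted class probabilities
    labels  : (n_nodes,) integer class labels (0 = normal, 1 = anomalous)
    weights : (n_classes,) per-class weights
    """
    eps = 1e-12  # avoids log(0)
    picked = probs[np.arange(labels.size), labels]
    node_weights = weights[labels]
    per_node = -np.log(picked + eps) * node_weights
    # Normalize by the total weight so the result is a weighted average.
    return per_node.sum() / node_weights.sum()


# Toy, hypothetical setting: 95 normal nodes and 5 anomalous nodes.
labels = np.array([0] * 95 + [1] * 5)
weights = class_balanced_weights(labels)

# A degenerate detector that always predicts "normal" with probability 0.9.
probs = np.tile(np.array([0.9, 0.1]), (labels.size, 1))

plain_loss = weighted_cross_entropy(probs, labels, np.ones(2))
balanced_loss = weighted_cross_entropy(probs, labels, weights)
print("unweighted loss:", plain_loss)
print("class-balanced loss:", balanced_loss)
```

In this toy setting the always-normal detector is penalized more heavily under the class-balanced loss than under the unweighted one, because the few anomalous nodes contribute as much total weight as the many normal nodes.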