The presence of a large number of bots in online social networks (OSNs) leads to undesirable social effects. Graph neural networks (GNNs) have achieved state-of-the-art performance in bot detection because they can effectively exploit user interactions. In most scenarios, the distribution of bots and humans is imbalanced, so minority-class samples are under-represented and performance is sub-optimal. However, previous GNN-based methods for bot detection seldom consider the impact of class imbalance. In this paper, we propose an over-sampling strategy for GNNs (OS-GNN) that mitigates the effect of class imbalance in bot detection. Compared with previous over-sampling methods for GNNs, OS-GNN does not require edge synthesis, eliminating the noise inevitably introduced during edge construction. Specifically, node features are first mapped to a feature space through neighborhood aggregation, and samples for the minority class are then generated in that feature space. Finally, the augmented features are fed into GNNs to train the classifiers. The framework is general and can be easily extended to different GNN architectures. The proposed framework is evaluated on three real-world bot detection benchmark datasets, and it consistently outperforms the baselines.
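The two-stage idea described above (aggregate neighborhood information first, then over-sample the minority class in the resulting feature space, with no new edges created) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the aggregation operator or the over-sampling technique, so the mean aggregation with self-loops, the SMOTE-style interpolation between random pairs of minority samples, and all names (`aggregate_neighbors`, `oversample_minority`, `hops`) are assumptions made for illustration only.

```python
import numpy as np


def aggregate_neighbors(adj, x, hops=2):
    """Map node features to an aggregated feature space.

    Assumption: row-normalized mean aggregation with self-loops,
    applied `hops` times. The paper may use a different operator.
    """
    a = adj + np.eye(adj.shape[0])          # add self-loops
    a = a / a.sum(axis=1, keepdims=True)    # row-normalize
    h = x
    for _ in range(hops):
        h = a @ h
    return h


def oversample_minority(h, y, minority=1, seed=0):
    """Generate synthetic minority samples directly in feature space.

    Assumption: SMOTE-style linear interpolation between randomly
    paired minority samples (true SMOTE pairs nearest neighbors).
    No edges are synthesized for the new samples.
    """
    rng = np.random.default_rng(seed)
    idx = np.flatnonzero(y == minority)
    n_new = int((y != minority).sum()) - idx.size
    if n_new <= 0 or idx.size < 2:
        return h, y
    a = rng.choice(idx, size=n_new)
    b = rng.choice(idx, size=n_new)
    lam = rng.random((n_new, 1))            # interpolation weights in [0, 1)
    synthetic = h[a] + lam * (h[b] - h[a])
    h_aug = np.vstack([h, synthetic])
    y_aug = np.concatenate([y, np.full(n_new, minority)])
    return h_aug, y_aug


# Toy graph: a 6-node ring with 4 humans (label 0) and 2 bots (label 1).
adj = np.zeros((6, 6))
for i in range(6):
    j = (i + 1) % 6
    adj[i, j] = adj[j, i] = 1.0
x = np.random.default_rng(42).normal(size=(6, 4))
y = np.array([0, 0, 0, 0, 1, 1])

h = aggregate_neighbors(adj, x)
h_aug, y_aug = oversample_minority(h, y)
print(h_aug.shape, np.bincount(y_aug))
```

In this sketch the balanced pair (`h_aug`, `y_aug`) would then be passed to the downstream classifier layers; because the synthetic samples already carry aggregated neighborhood information, they need no edges of their own.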