Malicious social bots achieve their malicious purposes by spreading misinformation and inciting social public opinion, seriously endangering social security, making their detection a critical concern. Recently, graph-based bot detection methods have achieved state-of-the-art (SOTA) performance. However, our research finds many isolated and poorly linked nodes in social networks, as shown in Fig.1, which graph-based methods cannot effectively detect. To address this problem, our research focuses on effectively utilizing node semantics and network structure to jointly detect sparsely linked nodes. Given the excellent performance of language models (LMs) in natural language understanding (NLU), we propose a novel social bot detection framework LGB, which consists of two main components: language model (LM) and graph neural network (GNN). Specifically, the social account information is first extracted into unified user textual sequences, which is then used to perform supervised fine-tuning (SFT) of the language model to improve its ability to understand social account semantics. Next, the semantically enriched node representation is fed into the pre-trained GNN to further enhance the node representation by aggregating information from neighbors. Finally, LGB fuses the information from both modalities to improve the detection performance of sparsely linked nodes. Extensive experiments on two real-world datasets demonstrate that LGB consistently outperforms state-of-the-art baseline models by up to 10.95%. LGB is already online: https://botdetection.aminer.cn/robotmain.
翻译:暂无翻译