Social networks are frequently polluted by rumors, which can be detected by advanced models such as graph neural networks. However, these models are vulnerable to attacks, and understanding their vulnerabilities is critical to rumor detection in practice. To discover subtle vulnerabilities, we design a powerful attack algorithm, based on reinforcement learning, that camouflages rumors in social networks and can interact with and attack any black-box detector. The environment has an exponentially large state space, high-order graph dependencies, and delayed, noisy rewards, which makes it difficult for state-of-the-art end-to-end approaches to learn features, owing to the large learning cost and the limited expressiveness of deep graph models. Instead, we design domain-specific features, which avoids feature learning and produces interpretable attack policies. To further speed up policy optimization, we devise: (i) a credit assignment method that decomposes delayed rewards and assigns them to atomic attacking actions in proportion to their camouflage effects on target rumors; (ii) a time-dependent control variate that reduces the reward variance caused by large graphs and many attacking steps, supported by a reward variance analysis and a Bayesian analysis of the prediction distribution. On three real-world rumor detection datasets, we demonstrate: (i) the effectiveness of the learned attack policy compared with rule-based attacks and current end-to-end approaches; (ii) the usefulness of the proposed credit assignment strategy and variance reduction components; (iii) the interpretability of the policy when generating strong attacks, via a case study.
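The two optimization components named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the clipping of negative effects, the uniform fallback, and the use of a per-step mean as the time-dependent baseline are all assumptions made for illustration. The per-step "camouflage effect" is taken here to be any non-negative score, for example the drop in a detector's rumor probability after one attacking action.

```python
from typing import List, Sequence


def assign_credit(delayed_reward: float, effects: Sequence[float]) -> List[float]:
    """Split one delayed episode reward across steps, in proportion to each
    step's camouflage effect (hypothetical per-step scores)."""
    if not effects:
        return []
    clipped = [max(e, 0.0) for e in effects]  # assumption: ignore harmful steps
    total = sum(clipped)
    if total == 0.0:
        # Assumption: if no step helped, fall back to a uniform split.
        return [delayed_reward / len(effects)] * len(effects)
    return [delayed_reward * e / total for e in clipped]


def time_dependent_baseline(credits: Sequence[Sequence[float]]) -> List[float]:
    """Baseline b_t: mean credit at step t over equal-length sampled episodes."""
    horizon = len(credits[0])
    return [sum(ep[t] for ep in credits) / len(credits) for t in range(horizon)]


def advantages(credits: Sequence[Sequence[float]],
               baseline: Sequence[float]) -> List[List[float]]:
    """Subtract the per-step baseline from every episode's per-step credit."""
    return [[c - b for c, b in zip(ep, baseline)] for ep in credits]


# Two toy episodes: (delayed reward, per-step camouflage effects).
episodes = [(1.0, [0.1, 0.3, 0.0]), (0.5, [0.2, 0.0, 0.2])]
credits = [assign_credit(r, eff) for r, eff in episodes]
baseline = time_dependent_baseline(credits)
adv = advantages(credits, baseline)
```

Each episode's credits sum back to its delayed reward, so the decomposition redistributes the signal rather than changing it. Subtracting a baseline that depends only on the time step, not on the action taken, is a standard way to lower the variance of a policy-gradient estimate without biasing it; the control variate actually used in the paper may take a different form.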