Today's social networks continuously generate massive streams of data, which provide a valuable starting point for the detection of rumours as soon as they start to propagate. However, rumour detection faces tight latency bounds, which cannot be met by contemporary algorithms, given the sheer volume of high-velocity streaming data emitted by social networks. Hence, in this paper, we argue for best-effort rumour detection that detects most rumours quickly rather than all rumours with a high delay. To this end, we combine techniques for efficient, graph-based matching of rumour patterns with effective load shedding that discards some of the input data while minimising the loss in accuracy. Experiments with large-scale real-world datasets illustrate the robustness of our approach in terms of runtime performance and detection accuracy under diverse streaming conditions.
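To make the idea of load shedding concrete, the following is a minimal sketch of one generic strategy: a bounded buffer that discards the lowest-utility items when the stream outpaces processing. The abstract does not specify how the paper selects data to discard, so the names `Event`, `LoadShedder`, and the per-event `utility` score are hypothetical and serve only as an illustration.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Event:
    # Hypothetical score: estimated value of this event for rumour detection.
    utility: float
    event_id: str = field(compare=False)


class LoadShedder:
    """Bounded buffer that sheds the lowest-utility event when full.

    This is an illustrative, generic policy, not the paper's method.
    """

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._heap = []   # min-heap ordered by utility
        self.dropped = 0  # number of events shed so far

    def offer(self, event: Event):
        """Add an event; return the shed event, or None if nothing was shed."""
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, event)
            return None
        # Buffer is full: insert the new event and evict the least useful one
        # (which may be the new event itself).
        shed = heapq.heappushpop(self._heap, event)
        self.dropped += 1
        return shed

    def drain(self):
        """Return the retained events, highest utility first, and reset."""
        kept = sorted(self._heap, reverse=True)
        self._heap = []
        return kept


if __name__ == "__main__":
    shedder = LoadShedder(capacity=2)
    for utility, event_id in [(0.9, "a"), (0.1, "b"), (0.5, "c")]:
        shedder.offer(Event(utility, event_id))
    print([e.event_id for e in shedder.drain()], shedder.dropped)
```

In this sketch the trade-off between latency and accuracy is controlled by `capacity`: a smaller buffer bounds the work per window more tightly but sheds more input.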