High-level Approaches to Detect Malicious Political Activity on Twitter

Our work represents another step toward the detection and prevention of increasingly prevalent political manipulation efforts. We start by examining what state-of-the-art approaches lack; since the problem persists, it is fair to assume that they fall short. We find concerning issues in the current literature and follow a diverging path, notably by emphasizing data features that are less susceptible to malicious manipulation and by seeking high-level approaches that avoid a level of granularity biased towards easy-to-spot, low-impact cases. We designed and implemented a framework, Twitter Watch, that performs structured Twitter data collection, and applied it to the Portuguese Twittersphere. We investigate a data snapshot taken in May 2020, with around 5 million accounts and over 120 million tweets (this figure has since increased to over 175 million). The analyzed period stretches from August 2019 to May 2020, with a focus on the Portuguese elections of October 6th, 2019. However, the Covid-19 pandemic also surfaced in our data, so we examine how it affected typical Twitter behavior. We pursued three main approaches: content-oriented, metadata-oriented, and network interaction-oriented. We learn that Twitter's suspension patterns are not adequate for the type of political trolling found in the Portuguese Twittersphere (identified by this work and by an independent peer), nor for accounts that post fake news. We also surmise, through two distinct analyses, that the different types of malicious accounts we independently gathered are very similar in both content and interaction, and are simultaneously very distinct from regular accounts.
