Detecting Social Media Manipulation in Low-Resource Languages

Social media have been deliberately used for malicious purposes, including political manipulation and disinformation. Most research focuses on high-resource languages. However, malicious actors share content across countries and languages, including low-resource ones. Here, we investigate whether and to what extent malicious actors can be detected in low-resource language settings. We discovered that a large number of accounts posting in Tagalog were suspended as part of Twitter's crackdown on interference operations after the 2016 US Presidential election. By combining text embedding and transfer learning, our framework can detect, with promising accuracy, malicious users posting in Tagalog without any prior knowledge of or training on malicious content in that language. We first learn an embedding model for each language independently: one for a high-resource language (English) and one for a low-resource language (Tagalog). Then, we learn a mapping between the two latent spaces to transfer the detection model. We demonstrate that the proposed approach significantly outperforms state-of-the-art models, including BERT, and yields marked advantages in settings with very limited training data, which is the norm when detecting malicious activity on online platforms.
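
The pipeline described above (independent embeddings, a mapping between the two latent spaces, then reuse of the detector) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the mapping is linear and fitted by least squares on paired anchor vectors, uses a nearest-centroid classifier as a stand-in for the detection model, and replaces real embeddings with synthetic data in which the "Tagalog" space is a rotation of the "English" one. All function names are hypothetical.

```python
import numpy as np


def fit_linear_map(src, tgt):
    """Least-squares W such that src @ W approximates tgt (rows are paired anchors)."""
    W, *_ = np.linalg.lstsq(src, tgt, rcond=None)
    return W


def fit_centroids(X, y):
    """Stand-in detector: one centroid per class (0 = benign, 1 = malicious)."""
    return np.stack([X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)])


def predict(centroids, X):
    """Assign each row of X to the nearest class centroid."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)


def demo(seed=0):
    rng = np.random.default_rng(seed)
    dim = 8

    # Synthetic "English" user embeddings with labels (assumption: toy clusters).
    y_en = np.array([0] * 50 + [1] * 50)
    X_en = rng.normal(scale=0.3, size=(100, dim))
    X_en[y_en == 1] += 2.0

    # Synthetic "Tagalog" space: the same geometry under an unknown rotation Q.
    Q, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
    y_tl = np.array([0] * 20 + [1] * 20)
    X_tl_latent = rng.normal(scale=0.3, size=(40, dim))
    X_tl_latent[y_tl == 1] += 2.0
    X_tl = X_tl_latent @ Q

    # Paired anchors known in both spaces (e.g. from a bilingual lexicon; assumption).
    anchors_en = rng.normal(size=(30, dim))
    anchors_tl = anchors_en @ Q

    # Learn the Tagalog -> English mapping from the anchors only; no Tagalog labels.
    W = fit_linear_map(anchors_tl, anchors_en)

    # Train the detector on English only, then apply it to mapped Tagalog vectors.
    centroids = fit_centroids(X_en, y_en)
    pred = predict(centroids, X_tl @ W)
    return float((pred == y_tl).mean()), W


if __name__ == "__main__":
    accuracy, _ = demo()
    print(f"zero-shot accuracy on synthetic Tagalog users: {accuracy:.2f}")
```

Because the synthetic target space is an exact rotation of the source space, the mapping is recovered almost perfectly and the transferred detector should score close to 1.0 here. Real embedding spaces are only approximately aligned, so this toy result says nothing about the accuracy reported in the paper.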
