Illicit drug trafficking via social media sites such as Instagram has become a severe problem, drawing a great deal of attention from law enforcement and public health agencies. Identifying illicit drug dealers from social media data remains a technical challenge for two reasons. First, the available data are limited because of privacy concerns with crawling social media sites; second, the diversity of drug dealing patterns makes it difficult to reliably distinguish drug dealers from common drug users. Unlike existing methods that focus on posting-based detection, we tackle illicit drug dealer identification by constructing a large-scale multimodal dataset named Identifying Drug Dealers on Instagram (IDDIG). In total, nearly 4,000 user accounts, of which over 1,400 are drug dealers, were collected from Instagram with multiple data sources, including post comments, post images, homepage bio, and homepage images. We then design a quadruple-based multimodal fusion method that combines the multiple data sources associated with each user account for drug dealer identification. Experimental results on the IDDIG dataset demonstrate the effectiveness of the proposed method in identifying drug dealers (almost 95% accuracy). Moreover, we have developed a hashtag-based community detection technique for discovering evolving patterns, especially those related to geography and drug types.
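To make the idea of combining four per-account data sources concrete, the following is a minimal sketch and not the method proposed in the paper. The abstract does not specify the fusion operator, so the sketch assumes the simplest option: each source is already encoded as a numeric feature vector, each vector is L2-normalized, and the four are concatenated. The names `AccountFeatures`, `l2_normalize`, and `fuse_quadruple` are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class AccountFeatures:
    """Hypothetical container: one pre-extracted feature vector per data source."""
    post_comments: List[float]
    post_images: List[float]
    homepage_bio: List[float]
    homepage_images: List[float]


def l2_normalize(vec: List[float]) -> List[float]:
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def fuse_quadruple(account: AccountFeatures) -> List[float]:
    """Assumed fusion rule: normalize each source, then concatenate in a fixed order."""
    fused: List[float] = []
    for part in (
        account.post_comments,
        account.post_images,
        account.homepage_bio,
        account.homepage_images,
    ):
        fused.extend(l2_normalize(part))
    return fused


if __name__ == "__main__":
    # Toy vectors standing in for real text and image embeddings.
    example = AccountFeatures(
        post_comments=[3.0, 4.0],
        post_images=[0.0, 0.0],
        homepage_bio=[1.0],
        homepage_images=[2.0, 0.0],
    )
    print(fuse_quadruple(example))
```

Normalizing each source before concatenation keeps a source with large raw magnitudes from dominating the fused vector; the fused vector would then be passed to any standard classifier.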