Combating Fraud in Online Social Networks: Detecting Stealthy Facebook Like Farms

As businesses increasingly rely on social networking sites to engage with their customers, it is crucial to understand and counter reputation manipulation activities, including fraudulently boosting the number of Facebook page likes using like farms. To this end, several fraud detection algorithms have been proposed, and some deployed by Facebook, that use graph co-clustering to distinguish between genuine likes and those generated by farm-controlled profiles. However, as we show in this paper, these tools do not work well against stealthy farms whose users spread likes over longer timespans and like popular pages, aiming to mimic regular users. We present an empirical analysis of the graph-based detection tools used by Facebook and highlight their shortcomings against more sophisticated farms. Next, we focus on characterizing the content that social network accounts generate on their timelines, as an indicator of genuine versus fake social activity. We analyze a wide range of features extracted from timeline posts, which we group into two main classes: lexical and non-lexical. We postulate and verify that like farm accounts tend to re-share content often, use fewer words and a poorer vocabulary, and generate duplicate comments and likes more often than normal users. We extract relevant lexical and non-lexical features and use them to build a classifier to detect like farm accounts, achieving significantly higher accuracy, namely at least 99% precision and 93% recall.
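To make the timeline signals named in the abstract concrete, the following is a minimal sketch of how a few of them could be computed per account. It is not the paper's feature set or classifier: the input layout (a list of dicts with hypothetical `text` and `is_share` fields), the function name `timeline_features`, and the exact feature definitions are all assumptions chosen for illustration.

```python
import re
from collections import Counter


def timeline_features(posts):
    """Compute illustrative per-account timeline features.

    `posts` is a list of dicts with hypothetical keys:
      "text"     -> the post text (str)
      "is_share" -> True if the post re-shares existing content (bool)
    """
    n_posts = len(posts)
    if n_posts == 0:
        return {
            "avg_words_per_post": 0.0,
            "vocab_richness": 0.0,
            "share_ratio": 0.0,
            "duplicate_ratio": 0.0,
        }

    texts = [p["text"] for p in posts]
    # Lowercased word tokens across the whole timeline.
    words = [w for t in texts for w in re.findall(r"\w+", t.lower())]

    # Count repeated posts: every copy beyond the first is a duplicate.
    counts = Counter(t.strip().lower() for t in texts if t.strip())
    duplicates = sum(c - 1 for c in counts.values())

    return {
        # Lexical: how much the account writes per post.
        "avg_words_per_post": len(words) / n_posts,
        # Lexical: distinct words over total words (lower = poorer vocabulary).
        "vocab_richness": len(set(words)) / len(words) if words else 0.0,
        # Non-lexical: fraction of posts that are re-shares.
        "share_ratio": sum(1 for p in posts if p["is_share"]) / n_posts,
        # Non-lexical: fraction of posts that repeat an earlier post.
        "duplicate_ratio": duplicates / n_posts,
    }


example = [
    {"text": "Great page", "is_share": True},
    {"text": "Great page", "is_share": True},
    {"text": "Nice", "is_share": False},
    {"text": "great page", "is_share": True},
]
features = timeline_features(example)
```

A feature dictionary like this could then be fed, one row per account, into any standard supervised classifier; which model and which features the authors actually used is not stated in the abstract.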
