Identifying influencers in a given social network has become an important research problem for various applications, including accelerating the spread of information in viral marketing and preventing the spread of fake news and rumors. The literature contains a rich body of studies on identifying influential source spreaders, who can spread their own messages to many other nodes. In contrast, the identification of influential brokers, who can spread other nodes' messages to many nodes, has not been fully explored. Theoretical and empirical studies suggest that the involvement of both influential source spreaders and brokers is key to facilitating large-scale information diffusion cascades. Therefore, this paper explores ways to identify influential brokers in a given social network. Using three social media datasets, we investigate the characteristics of influential brokers by comparing them with influential source spreaders and with central nodes obtained from centrality measures. Our results show that (i) most of the influential source spreaders are not influential brokers (and vice versa) and (ii) the overlap between central nodes and influential brokers is small (less than 15%) in the Twitter datasets. We also tackle the problem of identifying influential brokers from centrality measures and node embeddings, and we examine the effectiveness of social network features in the broker identification task. Our results show that (iii) although a single centrality measure cannot characterize influential brokers well, prediction models using node embedding features achieve $F_1$ scores of 0.35--0.68, suggesting the effectiveness of social network features for identifying influential brokers.
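To make the two quantities reported above concrete, the following minimal sketch computes the overlap between two top-k node sets and the $F_1$ score of a predicted broker set. It is an illustration only: the abstract does not define how overlap is measured or how brokers are scored, so the function names, the overlap definition (shared nodes divided by the size of the smaller set), and the toy scores are all assumptions.

```python
def top_k(scores, k):
    """Return the set of the k highest-scoring nodes (ties broken by node id)."""
    ranked = sorted(scores, key=lambda n: (-scores[n], n))
    return set(ranked[:k])


def overlap_ratio(a, b):
    """Assumed definition: shared nodes divided by the size of the smaller set."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def f1_score(predicted, actual):
    """Harmonic mean of precision and recall for a predicted node set."""
    true_positives = len(predicted & actual)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(actual)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy scores, not taken from the paper's datasets.
centrality = {"a": 0.9, "b": 0.8, "c": 0.4, "d": 0.3, "e": 0.1}
broker_score = {"a": 2, "b": 40, "c": 55, "d": 1, "e": 30}

central_nodes = top_k(centrality, 2)   # the two most central nodes
brokers = top_k(broker_score, 2)       # the two strongest brokers

print(overlap_ratio(central_nodes, brokers))
print(f1_score(central_nodes, brokers))
```

In this toy case only one of the two central nodes is also a top broker, which mirrors the kind of mismatch described in result (ii).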