In this paper, we investigate the problem of detecting people's real-life influence from their Twitter accounts. We present an overview of the Twitter features commonly used to characterize such accounts and their activity, and show that they are ineffective in this context. In particular, retweet counts, follower counts, and the Klout score are not relevant to our analysis. We therefore propose several Machine Learning approaches based on Natural Language Processing and Social Network Analysis to label Twitter users as Influencers or not. We also rank them according to a predicted influence level. Our proposals are evaluated on the CLEF RepLab 2014 dataset and outperform state-of-the-art ranking methods.