In this paper we introduce a novel Bayesian approach for linking multiple social networks in order to identify accounts on different networks that belong to the same real-world person. In particular, we develop a latent model that allows us to jointly characterize the network and linkage structures, relying on both relational and profile data. In contrast to existing approaches in the machine learning literature, our Bayesian implementation naturally provides uncertainty quantification via posterior probabilities for the linkage structure itself or any function of it. Our findings suggest that our methodology can produce accurate point estimates of the linkage structure even in the absence of profile information. Moreover, in an identity resolution setting, our results confirm that incorporating relational data into the matching process improves linkage accuracy. We illustrate our methodology using real data from popular social networks such as Twitter, Facebook, and YouTube.