The deluge of digital information in our daily lives, from user-generated content such as microblogs and scientific papers to online business such as viral marketing and advertising, offers unprecedented opportunities to explore and exploit the trajectories and structures of evolving information cascades. Abundant research efforts, both academic and industrial, have aimed to better understand the mechanisms driving the spread of information and to quantify the outcome of information diffusion. This article presents a comprehensive review and categorization of information popularity prediction methods, from feature engineering and stochastic processes, through graph representation, to deep learning-based approaches. Specifically, we first formally define different types of information cascades and summarize the perspectives of existing studies. We then present a taxonomy that categorizes existing works into the aforementioned three main groups, as well as the main subclasses in each group, and we systematically review cutting-edge research work. Finally, we summarize the pros and cons of existing research efforts and outline the open challenges and opportunities in this field.