[VLDB 2019] A Comprehensive Survey Tutorial on Fake News Detection: 156 Slides to Get You Started in the Field

September 3, 2019, Zhuanzhi

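[Editor's note] VLDB 2019 (Very Large Data Bases) is one of the three top international database conferences (the other two being SIGMOD and ICDE). According to the conference's official figures, VLDB accepted 128 research papers, 22 industry papers, and 48 demos this year. VLDB 2019 was held in Los Angeles from August 26 to 30, 2019. Zhuanzhi recommends a comprehensive survey tutorial on fake news detection presented at VLDB 2019. The tutorial is given by three researchers from the database and social network communities and offers a panoramic view of the state of the art on fake news, including its detection, propagation, mitigation, and intervention. It covers research on data integration, truth discovery and fusion, probabilistic databases, knowledge graphs, and crowdsourcing, all viewed through the lens of fake news, which is itself an emerging research direction.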

Overview

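Fake news is a major threat to global democracy, eroding trust in government, journalism, and civil society. The popularity of social media and social networks has fueled the spread of fake news, with conspiracy theories, disinformation, and extreme views flourishing on these platforms. Detecting and mitigating fake news is one of the fundamental problems of our time and has attracted wide attention. Although fact-checking websites such as Snopes and PolitiFact, and major companies such as Google, Facebook, and Twitter, have taken initial steps to address fake news, much more remains to be done. Fake news is an interdisciplinary topic: communities ranging from machine learning, databases, journalism, and political science to many others study its various facets.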

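The tutorial has two goals. First, we wish to familiarize the database community with the efforts of other communities to combat fake news. We provide a panoramic view of the state of the art of research on its various aspects, including detection, propagation, mitigation, and intervention.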

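Second, we give a concise and intuitive summary of prior research by the database community and discuss how it can be used to counteract fake news.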

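The tutorial covers research from areas such as data integration, truth discovery and fusion, probabilistic databases, knowledge graphs, and crowdsourcing, viewed through the lens of fake news. Effective tools for addressing fake news can only be built by leveraging the synergy between the database community and other research communities. We hope that our tutorial provides an impetus towards such a synthesis of ideas and the creation of new ones.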
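
As a rough illustration of what "truth discovery" means in this setting, the sketch below alternates between picking the most trusted value for each item and re-estimating how trustworthy each source is. It is a toy example only, not the method of any paper covered in the tutorial; the function name `discover_truth`, the sample sources, the input format, and the smoothing constants are all assumptions made for illustration.

```python
from collections import defaultdict


def discover_truth(claims, iterations=10):
    """Toy truth discovery: weighted voting with iteratively updated source trust.

    claims: list of (source, item, value) triples (hypothetical input format).
    Returns (truths, trust): the chosen value per item and a trust score per source.
    """
    sources = {s for s, _, _ in claims}
    trust = {s: 0.5 for s in sources}  # assumed uniform starting trust
    truths = {}

    for _ in range(iterations):
        # Step 1: each item takes the value with the highest total trust.
        votes = defaultdict(lambda: defaultdict(float))
        for source, item, value in claims:
            votes[item][value] += trust[source]
        # sorted() makes tie-breaking deterministic.
        truths = {
            item: max(sorted(vals), key=vals.get) for item, vals in votes.items()
        }

        # Step 2: a source's trust is the smoothed share of its claims
        # that agree with the current truth estimates.
        agree = defaultdict(int)
        total = defaultdict(int)
        for source, item, value in claims:
            total[source] += 1
            agree[source] += int(truths[item] == value)
        trust = {s: (agree[s] + 1) / (total[s] + 2) for s in sources}

    return truths, trust


# Made-up claims from three hypothetical websites.
claims = [
    ("site_a", "capital_of_australia", "Canberra"),
    ("site_b", "capital_of_australia", "Sydney"),
    ("site_c", "capital_of_australia", "Canberra"),
    ("site_a", "boiling_point_c", "100"),
    ("site_b", "boiling_point_c", "90"),
    ("site_c", "boiling_point_c", "100"),
]
truths, trust = discover_truth(claims)
```

In this made-up data the source that disagrees with the others on both items ends up with the lowest trust score. Published truth discovery methods use more principled probabilistic models of source reliability and source dependence.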

Speakers

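Laks V.S. Lakshmanan is a professor in the Department of Computer Science at the University of British Columbia. He is a fellow of the BC Advanced Systems Institute and was named an ACM Distinguished Scientist in November 2016. His research interests span a wide range of topics in database systems and related areas, including relational and object-oriented databases, advanced data models for novel applications, OLAP and data warehousing, data mining, data integration, semi-structured data and XML, social networks and social media, recommender systems, and personalization.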

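Michael Simpson is a postdoctoral researcher in the Department of Computer Science at the University of British Columbia. He received his PhD from the University of Victoria. His research interests include data mining, social network analysis, and the design of scalable algorithms for graph problems.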

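Saravanan (Sara) Thirumuruganathan is a scientist in the Data Analytics group at QCRI, HBKU. He received his PhD from the University of Texas at Arlington. He is broadly interested in data integration and cleaning and in machine learning for data management. Saravanan's work has been selected among the best papers of VLDB 2018 and VLDB 2012 and received a SIGMOD Research Highlight Award in 2018.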

Related Literature

Fake News Primer

References

Lei Guo and Chris Vargo. "Fake News" and Emerging Online Media Ecosystem: An Integrated Intermedia Agenda-Setting Analysis of the 2016 U.S. Presidential Election. Communication Research, June 2018.
Wu, Agarwal, Li, Yang, and Yu. Computational Fact-Checking through Query Perturbations. ACM TODS 2017.

Propagation of Fake News

References

David Kempe, Jon Kleinberg, and Éva Tardos. Maximizing the spread of influence through a social network. KDD 2003.
Sejeong Kwon, Meeyoung Cha, Kyomin Jung, Wei Chen, and Yajun Wang. Prominent features of rumor propagation in online social media. ICDM 2013.
Soroush Vosoughi, Deb Roy and Sinan Aral. The spread of true and false news online. Science 2018.
Xinyi Zhou, Reza Zafarani. Fake News: A Survey of Research, Detection Methods, and Opportunities. arXiv preprint. 2018.

Detection of Fake News

References (Data Integration, Truth Discovery & Fusion)

Jing Gao, Qi Li, Bo Zhao, Wei Fan, and Jiawei Han. Truth discovery and crowdsourcing aggregation: A unified perspective. PVLDB 2015.
Yannis Katsis, Yannis Papakonstantinou. View-based data integration. Encyclopedia of Database Systems. 2009.
Theodoros Rekatsinas, Manas Joglekar, Hector Garcia-Molina, Aditya Parameswaran, and Christopher Ré. Slimfast: Guaranteed results for data fusion and source reliability. SIGMOD 2017.
Xin Luna Dong, Evgeniy Gabrilovich, Geremy Heitz, Wilko Horn, Kevin Murphy, Shaohua Sun, and Wei Zhang. From data fusion to knowledge fusion. PVLDB 2014.
Xin Luna Dong, Laure Berti-Equille, and Divesh Srivastava. Integrating conflicting data: the role of source dependence. PVLDB 2009.

References (ML-based Detection)

Subhabrata Mukherjee and Gerhard Weikum. Leveraging Joint Interactions for Credibility Analysis in News Communities. CIKM 2015.
SVN Vishwanathan, Nicol N Schraudolph, Risi Kondor, and Karsten M Borgwardt. Graph kernels. JMLR 2010.
Xinyi Zhou, Reza Zafarani, Kai Shu, and Huan Liu. Fake news: Fundamental theories, detection strategies and challenges. WSDM 2019.
Xinyi Zhou, Reza Zafarani. Fake News: A Survey of Research, Detection Methods, and Opportunities. arXiv preprint. 2018.
Ke Wu, Song Yang, and Kenny Q. Zhu. False rumors detection on Sina Weibo by propagation structures. ICDE 2015.

References (Knowledge Graph-based Approaches)

Akrami, Farahnaz, et al. "Re-evaluating Embedding-Based Knowledge Graph Completion Methods." Proceedings of the 27th ACM International Conference on Information and Knowledge Management. ACM, 2018.
Bordes, Antoine, et al. "Translating embeddings for modeling multi-relational data." Advances in neural information processing systems. 2013.
Chang, Lijun, et al. "Optimal enumeration: Efficient top-k tree matching." Proceedings of the VLDB Endowment 8.5 (2015): 533-544.
Cheng, Jiefeng, Xianggang Zeng, and Jeffrey Xu Yu. "Top-k graph pattern matching over large graphs." 2013 IEEE 29th International Conference on Data Engineering (ICDE). IEEE, 2013.
Ciampaglia, Giovanni Luca, et al. "Computational fact checking from knowledge networks." PloS one 10.6 (2015): e0128193.
Hamilton, Will, et al. "Embedding logical queries on knowledge graphs." Advances in Neural Information Processing Systems. 2018.
Jeh, Glen, and Jennifer Widom. "SimRank: a measure of structural-context similarity." KDD 2002.
Kazemi, Seyed Mehran, and David Poole. "Simple embedding for link prediction in knowledge graphs." Advances in Neural Information Processing Systems. 2018.
Lao, Ni, and William W. Cohen. "Relational retrieval using a combination of path-constrained random walks." Machine learning 81.1 (2010): 53-67.
Lin, Peng, et al. "Discovering graph patterns for fact checking in knowledge graphs." International Conference on Database Systems for Advanced Applications. Springer, Cham, 2018.
Lin, Yankai, et al. "Learning entity and relation embeddings for knowledge graph completion." Twenty-ninth AAAI conference on artificial intelligence. 2015.
Lü, Linyuan, Ci-Hang Jin, and Tao Zhou. "Similarity index based on local paths for link prediction of complex networks." Physical Review E 80.4 (2009): 046122.
Morales, Camilo, et al. "MateTee: A semantic similarity metric based on translation embeddings for knowledge graphs." International Conference on Web Engineering. Springer, Cham, 2017.
B. Shi and T. Weninger. Discriminative predicate path mining for fact checking in knowledge graphs. Knowledge-Based Systems, 104:123–133, 2016.
Shi, Baoxu, and Tim Weninger. "ProjE: Embedding projection for knowledge graph completion." Thirty-First AAAI Conference on Artificial Intelligence. 2017.
P. Shiralkar, A. Flammini, F. Menczer, and G. L. Ciampaglia. Finding streams in knowledge graphs to support fact checking. IEEE ICDM 2017, pp. 859–864.
Wang, Zhen, et al. "Knowledge graph embedding by translating on hyperplanes." Twenty-Eighth AAAI conference on artificial intelligence. 2014.
Xu, Zhongqi, Cunlai Pu, and Jian Yang. "Link prediction based on path entropy." Physica A: Statistical Mechanics and its Applications 456 (2016): 294-301.
Yang, Bishan, et al. "Embedding entities and relations for learning and inference in knowledge bases." arXiv preprint arXiv:1412.6575 (2014).
Yang, Shengqi, et al. "Schemaless and structureless graph querying." Proceedings of the VLDB Endowment 7.7 (2014): 565-576.
Yang, Shengqi, et al. "Fast top-k search in knowledge graphs." 2016 IEEE 32nd international conference on data engineering (ICDE). IEEE, 2016.

Mitigation and Intervention of Fake News

References

Bettencourt, Luís MA, et al. "The power of a good idea: Quantitative modeling of the spread of ideas from epidemiological models." Physica A: Statistical Mechanics and its Applications 364 (2006): 513-536.
Bharathi, Shishir, David Kempe, and Mahyar Salek. "Competitive influence maximization in social networks." International workshop on web and internet economics. Springer, Berlin, Heidelberg, 2007.
Budak, Ceren, Divyakant Agrawal, and Amr El Abbadi. "Limiting the spread of misinformation in social networks." Proceedings of the 20th international conference on World wide web. ACM, 2011.
Kempe, David, Jon Kleinberg, and Éva Tardos. "Maximizing the spread of influence through a social network." Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2003.
Khalil, Elias Boutros, Bistra Dilkina, and Le Song. "Scalable diffusion-aware optimization of network topology." Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2014.
Konstantinou, Loukas, Ana Caraban, and Evangelos Karapanos. "Combating Misinformation Through Nudging." Co-Inform Project. 2019.
Medya, Sourav, Arlei Silva, and Ambuj Singh. "Influence Minimization Under Budget and Matroid Constraints: Extended Version." arXiv preprint arXiv:1901.02156 (2019).
Nguyen, Nam P., et al. "Containment of misinformation spread in online social networks." Proceedings of the 4th Annual ACM Web Science Conference. ACM, 2012.
Prakash, B. Aditya, et al. "Threshold conditions for arbitrary cascade models on arbitrary networks." Knowledge and information systems 33.3 (2012): 549-575.
Prakash, B. Aditya, et al. "Fractional immunization in networks." Proceedings of the 2013 SIAM International Conference on Data Mining. Society for Industrial and Applied Mathematics, 2013.
Tong, Guangmo, et al. "An efficient randomized algorithm for rumor blocking in online social networks." IEEE Transactions on Network Science and Engineering (2017).
Tong, Hanghang, et al. "On the vulnerability of large graphs." 2010 IEEE International Conference on Data Mining. IEEE, 2010.
Vo, Nguyen, and Kyumin Lee. "The rise of guardians: Fact-checking url recommendation to combat fake news." The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval. ACM, 2018.
Zhang, Yao, and B. Aditya Prakash. "Dava: Distributing vaccines over networks under prior information." Proceedings of the 2014 SIAM International Conference on Data Mining. Society for Industrial and Applied Mathematics, 2014.
Zhang, Yao, and B. Aditya Prakash. "Data-aware vaccine allocation over large networks." ACM Transactions on Knowledge Discovery from Data (TKDD) 10.2 (2015): 20.
Zhao, Laijun, et al. "SIHR rumor spreading model in social networks." Physica A: Statistical Mechanics and its Applications 391.7 (2012): 2444-2453.

Future Opportunities

References

Lucas Graves. Understanding the promise and limits of automated fact-checking. Factsheet 2018.
Naeemul Hassan, Gensheng Zhang, Fatma Arslan, Josue Caraballo, Damian Jimenez, Siddhant Gawsane, Shohedul Hasan et al. ClaimBuster: the first-ever end-to-end fact-checking system. PVLDB 2017.
René Speck, Diego Esteves, Jens Lehmann, and Axel-Cyrille Ngonga Ngomo. DeFacto - a multilingual fact validation interface. ISWC 2015.

Original link:

https://combatingfakenewstutorial.github.io/vldb19.html

To get the download link for the full slide deck, follow the Zhuanzhi WeChat official account and reply "VLDB19FND" in the chat.

