Classic Computer Algorithms in Retrospect and Prospect: Machine Learning and Data Mining

October 11, 2019, China Computer Federation (CCF)

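This forum will be held on the second day of CNCC2019, the China National Computer Congress (October 18), in Rooms A303-304 of the Suzhou Jinji Lake International Convention Center. Well-known experts and scholars from the University of Illinois at Chicago (UIC), Rutgers, The State University of New Jersey, Microsoft Research Asia, Peking University, Tsinghua University, and other institutions in China and abroad will share and discuss with you the past, present, and future of algorithms in machine learning and data mining.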

Forum overview:

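In recent years, machine learning and data mining have become a popular direction in computer science. Machine learning derives knowledge and regularities from sample data and then applies them to practical inference and decision making; it is a data-driven approach. Data mining is the non-trivial process of uncovering implicit, previously unknown, and potentially valuable information from the large volumes of data held in databases. The two fields are similar in some respects and distinct in others. This forum focuses on the latest research results in machine learning and data mining from both academia and industry, and it also revisits the classic algorithms of the past several decades of computing. Well-known experts in the field are invited to review the history, look to the future, and discuss new directions for algorithm research in machine learning and data mining.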

Speakers and Talks

Philip S. Yu (俞士纶)

Talk title: Deep and Broad Learning on Detecting Neurological Disorder

Abstract: Neurological disorders have affected a third of the population in the US and put an enormous strain on the health care system. Mining neuroimaging data is becoming increasingly popular in healthcare and bioinformatics because of its potential to discover clinically meaningful structural patterns that could facilitate the understanding and diagnosis of neurological and neuropsychiatric disorders. Modern imaging techniques allow us to model the human brain as a network, or graph. A brain connectivity network can be constructed from neuroimaging data, where the nodes of the network correspond to a set of brain regions and the links represent the functional or structural connectivity between these regions. The linkage structure in brain networks can encode valuable information about the organizational properties of the human brain as a whole. Most recent research concentrates on applying subgraph mining techniques to discover connected subgraph patterns in the brain network. However, the underlying brain network structure is complicated. In this talk, we focus on how to learn representations that capture the high non-linearity of brain networks and preserve their underlying structures in order to detect neurological disorders.

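Biography: Philip S. Yu is a Distinguished Chair Professor at the University of Illinois at Chicago (UIC), a Fellow of the Association for Computing Machinery (ACM) and of the Institute of Electrical and Electronics Engineers (IEEE), and a Distinguished Professor at Tsinghua University. He worked for many years at the IBM Watson Research Center, where he founded its world-renowned data mining and data management department, and he is one of the IBM employees holding the most patents. A pioneer and leading scholar in databases, data mining, and data management, he has served as editor-in-chief or associate editor of several well-known international journals and as program committee chair or member for many top international conferences. He has published more than 970 papers in well-known international journals and major international conferences (such as SIGKDD, SIGMOD, WWW, and AAAI), holds more than 300 patents, and has an H-index of 138 on Google Scholar. From 1981 to 2018 he produced 1,094 research outputs; in 2018 he ranked ninth worldwide in computer science and electronics and second among scholars of Chinese descent. His main research interests include data mining, privacy-preserving publishing and mining, data streams, database systems, Internet applications and technologies, multimedia systems, parallel and distributed processing, and performance modeling.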

Hui Xiong (熊辉)

Talk title: Classic Clustering Algorithms to Live By

Abstract: Clustering is a traditional unsupervised learning task whose goal is to group a set of objects so that objects in the same cluster are more similar to each other than to those in other clusters. In this talk, we first introduce some classic clustering algorithms, such as K-means. The focus of the talk is to show that the original ideas behind these clustering algorithms came from real life, and how the algorithms can be applied to our everyday lives to help us make decisions more effectively.
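As a rough illustration of the K-means algorithm named in the abstract, the sketch below implements the standard assign-then-update loop (Lloyd's algorithm) in plain Python. It is not taken from the talk; the function name `kmeans`, its parameters, and the toy data are assumptions made for this example.

```python
import random


def kmeans(points, k, iters=100, seed=0):
    """Cluster equal-length numeric tuples into k groups (Lloyd's algorithm)."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k distinct input points chosen at random.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid
        # (squared Euclidean distance).
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(xs) / len(members) for xs in zip(*members))
    return labels, centroids


# Hypothetical toy data: two well-separated groups of 2-D points.
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = kmeans(data, k=2)
```

On this toy data the first three points end up in one cluster and the last three in the other; which cluster receives label 0 depends on the random initialization.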

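Biography: Hui Xiong is a tenured full professor and RBS Dean's Chair Professor at Rutgers, The State University of New Jersey. He is currently on academic leave, serving as head of the Baidu Business Intelligence Lab and the Baidu Talent Intelligence Center. His main research areas are data mining, business intelligence, and big data for management. His honors include ACM Distinguished Scientist, Changjiang Chair Professor of the Ministry of Education of China, the National Natural Science Foundation of China Distinguished Young Scholars award, Category B (Joint Research Fund for Overseas Chinese Scholars and Scholars in Hong Kong and Macao), the Grand Prize of the 2018 Harvard Business Review "Ram Charan Management Practice Award", the 2017 IEEE ICDM Outstanding Service Award, and the ICDM-2011 Best Research Paper Award. He is a co-editor-in-chief of the Encyclopedia of GIS (Springer) and serves on the editorial boards of IEEE Transactions on Big Data (TBD), ACM Transactions on Knowledge Discovery from Data (TKDD), and ACM Transactions on Management Information Systems (TMIS). He has served as program co-chair of the Industry and Government Track of ACM KDD 2012, general co-chair of the 2018 China Big Data Technology Conference, program co-chair of IEEE ICDM 2013, general co-chair of IEEE ICDM 2015, and program chair of the Research Track of ACM KDD-2018.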

Liwei Wang (王立威)

Talk title: Machine Learning Classics: Theory and Algorithms

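Abstract: In this talk I will introduce classic theory in machine learning and two classic algorithms inspired by that theory, SVM and Boosting. I will describe the origin and development of the ideas behind these algorithms and discuss how to choose a suitable algorithm in practice. Finally, I will share my views on the relationship between deep learning and theory.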

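Biography: Liwei Wang is a professor at the School of Electronics Engineering and Computer Science, Peking University. His research centers on machine learning theory. He has published more than 100 papers in leading international machine learning journals and conferences. He serves on the editorial board of IEEE TPAMI, a top journal in machine learning and computer vision, and has served multiple times as an area chair of NeurIPS (NIPS) and ICML, the flagship international machine learning conferences. He was named to AI's 10 to Watch, the first Chinese scholar to receive this honor since the award was established. He received funding in the first round of the National Natural Science Foundation of China Excellent Young Scientists Fund. He led the team that won the finals of the first Tianchi AI medical competition.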

Wei Chen (陈卫)

Talk title: Influence Maximization: Integrating and Expanding Classical Algorithms into the Social Network Context

Abstract: Influence maximization is the task of selecting k seed nodes in a social network such that the influence spread of the seeds is maximized. It models the viral marketing scenario and can also be applied to other scenarios such as cascade monitoring and rumor control. Since it was proposed in 2003, influence maximization and its variants have been studied extensively, and the area is still actively growing. Influence maximization is also a nice demonstration of how classical algorithms can be integrated into the social network context. In this talk, I will first introduce the core research problems and major results in influence maximization. Then, through several examples, I will demonstrate how classical algorithms, such as the greedy algorithm, Dijkstra’s shortest path algorithm, and UCB for multi-armed bandits, are integrated into influence maximization algorithms, what new research challenges arise during this integration, and how we address them, in some cases by expanding the classical algorithms to fit the new settings.
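To make the seed-selection task concrete, here is a minimal sketch of the greedy approach mentioned in the abstract. It is an illustration, not the speaker's algorithm. It assumes the independent cascade diffusion model with one uniform activation probability `p` and estimates influence spread by Monte Carlo simulation. The abstract specifies none of this, so all function names, parameters, and the toy graph are hypothetical.

```python
import random


def simulate_cascade(graph, seeds, p, rng):
    """Run one independent-cascade simulation; return the number of active nodes."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in graph.get(u, ()):
                # Each newly active node gets one chance to activate each neighbor.
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return len(active)


def estimate_spread(graph, seeds, p, runs, rng):
    """Average the cascade size over several simulated runs."""
    return sum(simulate_cascade(graph, seeds, p, rng) for _ in range(runs)) / runs


def greedy_seeds(graph, k, p=0.1, runs=200, seed=0):
    """Pick up to k seeds, each time adding the node with the best estimated spread."""
    rng = random.Random(seed)
    nodes = set(graph) | {v for nbrs in graph.values() for v in nbrs}
    chosen = []
    for _ in range(min(k, len(nodes))):
        candidates = sorted(nodes - set(chosen))  # sorted for deterministic tie-breaking
        best = max(candidates,
                   key=lambda v: estimate_spread(graph, chosen + [v], p, runs, rng))
        chosen.append(best)
    return chosen


# Hypothetical toy network: adjacency lists of directed edges.
toy_graph = {"a": ["b", "c", "d"], "e": ["f"]}
print(greedy_seeds(toy_graph, k=2, p=1.0, runs=1))
```

With `p=1.0` every edge always fires, so spread reduces to reachability and the call selects `a` first and then `e`. With smaller `p` the result depends on the simulated cascades, and `runs` trades accuracy for running time.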

Biography: Wei Chen is a Principal Researcher at Microsoft Research Asia, an Adjunct Professor at Tsinghua University, and an Adjunct Researcher at the Chinese Academy of Sciences. His main research interests include social and information networks, online learning, network game theory and economics, distributed computing, and fault tolerance. He has conducted extensive research on modeling and algorithmic studies of information and influence propagation in social networks, with a series of publications in top conferences and journals that have received more than 6,600 citations in aggregate. He coauthored the 2013 monograph “Information and Influence Propagation in Social Networks” and is the sole author of the upcoming monograph “Big Data Network Diffusion Models and Algorithms” (in Chinese). He is a member of the CCF Task Force on Big Data and the CCF Technical Committee on Theoretical Computer Science. He has served on the program committees of many top conferences in data mining, machine learning, and artificial intelligence. Wei Chen holds bachelor's and master's degrees in computer science from Tsinghua University and a Ph.D. in computer science from Cornell University. For more information, visit his home page at http://research.microsoft.com/en-us/people/weic/.

Jie Tang (唐杰)

Talk title: Graph Neural Network (GNN) Algorithms and Their Applications

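Abstract: Graph neural networks extend deep learning methods to non-Euclidean graph data and have greatly improved accuracy in graph data applications. In this talk, I will briefly review the graph convolutional network (GCN) and discuss how to improve its representation learning ability on graph data. Our research finds that a few neat, simple techniques can effectively improve the representational power of GCN, and that the resulting method can be expressed equivalently as a graph attention network (GAT). The effectiveness of the method has been verified on several very large-scale datasets, including data from Alibaba.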

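Biography: Jie Tang is a professor and associate chair of the Department of Computer Science and Technology at Tsinghua University and a recipient of the National Science Fund for Distinguished Young Scholars. His research interests include data mining, social networks, and knowledge graphs. He has published more than 200 papers, cited more than 10,000 times (personal h-index of 57). He led the development of AMiner, a system for mining researcher social networks, which has attracted visits from more than 10 million unique IP addresses across 220 countries and regions. He has served as executive editor-in-chief of the international journal ACM TKDD, as PC chair of the international conferences CIKM’16 and WSDM’15, and as vice chair of KDD’18. As first contributor, he received the first prize of the Beijing Science and Technology Progress Award, the first prize of the Chinese Association for Artificial Intelligence Science and Technology Progress Award, and the KDD Outstanding Contribution Award.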

Forum Chairs

Junyu Lin (林俊宇)

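Biography: Junyu Lin holds a Ph.D. and is an associate researcher. He was a postdoctoral fellow at the Institute of Information Engineering, Chinese Academy of Sciences, where he is assistant director of the Cyberspace Technology Laboratory. He is also a CCF council member and senior member, an AC member of YOCSEF headquarters, a standing member of the CCF Technical Committee on Computer Applications, and a member of the CCF Young Professionals Committee. His main research directions are network security, future networks, and knowledge engineering. He currently has one National Natural Science Foundation of China project, one provincial fund project, and eight industry-sponsored projects in progress. He has received one provincial or ministerial second prize for scientific and technological progress and one second prize for technological invention, and he has been granted 24 invention patents and 4 software copyrights. He has published more than 50 papers in leading journals and conferences in China and abroad, including WWW, TIP, IEEE publications, and the Journal of Software.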

Jie Tang (唐杰)

Biography: see his speaker entry above.