As machine learning for combinatorial optimization advances, classical problems resurface and are revisited from this new perspective. The overwhelming majority of the literature focuses on small graph problems, while several real-world problems involve large graphs. Here, we focus on two such problems: influence estimation, a #P-hard counting problem, and influence maximization, an NP-hard problem. We develop GLIE, a Graph Neural Network (GNN) that inherently parameterizes an upper bound of influence estimation, and train it on small simulated graphs. Experiments show that GLIE provides accurate influence estimation for real graphs up to 10 times larger than those in the training set. More importantly, it can be used for influence maximization on considerably larger graphs, as the ranking of its predictions is not affected by the drop in accuracy. We develop a version of Cost-Effective Lazy Forward (CELF) optimization that uses GLIE instead of simulated influence estimation, surpassing the benchmark for influence maximization, although with a computational overhead. To balance time complexity and quality of influence, we propose two different approaches. The first is a Q-network that learns to choose seeds sequentially using GLIE's predictions. The second defines a provably submodular function based on GLIE's representations to rank nodes quickly while building the seed set. The latter provides the best combination of time efficiency and influence spread, outperforming state-of-the-art benchmarks.
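To make the role of the influence estimator concrete, here is a minimal sketch of lazy greedy (CELF-style) seed selection with a pluggable estimator. It is an illustration under stated assumptions, not the authors' implementation: the `estimate(graph, seeds)` callable is a hypothetical interface, and the Monte Carlo independent-cascade simulator `simulate_ic` (with an assumed uniform activation probability `p`) merely stands in for the simulated estimation that GLIE is said to replace. A learned estimator could be passed in the same position.

```python
import heapq
import random


def simulate_ic(graph, seeds, p=0.1, runs=200, rng=None):
    """Monte Carlo estimate of spread under an independent-cascade model.

    Assumption: `graph` maps every node to a list of out-neighbours, and
    every edge activates with the same probability `p`.
    """
    rng = rng or random.Random(0)
    total = 0
    for _ in range(runs):
        active = set(seeds)
        frontier = list(seeds)
        while frontier:
            next_frontier = []
            for u in frontier:
                for v in graph.get(u, ()):
                    if v not in active and rng.random() < p:
                        active.add(v)
                        next_frontier.append(v)
            frontier = next_frontier
        total += len(active)
    return total / runs


def celf(graph, k, estimate):
    """Lazy greedy seed selection.

    `estimate(graph, seeds)` is a hypothetical callable returning the
    estimated spread of a seed list; any estimator can be plugged in.
    """
    # Heap entries: (negated marginal gain, node, seed-set size when computed).
    heap = [(-estimate(graph, [v]), v, 0) for v in graph]
    heapq.heapify(heap)
    seeds, spread = [], 0.0
    while len(seeds) < k and heap:
        neg_gain, v, stamp = heapq.heappop(heap)
        if stamp == len(seeds):
            # Gain is up to date for the current seed set. If the estimator
            # is submodular, no stale entry below it can beat this gain.
            seeds.append(v)
            spread += -neg_gain
        else:
            # Stale gain: recompute against the current seed set and reinsert.
            gain = estimate(graph, seeds + [v]) - spread
            heapq.heappush(heap, (-gain, v, len(seeds)))
    return seeds, spread
```

The lazy re-evaluation is only guaranteed to match plain greedy selection when the estimator is submodular. A noisy simulator or a learned model may violate this, in which case the procedure acts as a heuristic.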