Causal Lifting and Link Prediction

Current state-of-the-art causal models for link prediction assume an underlying set of inherent node factors -- innate characteristics defined at the node's birth -- that govern the causal evolution of links in the graph. In some causal tasks, however, link formation is path-dependent, i.e., the outcome of link interventions depends on existing links. For instance, in the customer-product graph of an online retailer, the effect of an 85-inch TV ad (treatment) likely depends on whether the customer already has an 85-inch TV. Unfortunately, existing causal methods are impractical in these scenarios: the cascading functional dependencies between links (due to path dependence) are either unidentifiable or require an impractical number of control variables. To remedy this shortcoming, this work develops the first causal model capable of dealing with path dependencies in link prediction. It introduces the concept of causal lifting, an invariance in causal models that, when satisfied, allows the identification of causal link prediction queries using limited interventional data. On the estimation side, we show how structural pairwise embeddings -- a type of symmetry-based joint representation of node pairs in a graph -- exhibit lower bias and correctly represent the causal structure of the task, as opposed to existing node embedding methods, e.g., GNNs and matrix factorization. Finally, we validate our theoretical findings on four datasets under three different scenarios for causal link prediction tasks: knowledge base completion, covariance matrix estimation and consumer-product recommendations.

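Related content

Link prediction

Link prediction in networks is the problem of using known information, such as the network's nodes and structure, to estimate the likelihood that a link exists or will form between two nodes that are not yet connected. It covers both the prediction of links that exist but are unobserved (existent yet unknown links) and the prediction of links that will form in the future (future links). Research on this problem is significant and valuable both in theory and in applications.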
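
As a minimal sketch of the link prediction task defined above, the snippet below scores every unlinked node pair by its number of shared neighbors. This is the classical common-neighbors heuristic, used here only for illustration; it is not the causal method of the paper, and the function name `common_neighbor_scores` and the toy edge list are assumptions made for the example.

```python
from itertools import combinations


def common_neighbor_scores(edges):
    """Score each unlinked node pair by its number of shared neighbors.

    Illustrative heuristic only (hypothetical helper, not from the paper).
    `edges` is an iterable of undirected (u, v) pairs.
    """
    # Build an adjacency map from the undirected edge list.
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    scores = {}
    # Consider every pair of nodes that is not already linked.
    for u, v in combinations(sorted(neighbors), 2):
        if v not in neighbors[u]:
            scores[(u, v)] = len(neighbors[u] & neighbors[v])
    return scores


# Toy undirected graph (assumed data, for illustration only).
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
scores = common_neighbor_scores(edges)

# Rank candidate links from most to least likely under this heuristic.
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, score)
```

A higher score marks a pair as a more plausible missing or future link. Scores of this kind describe association in the observed graph and say nothing about the effect of an intervention on a link.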
