A probabilistic autoencoder for causal discovery

The paper addresses the problem of finding the causal direction between two associated variables. The proposed solution is to build an autoencoder of their joint distribution and to maximize its estimation capacity relative to each of the two marginal distributions. It is shown that the resulting two capacities cannot, in general, be equal. This leads to a new criterion for causal discovery: the higher capacity is consistent with the unconstrained choice of a distribution representing the cause, while the lower capacity reflects the constraints imposed by the mechanism on the distribution of the effect. Estimation capacity is defined as the ability of the autoencoder to represent arbitrary datasets. A regularization term forces the autoencoder to decide which of the two variables to model in a more generic way, i.e., while maintaining higher model capacity. The causal direction is revealed by the constraints encountered while encoding the data, instead of being measured as a property of the data itself. The idea is implemented and tested using a restricted Boltzmann machine.

Related content

Autoencoder

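An autoencoder is a type of artificial neural network used to learn efficient data encodings in an unsupervised manner. Its aim is to learn a representation (encoding) of a set of data, typically for dimensionality reduction, by training the network to ignore signal "noise". Alongside the reduction side, a reconstruction side is learned, in which the autoencoder tries to generate, from the reduced encoding, a representation as close as possible to its original input; this is where the name comes from. Several variants of the basic model exist, each intended to force the learned representation of the input to have useful properties. Autoencoders are used effectively in many applied problems, from face recognition to capturing the semantic meaning of words.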
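
The reduction and reconstruction sides described above can be illustrated with a minimal sketch. It is a generic linear autoencoder trained by plain gradient descent, not the restricted Boltzmann machine used in the paper, and it does not implement the paper's capacity criterion. The toy data and the names make_toy_data, reconstruct, mean_squared_error and train are assumptions introduced only for this illustration.

```python
import random


def make_toy_data(n=200, noise=0.05, seed=0):
    """Hypothetical toy data: 2-D points lying close to the line x2 = 2 * x1."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        t = rng.gauss(0.0, 1.0)
        data.append((t + rng.gauss(0.0, noise), 2.0 * t + rng.gauss(0.0, noise)))
    return data


def reconstruct(x, enc, dec):
    """Reduction side: 2-D point -> one number. Reconstruction side: back to 2-D."""
    code = enc[0] * x[0] + enc[1] * x[1]
    return code, (dec[0] * code, dec[1] * code)


def mean_squared_error(data, enc, dec):
    """Average squared reconstruction error per coordinate."""
    total = 0.0
    for x in data:
        _, x_hat = reconstruct(x, enc, dec)
        total += (x_hat[0] - x[0]) ** 2 + (x_hat[1] - x[1]) ** 2
    return total / (2 * len(data))


def train(data, lr=0.05, epochs=500, seed=1):
    """Fit encoder and decoder weights by full-batch gradient descent."""
    rng = random.Random(seed)
    enc = [rng.uniform(-0.1, 0.1) for _ in range(2)]
    dec = [rng.uniform(-0.1, 0.1) for _ in range(2)]
    n = len(data)
    for _ in range(epochs):
        g_enc = [0.0, 0.0]
        g_dec = [0.0, 0.0]
        for x in data:
            code, x_hat = reconstruct(x, enc, dec)
            err = (x_hat[0] - x[0], x_hat[1] - x[1])
            # Error signal passed back through the decoder to the code.
            back = err[0] * dec[0] + err[1] * dec[1]
            for j in range(2):
                g_dec[j] += err[j] * code / n
                g_enc[j] += back * x[j] / n
        for j in range(2):
            enc[j] -= lr * g_enc[j]
            dec[j] -= lr * g_dec[j]
    return enc, dec
```

Calling train(make_toy_data()) returns encoder and decoder weights; comparing mean_squared_error before and after training shows how much of the two-dimensional data survives compression into a single number. The one-number code is the "reduced encoding", and the small residual error corresponds to the noise the network learns to ignore.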
