Recent years have seen rapid progress at the intersection of causality and machine learning. Motivated by scientific applications involving high-dimensional data, particularly in biomedicine, we propose a deep neural architecture for learning causal relationships between variables from a combination of empirical data and prior causal knowledge. We combine convolutional and graph neural networks within a causal risk framework to provide a flexible and scalable approach. Empirical results include linear and nonlinear simulations, where the underlying causal structures are known and serve as a direct basis for comparison, as well as a real biological example in which the models are applied to high-dimensional molecular data and their output is compared against entirely unseen validation experiments. These results demonstrate the feasibility of using deep learning approaches to learn causal networks in large-scale problems spanning thousands of variables.