The DAG (Directed Acyclic Graph) used in causal inference does not differentiate causal effects from correlated changes, and the overall effect on a population is usually approximated by averaging correlations over all individuals. Now that AI (Artificial Intelligence) enables large-scale structural modeling on big data, complex hidden confounders make these approximation errors no longer negligible; they snowball into considerable modeling bias. This Causal Representation Bias (CRB) causes many problems: causal models that do not generalize, individual-level features that remain unrevealed, causal knowledge that is hard to exploit in DL (Deep Learning), and so on. In short, the DAG must be redefined to enable a new framework for causal AI. Observational time series in statistics can only represent correlated changes, whereas a DL-based autoencoder can represent them as individualized feature changes in latent space and thereby estimate causal effects directly. In this paper, we introduce the redefined do-DAG to visualize CRB, propose the Causal Representation Learning (CRL) framework as a generic solution together with a novel architecture for its realization, and experimentally verify its feasibility.
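To make the contrast between population-averaged correlations and individualized latent-feature changes concrete, the sketch below trains a simple fully connected autoencoder on per-individual observation windows and reads the change between two windows as a difference of latent codes. This is a minimal illustration under assumed names and architecture (TimeSeriesAutoencoder, latent_change, window length, latent dimension are all hypothetical), not the CRL architecture proposed in the paper.

```python
# Minimal sketch (illustrative only, not the paper's CRL architecture):
# encode observational time-series windows into a latent space with a
# simple autoencoder, then read individual-level feature changes as the
# difference between latent codes before and after a change point.
import torch
import torch.nn as nn


class TimeSeriesAutoencoder(nn.Module):
    def __init__(self, window_len: int, latent_dim: int = 8):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(window_len, 64), nn.ReLU(),
            nn.Linear(64, latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 64), nn.ReLU(),
            nn.Linear(64, window_len),
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z


def latent_change(model, window_before, window_after):
    """Individual-level feature change: difference of latent codes."""
    with torch.no_grad():
        _, z0 = model(window_before)
        _, z1 = model(window_after)
    return z1 - z0


if __name__ == "__main__":
    torch.manual_seed(0)
    window_len, n_individuals = 32, 16
    model = TimeSeriesAutoencoder(window_len)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()

    # Synthetic observational windows, one row per individual.
    x = torch.randn(n_individuals, window_len)
    for _ in range(200):  # reconstruction training
        recon, _ = model(x)
        loss = loss_fn(recon, x)
        opt.zero_grad()
        loss.backward()
        opt.step()

    # Per-individual latent change between two observation windows,
    # rather than a population-averaged correlation.
    delta = latent_change(model, x, x + 0.1 * torch.randn_like(x))
    print(delta.shape)  # (n_individuals, latent_dim)
```

The point of the sketch is only that the latent difference is computed per individual, whereas a correlation-based summary would average these changes over the whole population.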