Causal DAGs (Directed Acyclic Graphs) are usually considered in a 2D plane. Edges indicate the direction of causal effects and imply the corresponding passage of time. Owing to the inherent restrictions of statistical models, effect estimation is usually approximated by averaging individuals' correlations, i.e., observational changes over a specific time span. However, in Machine Learning on large-scale problems with complex DAGs, such slight biases can snowball and distort global models. More importantly, this has in practice impeded the development of AI, for instance through the weak generalizability of causal models. In this paper, we redefine the causal DAG as the \emph{do-DAG}, in which variables' values are no longer time-stamp-dependent and timelines can be seen as axes. Through a geometric explanation of the multi-dimensional do-DAG, we identify the \emph{Causal Representation Bias} and its necessary factors, and differentiate it from common confounding biases. Accordingly, we propose a DL (Deep Learning)-based framework as a general solution, along with a realization method and experiments to verify its feasibility.