Causal investigations in observational studies pose a major challenge in scientific research when randomized trials or intervention-based studies are not feasible. Leveraging Shannon's seminal work on information theory, we develop a causal discovery framework of "predictive asymmetry" for bivariate data $(X, Y)$. Predictive asymmetry is a central concept in information geometric causal inference; it enables assessment of whether $X$ is a stronger predictor of $Y$ or vice versa. We propose a new metric called the Asymmetric Mutual Information ($AMI$) and establish its key statistical properties. The $AMI$ not only detects complex non-linear association patterns in bivariate data, but also detects and quantifies predictive asymmetry. Our proposed methodology relies on scalable non-parametric density estimation using the fast Fourier transform. The resulting estimation method is many times faster than classical bandwidth-based density estimation, while maintaining comparable mean integrated squared error rates. We investigate key asymptotic properties of the $AMI$ methodology; a new data-splitting technique is developed to make statistical inference on predictive asymmetry using the $AMI$. We illustrate the performance of the $AMI$ methodology through simulation studies as well as multiple real data examples.
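To make the two ingredients named above concrete, the following is a minimal sketch of classical (symmetric) Shannon mutual information computed from a binned Gaussian kernel density estimate, where the smoothing convolution is carried out with the fast Fourier transform. It is not the $AMI$ itself, whose definition is not given in this abstract, and it is not the authors' estimator. The function names `fft_kde_2d` and `mutual_information`, the grid size, the rule-of-thumb bandwidth, and the four-bandwidth padding are all assumptions made for illustration.

```python
import numpy as np


def fft_kde_2d(x, y, grid_size=128, bandwidth=None):
    """Binned Gaussian KDE on a regular grid; the convolution is done by FFT.

    Hypothetical helper for illustration only, not the paper's estimator.
    Returns the density on the grid and the cell widths (dx, dy).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    if bandwidth is None:
        # Scott-type rule of thumb for two dimensions (an assumption).
        bandwidth = (x.std() * n ** (-1 / 6), y.std() * n ** (-1 / 6))
    hx, hy = bandwidth

    # Pad the grid by four bandwidths so circular wrap-around is negligible.
    x_rng = [x.min() - 4 * hx, x.max() + 4 * hx]
    y_rng = [y.min() - 4 * hy, y.max() + 4 * hy]
    counts, x_edges, y_edges = np.histogram2d(
        x, y, bins=grid_size, range=[x_rng, y_rng]
    )
    dx = x_edges[1] - x_edges[0]
    dy = y_edges[1] - y_edges[0]

    # Gaussian kernel sampled at grid offsets, laid out in FFT (wrap-around) order.
    off_x = np.fft.fftfreq(grid_size) * grid_size * dx
    off_y = np.fft.fftfreq(grid_size) * grid_size * dy
    kernel = np.exp(
        -0.5 * (off_x[:, None] / hx) ** 2 - 0.5 * (off_y[None, :] / hy) ** 2
    )
    kernel /= kernel.sum()

    # Circular convolution of the bin counts with the kernel via the FFT.
    smoothed = np.fft.irfft2(
        np.fft.rfft2(counts) * np.fft.rfft2(kernel), s=counts.shape
    )
    smoothed = np.clip(smoothed, 0.0, None)  # remove tiny negative round-off
    density = smoothed / (smoothed.sum() * dx * dy)
    return density, dx, dy


def mutual_information(x, y, grid_size=128):
    """Plug-in estimate of Shannon mutual information I(X; Y) in nats."""
    pxy, dx, dy = fft_kde_2d(x, y, grid_size=grid_size)
    px = pxy.sum(axis=1) * dy  # marginal density of X on the grid
    py = pxy.sum(axis=0) * dx  # marginal density of Y on the grid
    outer = px[:, None] * py[None, :]
    mask = (pxy > 0) & (outer > 0)
    return float(np.sum(pxy[mask] * np.log(pxy[mask] / outer[mask])) * dx * dy)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=2000)
    y_dep = x ** 2 + 0.1 * rng.normal(size=2000)  # non-linear dependence
    y_ind = rng.normal(size=2000)                 # independent of x
    print("MI, dependent pair:  ", mutual_information(x, y_dep))
    print("MI, independent pair:", mutual_information(x, y_ind))
```

Because this quantity is symmetric in $X$ and $Y$, it can flag a non-linear association such as the quadratic one in the usage example, but it cannot by itself indicate which variable is the stronger predictor; that directional information is what the $AMI$ is proposed to supply.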