Causal investigations in observational studies pose a great challenge in scientific research where randomized trials or intervention-based studies are not feasible. Leveraging Shannon's seminal work on information theory, we consider a framework of asymmetry where any causal link between putative cause and effect must be explained through a mechanism governing the cause as well as a generative process yielding an effect of the cause. Under weak assumptions, this framework enables the assessment of whether X is a stronger predictor of Y or vice-versa. Under stronger identifiability assumptions our framework is able to distinguish between cause and effect using observational data. We establish key statistical properties of this framework. Our proposed methodology relies on scalable non-parametric density estimation using fast Fourier transformation. The resulting estimation method is manyfold faster than the classical bandwidth-based density estimation while maintaining comparable mean integrated squared error rates. We investigate key asymptotic properties of our methodology and introduce a data-splitting technique to facilitate inference. The key attraction of our framework is its inference toolkit, which allows researchers to quantify uncertainty in causal discovery findings. We illustrate the performance of our methodology through simulation studies as well as multiple real data examples.
翻译:暂无翻译