Inferring causal direction from discrete and categorical data is an important yet challenging problem. Although the additive noise model (ANM) approach can be adapted to discrete data, its functional-structure assumptions make it inapplicable to categorical data. Inspired by the principle that the cause and the mechanism are independent, various methods have been developed that leverage independence tests, such as those based on the distance correlation measure. In this work, we take an alternative perspective and propose a subsampling-based method to test the independence between the generating scheme of the cause and that of the mechanism. Our methodology works for both discrete and categorical data and does not assume any functional model of the data, making it a more flexible approach. To demonstrate its efficacy, we compare it with existing baselines in experiments on various synthetic and real data.