Causal discovery aims to learn cause-effect relationships among variables from observational data, and it is important for many applications. Existing causal discovery methods assume that sufficient data are available, which is often not the case for real-world datasets. As a result, many of these methods can fail when data are limited. In this work, we propose Bayesian-augmented frequentist independence tests to improve the performance of constraint-based causal discovery methods under insufficient data: 1) we first introduce a Bayesian method for estimating mutual information (MI), on which we build a robust MI-based independence test; 2) we then consider Bayesian estimation of the hypothesis likelihood and incorporate it into a well-defined statistical test, yielding a robust independence test based on statistical testing. We apply the proposed independence tests to constraint-based causal discovery methods and evaluate their performance on benchmark datasets with insufficient samples. Experiments show significant improvements in both accuracy and efficiency over state-of-the-art (SOTA) methods.
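The abstract does not specify the estimator, so the following is only a minimal sketch of what a Bayesian MI-based independence test could look like, not the authors' method. It assumes discrete variables, a symmetric Dirichlet prior over the joint distribution, and a fixed decision threshold, and it tests marginal rather than conditional independence. The names `bayesian_mi`, `is_independent`, `alpha`, and `threshold` are hypothetical.

```python
import numpy as np


def bayesian_mi(x, y, alpha=1.0):
    """Mutual information (in nats) of two discrete samples.

    Assumption: the joint distribution is replaced by its posterior mean
    under a symmetric Dirichlet(alpha) prior, which smooths the empirical
    counts and avoids zero probabilities when samples are scarce.
    """
    # Map arbitrary labels to integer codes 0..k-1.
    _, xi = np.unique(np.asarray(x).ravel(), return_inverse=True)
    _, yi = np.unique(np.asarray(y).ravel(), return_inverse=True)
    kx, ky = xi.max() + 1, yi.max() + 1

    # Contingency table of joint counts.
    counts = np.zeros((kx, ky))
    np.add.at(counts, (xi, yi), 1)

    # Posterior-mean joint distribution and its marginals.
    joint = (counts + alpha) / (counts.sum() + alpha * kx * ky)
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)

    # MI as the KL divergence between the joint and the product of marginals.
    return float(np.sum(joint * np.log(joint / (px * py))))


def is_independent(x, y, threshold=0.01, alpha=1.0):
    """Declare independence when the smoothed MI falls below `threshold`.

    The threshold is a hypothetical tuning parameter, not a value from
    the paper.
    """
    return bayesian_mi(x, y, alpha=alpha) < threshold
```

In a constraint-based method such as the PC algorithm, a test of this kind would be called on pairs of variables (and, in a conditional variant, within each stratum of a conditioning set) to decide which edges to remove.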