Causal discovery for quantitative data has been extensively studied, but much less is known for categorical data. We propose a novel causal model for categorical data based on a new classification model, termed classification with optimal label permutation (COLP). By design, COLP is a parsimonious classifier, which gives rise to a provably identifiable causal model. A simple learning algorithm that compares the likelihoods of the causal and anti-causal models suffices to learn the causal direction. Through experiments with synthetic and real data, we demonstrate the favorable performance of the proposed COLP-based causal model compared to state-of-the-art methods. We also make available an accompanying R package, COLP, which contains the proposed causal discovery algorithm and a benchmark dataset of categorical cause-effect pairs.