Convenient access to copious multi-faceted data has encouraged machine learning researchers to reconsider correlation-based learning and embrace the opportunity of causality-based learning, i.e., causal machine learning (causal learning). Recent years have therefore witnessed great effort in developing causal learning algorithms that aim to help AI achieve human-level intelligence. Because ground-truth data are lacking, one of the biggest challenges in current causal learning research is algorithm evaluation. This largely impedes the cross-pollination of AI and causal inference and prevents each field from benefiting from the advances of the other. To bridge from conventional causal inference (i.e., based on statistical methods) to causal learning with big data (i.e., the intersection of causal inference and machine learning), in this survey we review commonly used datasets, evaluation methods, and measures for causal learning, using an evaluation pipeline similar to that of conventional machine learning. We focus on the two fundamental causal inference tasks and on causality-aware machine learning tasks. Limitations of current evaluation procedures are also discussed. We then examine popular causal inference tools and packages, and conclude with the primary challenges and opportunities for benchmarking causal learning algorithms in the era of big data. The survey seeks to bring to the forefront the urgency of developing publicly available benchmarks and consensus-building standards for causal learning evaluation with observational data. In doing so, we hope to broaden the discussion and facilitate collaboration to advance the innovation and application of causal learning.