Causal induction, i.e., identifying unobservable mechanisms that lead to the observable relations among variables, has played a pivotal role in modern scientific discovery, especially in scenarios with only sparse and limited data. Humans, even young toddlers, can induce causal relationships surprisingly well in various settings despite the notorious difficulty of the task. In contrast to this commonplace trait of human cognition, however, modern Artificial Intelligence (AI) systems lack a diagnostic benchmark for measuring causal induction. Therefore, in this work, we introduce the Abstract Causal REasoning (ACRE) dataset for systematic evaluation of current vision systems in causal induction. Motivated by the stream of research on causal discovery in Blicket experiments, we query a visual reasoning system with four types of questions in either an independent scenario or an interventional scenario: direct, indirect, screening-off, and backward-blocking, intentionally going beyond the simple strategy of inducing causal relationships by covariation. By analyzing visual reasoning architectures on this testbed, we notice that pure neural models tend toward an associative strategy, as reflected in their chance-level performance, whereas neuro-symbolic combinations struggle with backward-blocking reasoning. These deficiencies call for future research on models with a more comprehensive capability for causal induction.
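To make the four question types concrete, here is a minimal, hypothetical sketch of a Blicket-style setup (not the ACRE implementation): a detector activates whenever at least one "blicket" object is placed on it, the reasoner observes a few context trials, and each query type probes a different inference about object causality. All object names and the `detector_on` helper are illustrative assumptions.

```python
def detector_on(objects, blickets):
    """A Blicket detector: lights up iff any object on it is a blicket."""
    return any(o in blickets for o in objects)

# Ground truth, hidden from the reasoner: A is a blicket, B is not.
blickets = {"A"}

# Context trials the reasoner observes (objects placed, detector state):
trials = [
    ({"A"}, detector_on({"A"}, blickets)),          # A alone -> on
    ({"A", "B"}, detector_on({"A", "B"}, blickets)),  # A and B -> on
]

# The four query types, informally:
# direct:            A alone was seen to activate the detector.
# indirect:          B alone was never observed; its effect must be inferred.
# screening-off:     with A present, B's presence does not change the outcome.
# backward-blocking: {A, B} lit the detector, but {A} alone also did, so the
#                    evidence for B is "explained away" -- pure covariation
#                    (B co-occurred with activation) gives the wrong answer.
print(detector_on({"A"}, blickets))  # True
print(detector_on({"B"}, blickets))  # False
```

The backward-blocking case is the one the abstract highlights: a model that scores objects by how often they co-occur with activation would wrongly credit B.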