Recent work in computer vision and cognitive reasoning has led to growing adoption of the Violation-of-Expectation (VoE) paradigm in synthetic datasets. Inspired by infant psychology, researchers now evaluate a model's ability to label scenes as either expected or surprising when it has knowledge of only expected scenes. However, existing VoE-based 3D datasets for physical reasoning provide mainly vision data with little to no heuristics or inductive biases. Cognitive models of physical reasoning indicate that infants create high-level abstract representations of objects and interactions. Capitalizing on this knowledge, we established a benchmark for studying physical reasoning by curating a novel large-scale synthetic 3D VoE dataset equipped with ground-truth heuristic labels of causally relevant features and rules. To validate our dataset across five event categories of physical reasoning, we benchmarked and analyzed human performance. We also proposed the Object File Physical Reasoning Network (OFPR-Net), which exploits the dataset's novel heuristics to outperform our baseline and ablation models. OFPR-Net is also flexible enough to learn an alternate physical reality, showcasing its ability to learn universal causal relationships in physical reasoning, a step toward systems with better interpretability.