Neuro-symbolic predictors learn a mapping from sub-symbolic inputs to higher-level concepts and then carry out (probabilistic) logical inference on this intermediate representation. This setup offers clear advantages in terms of consistency with symbolic prior knowledge, and is often believed to provide interpretability benefits: because the learned concepts comply with the knowledge, they are expected to be easier for human stakeholders to understand. However, it was recently shown that this setup is affected by reasoning shortcuts, whereby predictions attain high accuracy by leveraging concepts with unintended semantics, yielding poor out-of-distribution performance and compromising interpretability. In this short paper, we establish a formal link between reasoning shortcuts and the optima of the loss function, and identify situations in which reasoning shortcuts can arise. Based on this, we discuss the limitations of natural mitigation strategies such as reconstruction and concept supervision.
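To make the notion of a reasoning shortcut concrete, the following minimal sketch enumerates concept mappings that predict every label correctly while assigning unintended meanings to the concepts. It is an illustration only: the toy task (two binary concepts with the rule y = c1 XOR c2), the restriction to deterministic mappings, and all names in the code are assumptions, not the setup analyzed in the paper.

```python
from itertools import product

# Hypothetical toy task: two binary ground-truth concepts (c1, c2),
# with prior knowledge stating that the label is y = c1 XOR c2.
CONCEPTS = list(product([0, 1], repeat=2))


def knowledge(c):
    """Apply the (assumed) symbolic rule to a concept vector."""
    return c[0] ^ c[1]


def label_consistent_maps():
    """Enumerate deterministic maps from true concepts to predicted
    concepts whose predicted label is correct for every input."""
    maps = []
    for images in product(CONCEPTS, repeat=len(CONCEPTS)):
        alpha = dict(zip(CONCEPTS, images))
        if all(knowledge(alpha[g]) == knowledge(g) for g in CONCEPTS):
            maps.append(alpha)
    return maps


maps = label_consistent_maps()
identity = {g: g for g in CONCEPTS}

# Every label-consistent map other than the identity is a shortcut:
# perfect label accuracy, but concepts with unintended semantics.
shortcuts = [m for m in maps if m != identity]

# One such shortcut flips both bits, e.g. (0, 1) is read as (1, 0).
flip = {g: (1 - g[0], 1 - g[1]) for g in CONCEPTS}

print(len(maps), len(shortcuts), flip in shortcuts)
```

In this toy setting, label accuracy alone cannot distinguish the intended concept mapping from the alternatives, which is the kind of ambiguity that mitigation strategies such as concept supervision aim to remove.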