Neural Feature-Adaptation for Symbolic Predictions Using Pre-Training and Semantic Loss

We are interested in neurosymbolic systems consisting of a high-level symbolic layer for explainable prediction in terms of human-intelligible concepts, and a low-level neural layer for extracting the symbols required to generate the symbolic explanation. Real data is often imperfect, which means that even if the symbolic theory remains unchanged, we may still need to address the problem of mapping raw data to high-level symbols each time there is a change in the data acquisition environment or equipment. Manual (re-)annotation of the raw data each time this happens is laborious and expensive, and automated labelling methods are often imperfect, especially for complex problems. NEUROLOG proposed the use of a semantic loss function that allows an existing feature-based symbolic model to guide the extraction of feature-values from raw data, using 'abduction'. However, the experiments demonstrating the use of semantic loss through abduction appear to rely heavily on a domain-specific pre-processing step that enables a prior delineation of feature locations in the raw data. We examine the use of semantic loss in domains where such pre-processing is not possible, or is not obvious. We show that without any prior information about the features, the NEUROLOG approach can continue to predict accurately even with substantially incorrect feature predictions. We also show that prior information about the features, in the form of even imperfect pre-training, can help correct this situation. These findings are replicated on the original problem considered by NEUROLOG, without the use of feature-delineation. This suggests that symbolic explanations constructed for data in a domain could be re-used in a related domain, by 'feature-adaptation' of pre-trained neural extractors using the semantic loss function constrained by abductive feedback.
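
To make the idea of a semantic loss guided by abduction concrete, the following is a minimal sketch rather than the NEUROLOG implementation. It assumes a hypothetical toy theory (the label is the sum of two features, each taking values in {0, 1, 2}), and the names `abduce` and `semantic_loss` are illustrative. The loss is taken to be the negative log of the total probability that the neural outputs assign to the feature-value combinations abduced from the label.

```python
import math

# Hypothetical toy theory (an assumption, not from the paper):
# the label is the sum of two symbolic features a and b.
DOMAIN = (0, 1, 2)

def abduce(label, domain=DOMAIN):
    """Return every feature assignment (a, b) that explains the label."""
    return [(a, b) for a in domain for b in domain if a + b == label]

def semantic_loss(feature_probs, abduced_assignments, eps=1e-12):
    """Negative log-probability that the neural outputs satisfy the theory.

    feature_probs: one dict per feature, mapping value -> predicted probability.
    abduced_assignments: assignments consistent with the observed label.
    """
    total = 0.0
    for assignment in abduced_assignments:
        p = 1.0
        for probs, value in zip(feature_probs, assignment):
            p *= probs[value]  # features treated as independent
        total += p             # distinct assignments are mutually exclusive
    return -math.log(max(total, eps))

# Suppose the true features are (1, 1), so the label is 2.
explanations = abduce(2)  # [(0, 2), (1, 1), (2, 0)]

uniform = [{v: 1 / 3 for v in DOMAIN}] * 2
correct = [{0: 0.0, 1: 1.0, 2: 0.0}, {0: 0.0, 1: 1.0, 2: 0.0}]
wrong = [{0: 1.0, 1: 0.0, 2: 0.0}, {0: 0.0, 1: 0.0, 2: 1.0}]  # predicts (0, 2)

loss_uniform = semantic_loss(uniform, explanations)
loss_correct = semantic_loss(correct, explanations)
loss_wrong = semantic_loss(wrong, explanations)
```

In this sketch `loss_wrong` is as low as `loss_correct`, because the incorrect features (0, 2) still explain the label. This is one way to see why a semantic loss alone may leave feature predictions substantially incorrect while the label prediction stays accurate, and why prior information such as pre-training could help select among the abduced explanations.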
