Pre-trained seq2seq models excel at graph semantic parsing when rich annotated data are available, but they generalize worse to out-of-distribution (OOD) and long-tail examples. In comparison, symbolic parsers underperform on population-level metrics but exhibit unique strength in OOD and tail generalization. In this work, we study a compositionality-aware approach to neural-symbolic inference informed by model confidence: it performs fine-grained neural-symbolic reasoning at the subgraph level (i.e., nodes and edges) and precisely targets the subgraph components on which the neural parser is highly uncertain. As a result, the method combines the distinct strengths of the neural and symbolic approaches in capturing different aspects of the graph prediction, leading to well-rounded generalization performance both across domains and in the tail. We empirically investigate the approach on the English Resource Grammar (ERG) parsing problem, using a diverse suite of standard in-domain and seven OOD corpora. Our approach leads to 35.26% and 35.60% error reduction in aggregated Smatch score over the neural and symbolic approaches, respectively, and a 14% absolute accuracy gain in key tail linguistic categories over the neural model, outperforming prior state-of-the-art methods that do not account for compositionality or uncertainty.
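To make the idea of confidence-informed inference at the subgraph level concrete, the following is a minimal sketch and not the procedure evaluated in this work. It assumes a simple threshold rule: each node or edge predicted by the neural parser carries a confidence score, and a component whose confidence falls below the threshold is replaced by the symbolic parser's prediction for the same component when one exists. The names `Prediction`, `merge_subgraph_predictions`, the `threshold` value, and the example labels are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Hashable


@dataclass(frozen=True)
class Prediction:
    """A neural prediction for one subgraph component (hypothetical type)."""
    label: str
    confidence: float  # assumed to be a probability in [0, 1]


def merge_subgraph_predictions(
    neural: Dict[Hashable, Prediction],
    symbolic: Dict[Hashable, str],
    threshold: float = 0.9,  # assumed cut-off, not a value from the paper
) -> Dict[Hashable, str]:
    """Keep confident neural labels; fall back to symbolic labels otherwise.

    Keys identify a component: for example a token span for a node, or a
    (source span, target span) pair for an edge.
    """
    merged: Dict[Hashable, str] = {}
    for key, pred in neural.items():
        fallback = symbolic.get(key)
        if pred.confidence < threshold and fallback is not None:
            # The neural parser is uncertain here, so defer to the symbolic parser.
            merged[key] = fallback
        else:
            # Either the neural parser is confident or no symbolic label exists.
            merged[key] = pred.label
    return merged


# Illustrative inputs: two nodes keyed by token span, one edge keyed by a span pair.
neural_out = {
    (0, 3): Prediction("_dog_n_1", 0.98),
    (4, 9): Prediction("_bark_v_1", 0.41),
    ((4, 9), (0, 3)): Prediction("ARG1", 0.95),
}
symbolic_out = {
    (4, 9): "_bark_v_at",
    ((4, 9), (0, 3)): "ARG2",
}
merged = merge_subgraph_predictions(neural_out, symbolic_out)
```

In this sketch only the low-confidence node at span `(4, 9)` takes the symbolic label; the confident node and edge keep their neural labels, so the symbolic parser is consulted only where the neural parser is uncertain.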