The dependency tree of a natural language sentence can capture the interactions between its semantics and its words. However, it is unclear whether methods that exploit such dependency information for semantic parsing can be combined to achieve further improvement, or how these methods relate to one another when combined. In this paper, we examine three methods for incorporating such dependency information into a Transformer-based semantic parser and empirically study their combinations. We first replace the standard self-attention heads in the encoder with parent-scaled self-attention (PASCAL) heads, i.e., heads that can attend to the dependency parent of each token. Second, we concatenate syntax-aware word representations (SAWRs), i.e., the intermediate hidden representations of a neural dependency parser, with ordinary word embeddings to enhance the encoder. Finally, we insert the constituent attention (CA) module into the encoder, which adds an extra constraint on the attention heads so that they better capture the inherent dependency structure of input sentences. Transductive ensemble learning (TEL) is used for model aggregation, and an ablation study is conducted to show the contribution of each method. Our experiments show that CA is complementary to PASCAL or SAWRs, and that PASCAL + CA provides state-of-the-art performance among neural approaches on ATIS, GEO, and JOBS.
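To make the PASCAL mechanism concrete, the following is a minimal single-head sketch in PyTorch of parent-scaled self-attention, in which each token's attention distribution is reweighted by a Gaussian centred on the position of its dependency parent. The function name, tensor shapes, and the `sigma` hyper-parameter are illustrative assumptions for this sketch, not the paper's exact implementation.

```python
import math
import torch
import torch.nn.functional as F

def parent_scaled_attention(q, k, v, parent_pos, sigma=1.0):
    """Sketch of a single parent-scaled self-attention (PASCAL) head.

    q, k, v:     (batch, seq_len, d) query / key / value tensors
    parent_pos:  (batch, seq_len) index of each token's dependency parent
    sigma:       width of the Gaussian centred on the parent position
                 (an assumed hyper-parameter in this sketch)
    """
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d)   # (B, L, L) scaled dot products
    attn = F.softmax(scores, dim=-1)

    # Gaussian weight over key positions j, centred at token i's parent position
    positions = torch.arange(k.size(1), device=k.device).view(1, 1, -1)  # (1, 1, L)
    centre = parent_pos.unsqueeze(-1).float()                            # (B, L, 1)
    parent_weight = torch.exp(-((positions - centre) ** 2) / (2 * sigma ** 2))

    # Scale the attention distribution towards each token's parent and renormalise
    attn = attn * parent_weight
    attn = attn / attn.sum(dim=-1, keepdim=True).clamp(min=1e-9)
    return attn @ v
```

In the same spirit, the SAWR variant simply concatenates the dependency parser's intermediate hidden states with the ordinary word embeddings before they enter the encoder, so no change to the attention computation itself is required.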