Neural Machine Translation (NMT) models from English to SPARQL are a promising development for SPARQL query generation. However, current architectures are unable to integrate the knowledge base (KB) schema or to handle questions about knowledge resources, classes, and properties unseen during training, rendering them unusable outside the scope of topics covered in the training set. Inspired by the performance gains that copy mechanisms have yielded in other natural language processing tasks, we propose to integrate a copy mechanism into neural SPARQL query generation as a way to tackle this issue. We illustrate our proposal by adding a copy layer and a dynamic knowledge base vocabulary to two Seq2Seq architectures (CNNs and Transformers). This layer lets the models copy KB elements directly from the questions, instead of generating them. We evaluate our approach on state-of-the-art datasets, including datasets that reference KB elements unseen during training, and measure the accuracy of the copy-augmented architectures. Our results show a considerable increase in performance on all datasets compared to non-copy architectures.
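To make the copy layer concrete, here is a minimal pointer-generator-style sketch in PyTorch. It is not the paper's actual implementation: the class name `CopyLayer`, the parameter names (`hidden_dim`, `extended_vocab_size`, etc.), and the exact gating scheme are all illustrative assumptions; the paper's dynamic KB vocabulary is modeled here simply as extra ids appended to the fixed SPARQL output vocabulary.

```python
# Hypothetical sketch of a copy layer for Seq2Seq SPARQL generation.
# Assumes a decoder that exposes, at each step, its hidden state and
# attention weights over the source question tokens.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CopyLayer(nn.Module):
    """Mixes a generation distribution over the fixed SPARQL vocabulary
    with a copy distribution over source-question tokens (e.g. KB
    elements appearing in the question), pointer-generator style."""

    def __init__(self, hidden_dim: int, vocab_size: int):
        super().__init__()
        self.generate = nn.Linear(hidden_dim, vocab_size)  # scores over fixed vocab
        self.copy_gate = nn.Linear(hidden_dim, 1)          # scalar copy/generate switch

    def forward(self, dec_hidden, attn_weights, src_token_ids, extended_vocab_size):
        # dec_hidden:    (batch, hidden_dim)  decoder state at this step
        # attn_weights:  (batch, src_len)     attention over source tokens
        # src_token_ids: (batch, src_len)     source ids in the extended vocabulary
        #                (fixed SPARQL vocab + dynamic KB entries for this question)
        p_copy = torch.sigmoid(self.copy_gate(dec_hidden))        # (batch, 1)
        gen_dist = F.softmax(self.generate(dec_hidden), dim=-1)   # (batch, vocab_size)

        # Pad the generation distribution up to the extended vocabulary,
        # then scatter-add the copy probabilities onto the source token ids.
        batch = dec_hidden.size(0)
        out = torch.zeros(batch, extended_vocab_size, device=dec_hidden.device)
        out[:, : gen_dist.size(1)] = (1.0 - p_copy) * gen_dist
        out.scatter_add_(1, src_token_ids, p_copy * attn_weights)
        return out  # (batch, extended_vocab_size) final next-token distribution
```

Under this sketch, a KB element absent from the training vocabulary can still be emitted, because its dynamic id in the extended vocabulary receives probability mass directly from the attention over the question.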