Semantic parsing plays a key role in digital voice assistants such as Alexa, Siri, and Google Assistant by mapping natural language to structured meaning representations. When we want to extend a voice assistant with a new domain, the underlying semantic parsing model must be retrained on thousands of annotated examples from that domain, which is time-consuming and expensive. In this work, we present an architecture that performs such domain adaptation automatically, using only a small amount of metadata about the new domain and either no new training data (zero-shot) or very few examples (few-shot). We use a base seq2seq (sequence-to-sequence) architecture and augment it with a concept encoder that encodes intent and slot tags from the new domain. We also introduce a novel decoder-focused approach that pretrains seq2seq models to be concept-aware using Wikidata, helping our model learn important concepts and perform well in low-resource settings. We report few-shot and zero-shot results for compositional semantic parsing on the TOPv2 dataset and show that our model outperforms prior approaches in few-shot settings on both the TOPv2 and SNIPS datasets.
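To illustrate the core idea of a concept encoder, the toy sketch below encodes intent and slot tags (e.g., `IN:GET_WEATHER`, `SL:LOCATION`) into vectors and lets a decoder state select the best-matching tag via dot-product attention. This is a minimal, hypothetical illustration, not the paper's implementation: the hash-based `embed` function stands in for a learned encoder, and the tag names are invented examples in the TOPv2 style.

```python
import hashlib


def embed(text, dim=8):
    """Toy deterministic word embedding: hash each token into a dim-vector,
    then average. A stand-in for a learned concept encoder (hypothetical)."""
    vecs = []
    for word in text.lower().replace("_", " ").replace(":", " ").split():
        digest = hashlib.md5(word.encode()).digest()
        vecs.append([b / 255.0 for b in digest[:dim]])
    # Average the per-token vectors component-wise.
    return [sum(component) / len(vecs) for component in zip(*vecs)]


def best_concept(decoder_state, concept_tags):
    """Score each concept tag's encoding against the decoder state
    (dot-product attention) and return the highest-scoring tag.
    In a real model these scores would feed a softmax over an output
    vocabulary that includes pointers to the new domain's tags."""
    scores = {
        tag: sum(a * b for a, b in zip(decoder_state, embed(tag)))
        for tag in concept_tags
    }
    return max(scores, key=scores.get)
```

Because new-domain tags are encoded from their names rather than tied to fixed output-vocabulary slots, a decoder built this way can, in principle, emit tags it never saw during training, which is the property that enables zero-shot and few-shot domain adaptation.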