Task-oriented compositional semantic parsing (TCSP) handles complex nested user queries and serves as an essential component of virtual assistants. Current TCSP models rely on large amounts of training data to achieve decent performance but fail to generalize to low-resource target languages or domains. In this paper, we present X2Parser, a transferable Cross-lingual and Cross-domain Parser for TCSP. Unlike previous models that learn to generate hierarchical representations for nested intents and slots, we propose to predict flattened intent and slot representations separately and to cast both prediction tasks as sequence labeling problems. We further propose a fertility-based slot predictor that first learns to dynamically detect the number of labels for each token and then predicts the slot types. Experimental results illustrate that our model significantly outperforms existing strong baselines in cross-lingual and cross-domain settings, and also achieves good generalization to target languages in target domains. Furthermore, our model tackles the problem in an efficient non-autoregressive way that reduces latency by up to 66% compared to the generative model.
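The fertility-based, non-autoregressive decoding described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the label set, dimensions, and the simple linear classifiers are hypothetical stand-ins for the model's actual encoder and prediction heads.

```python
import numpy as np

# Hypothetical toy setup (not from the paper): small hidden size,
# a tiny slot label set, and a cap on per-token fertility.
HIDDEN = 8
SLOT_LABELS = ["O", "B-location", "B-datetime"]
MAX_FERTILITY = 3  # each token may emit 0..3 slot labels

rng = np.random.default_rng(0)
W_fert = rng.normal(size=(HIDDEN, MAX_FERTILITY + 1))  # fertility classifier
W_slot = rng.normal(size=(HIDDEN, len(SLOT_LABELS)))   # slot-type classifier

def fertility_slot_decode(hidden_states):
    """Fertility-based slot prediction in three non-autoregressive steps:
    1) predict how many slot labels each token emits (its fertility),
    2) repeat each token's representation that many times,
    3) classify every copy into a slot type, all copies in parallel."""
    fertility = np.argmax(hidden_states @ W_fert, axis=-1)   # (T,)
    expanded = np.repeat(hidden_states, fertility, axis=0)   # (sum(fertility), H)
    slot_ids = np.argmax(expanded @ W_slot, axis=-1)         # one label per copy
    return fertility, [SLOT_LABELS[i] for i in slot_ids]

# Stand-in for encoder outputs over a 5-token query.
tokens = rng.normal(size=(5, HIDDEN))
fertility, labels = fertility_slot_decode(tokens)
```

Because every token's fertility and every expanded position's label are predicted independently, the whole slot sequence comes out in a constant number of forward passes rather than one decoding step per output token, which is the source of the latency reduction over a generative (autoregressive) parser.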