Natural language understanding (NLU) has two core tasks: intent classification and slot filling. The success of pre-trained language models has led to significant breakthroughs in both tasks. One promising solution, BERT, can optimize the two tasks jointly. We note that BERT-based models split each complex token into multiple sub-tokens using the WordPiece algorithm, which creates a mismatch between the length of the token sequence and the length of the label sequence. As a result, BERT-based models underperform at label prediction, which limits further performance improvement. Many existing models can work around this issue, but some hidden semantic information is discarded during fine-tuning. We address the problem by introducing a novel joint method on top of BERT that explicitly models the features of the multiple sub-tokens produced by WordPiece tokenization, thereby benefiting both tasks. Our method extracts contextual features from complex tokens with the proposed sub-words attention adapter (SAA), which preserves overall utterance information. Additionally, we propose an intent attention adapter (IAA) that obtains full-sentence features to aid intent prediction. Experimental results show that our proposed model achieves significant improvements on two public benchmark datasets. In particular, the slot filling F1 score improves from 96.1 to 98.2 (2.1 points absolute) on the Airline Travel Information Systems (ATIS) dataset.
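The token/label length mismatch described in the abstract can be illustrated with a minimal sketch. It assumes a toy vocabulary (`VOCAB`), a simplified greedy longest-match-first splitter (`wordpiece`), and illustrative ATIS-style slot labels; none of these names come from the paper, and real BERT vocabularies are far larger. The sketch also shows one common workaround, labeling only the first sub-token of each word, which ignores the remaining sub-tokens. It does not implement the proposed SAA or IAA.

```python
from typing import List, Tuple

# Hypothetical toy vocabulary (an assumption for illustration only).
VOCAB = {"flights", "from", "boston", "to", "bal", "##ti", "##more", "[UNK]"}


def wordpiece(word: str) -> List[str]:
    """Split one word into sub-tokens by greedy longest-match-first lookup."""
    pieces: List[str] = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation marker
            if candidate in VOCAB:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return ["[UNK]"]  # no sub-token matched at this position
        pieces.append(piece)
        start = end
    return pieces


def tokenize_with_alignment(words: List[str]) -> Tuple[List[str], List[int]]:
    """Return all sub-tokens plus the index of each word's first sub-token."""
    sub_tokens: List[str] = []
    first_index: List[int] = []
    for word in words:
        first_index.append(len(sub_tokens))
        sub_tokens.extend(wordpiece(word))
    return sub_tokens, first_index


def expand_labels(labels: List[str], first_index: List[int], n_sub: int) -> List[str]:
    """First-sub-token workaround: continuation positions get a placeholder 'X'."""
    expanded = ["X"] * n_sub
    for label, idx in zip(labels, first_index):
        expanded[idx] = label
    return expanded


words = ["flights", "from", "boston", "to", "baltimore"]
labels = ["O", "O", "B-fromloc.city_name", "O", "B-toloc.city_name"]

sub_tokens, first_index = tokenize_with_alignment(words)
expanded = expand_labels(labels, first_index, len(sub_tokens))

print(len(words), len(sub_tokens))  # 5 words become 7 sub-tokens
print(list(zip(sub_tokens, expanded)))
```

Here "baltimore" becomes `bal`, `##ti`, `##more`, so five labels must be matched against seven sub-tokens. Under the first-sub-token workaround, the positions marked `X` are skipped during label prediction, which is one way the information carried by later sub-tokens can go unused.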