Despite the great success of spoken language understanding (SLU) in high-resource languages, it remains challenging in low-resource languages, mainly due to the lack of labeled training data. The recent multilingual code-switching approach achieves better alignment of model representations across languages by constructing mixed-language contexts for zero-shot cross-lingual SLU. However, current code-switching methods are limited to implicit alignment and ignore the inherent semantic structure of SLU, i.e., the hierarchical inclusion of utterances, slots, and words. In this paper, we propose to model the utterance-slot-word structure with a multi-level contrastive learning framework that operates at the utterance, slot, and word levels to facilitate explicit alignment. Novel code-switching schemes are introduced to generate hard negative examples for our contrastive learning framework. Furthermore, we develop a label-aware joint model that leverages label semantics to enhance the implicit alignment and feeds into contrastive learning. Experimental results show that our proposed methods significantly outperform strong baselines on two zero-shot cross-lingual SLU benchmark datasets.
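As a rough illustration of the multi-level objective summarized above (our own generic notation, not the paper's exact formulation), an InfoNCE-style contrastive loss can be applied at each granularity and summed; the weights $\lambda_{\ell}$, temperature $\tau$, and cosine similarity are assumptions for the sketch:

\[
\mathcal{L}_{\ell} = -\log \frac{\exp\!\big(\mathrm{sim}(h_{\ell}, h_{\ell}^{+})/\tau\big)}{\exp\!\big(\mathrm{sim}(h_{\ell}, h_{\ell}^{+})/\tau\big) + \sum_{h^{-}\in\mathcal{N}_{\ell}} \exp\!\big(\mathrm{sim}(h_{\ell}, h^{-})/\tau\big)},
\qquad
\mathcal{L}_{\mathrm{CL}} = \sum_{\ell \in \{\text{utterance},\,\text{slot},\,\text{word}\}} \lambda_{\ell}\, \mathcal{L}_{\ell},
\]

where $h_{\ell}$ is the anchor representation at level $\ell$, $h_{\ell}^{+}$ its code-switched positive, and $\mathcal{N}_{\ell}$ the set of hard negatives produced by the code-switching schemes.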