Although spoken language understanding (SLU) has achieved great success in high-resource languages such as English, it remains challenging in low-resource languages, mainly due to the lack of high-quality training data. The recent multilingual code-switching approach samples words in an input utterance and replaces them with expressions of the same meaning in other languages, thereby achieving better alignment of representations across languages for zero-shot cross-lingual SLU. Surprisingly, all existing multilingual code-switching methods disregard the inherent semantic structure of SLU: most utterances contain one or more slots, and each slot consists of one or more words. In this paper, we propose to exploit this "utterance-slot-word" structure of SLU and systematically model it with a multi-level contrastive learning framework operating at the utterance, slot, and word levels. We develop novel code-switching schemes to generate hard negative examples for contrastive learning at all three levels. Furthermore, we develop a label-aware joint model that leverages label semantics for cross-lingual knowledge transfer. Experimental results show that our proposed methods significantly outperform strong baselines on two zero-shot cross-lingual SLU benchmark datasets.
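To make the code-switching idea concrete, the following is a minimal sketch of word-level multilingual code-switching augmentation: sample positions in an utterance and replace each sampled word with a same-meaning word from another language. The LEXICONS table, the replacement ratio, and the function name code_switch are illustrative assumptions for exposition, not the paper's actual implementation.

```python
import random

# Hypothetical word-level bilingual lexicons (e.g., built from public
# bilingual dictionaries); each maps an English word to its translation.
LEXICONS = {
    "de": {"weather": "wetter", "tomorrow": "morgen"},
    "es": {"weather": "tiempo", "tomorrow": "mañana"},
}

def code_switch(tokens, ratio=0.5, languages=("de", "es")):
    """Replace a sampled subset of tokens with same-meaning words
    from other languages (word-level multilingual code-switching)."""
    switched = list(tokens)
    # Sample which positions to replace; ratio controls how many.
    n_replace = max(1, int(len(tokens) * ratio))
    for i in random.sample(range(len(tokens)), n_replace):
        lang = random.choice(languages)
        translation = LEXICONS[lang].get(tokens[i].lower())
        if translation is not None:  # keep the original word if no entry
            switched[i] = translation
    return switched

if __name__ == "__main__":
    utterance = ["what", "is", "the", "weather", "tomorrow"]
    print(code_switch(utterance))
    # e.g. ['what', 'is', 'the', 'wetter', 'mañana']
```

In a contrastive setup along these lines, such code-switched utterances serve as positives for the original utterance, while corrupted replacements (e.g., translations of the wrong word) can serve as hard negatives.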