Multi-domain text classification automatically classifies texts from a variety of scenarios. Due to the diversity of human language, texts with the same label in different domains may differ greatly, which poses a challenge for multi-domain text classification. Current state-of-the-art methods adopt the private-shared paradigm: a shared encoder captures domain-shared features, while a private encoder is trained for each domain to extract domain-specific features. However, in realistic scenarios these methods are inefficient, since new domains emerge constantly and each requires its own private encoder. In this paper, we propose a robust contrastive alignment method that aligns the text classification features of all domains in the same feature space through supervised contrastive learning. With this approach, we need only two universal feature extractors to perform multi-domain text classification. Extensive experimental results show that our method performs on par with, and sometimes better than, the state-of-the-art method, which uses complex multi-classifiers in a private-shared framework.
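To make the alignment mechanism concrete, below is a minimal PyTorch sketch of a standard supervised contrastive loss of the kind the abstract describes, where same-label samples attract one another regardless of domain. The function name, temperature default, and masking details are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(features: torch.Tensor,
                                labels: torch.Tensor,
                                temperature: float = 0.1) -> torch.Tensor:
    """Pull same-label embeddings together and push different-label ones apart.

    Domain identity is deliberately ignored, so same-label samples from
    different domains act as positives and get aligned in one feature space.
    """
    z = F.normalize(features, dim=1)        # unit-norm embeddings, shape (N, d)
    sim = z @ z.t() / temperature           # pairwise similarities, shape (N, N)
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, -1e9)  # exclude self-pairs from the softmax
    # Positives: same class label, excluding the anchor itself.
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos_per_anchor = pos_mask.sum(dim=1)
    # Average log-likelihood over each anchor's positives; anchors with no
    # in-batch positive are dropped from the mean.
    loss = -(log_prob.masked_fill(~pos_mask, 0.0)).sum(dim=1) \
           / pos_per_anchor.clamp(min=1)
    return loss[pos_per_anchor > 0].mean()
```

In training, a loss like this would presumably be combined with an ordinary classification loss, so the two universal feature extractors learn representations that are both discriminative and domain-aligned.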