Using the shared-private paradigm and adversarial training has significantly improved the performance of multi-domain text classification (MDTC) models. However, existing methods suffer from two issues. First, instances from multiple domains are insufficient for domain-invariant feature extraction. Second, aligning only the marginal distributions may lead to fatal mismatching. In this paper, we propose a mixup regularized adversarial network (MRAN) to address these two issues. More specifically, domain and category mixup regularizations are introduced to enrich the intrinsic features in the shared latent space and to enforce consistent predictions in between training instances, such that the learned features are more domain-invariant and discriminative. We conduct experiments on two benchmarks: the Amazon review dataset and the FDU-MTL dataset. Our approach yields average accuracies of 87.64\% and 89.0\% on these two datasets respectively, outperforming all relevant baselines.
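The mixup regularization the abstract refers to linearly interpolates pairs of training instances and their labels. A minimal sketch, assuming NumPy feature vectors, one-hot labels, and the standard Beta(alpha, alpha) mixing coefficient (the function name and parameters here are illustrative, not the paper's exact implementation):

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2, rng=None):
    """Interpolate a pair of instances and their one-hot labels.

    lam is drawn from Beta(alpha, alpha); the mixed pair is
    lam * (x1, y1) + (1 - lam) * (x2, y2).
    """
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    x_mix = lam * x1 + (1.0 - lam) * x2
    y_mix = lam * y1 + (1.0 - lam) * y2
    return x_mix, y_mix

# Example: mix an instance from one domain with one from another
# (domain mixup), so the model also sees in-between points of the
# shared latent space during adversarial training.
xa, ya = np.array([1.0, 0.0]), np.array([1.0, 0.0])  # e.g. positive review
xb, yb = np.array([0.0, 1.0]), np.array([0.0, 1.0])  # e.g. negative review
x_mix, y_mix = mixup(xa, ya, xb, yb)
```

Because the label is mixed with the same coefficient as the input, enforcing consistent predictions on `x_mix` pushes the classifier toward linear behavior between training instances, which is the consistency effect described above.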