As an extensively studied task in natural language processing (NLP), aspect-based sentiment analysis (ABSA) predicts the sentiment expressed in a text toward a given aspect. Unfortunately, most languages lack sufficient annotated resources, so recent work has increasingly focused on cross-lingual aspect-based sentiment analysis (XABSA). However, most recent studies concentrate only on cross-lingual data alignment rather than model alignment. To this end, we propose a novel framework, CL-XABSA: Contrastive Learning for Cross-lingual Aspect-Based Sentiment Analysis. Based on contrastive learning, we reduce the distance between samples with the same label across different semantic spaces, thereby aligning the semantic spaces of different languages. Specifically, we design two contrastive strategies, token-level contrastive learning of token embeddings (TL-CTE) and sentiment-level contrastive learning of token embeddings (SL-CTE), to regularize the semantic spaces of the source and target languages to be more uniform. Since it can receive datasets in multiple languages during training, our framework can be adapted not only to the XABSA task but also to multilingual aspect-based sentiment analysis (MABSA). To further improve performance, we apply knowledge distillation, leveraging unlabeled data in the target language. For the distillation XABSA task, we further explore the comparative effectiveness of different data sources (the source-language dataset, a translated dataset, and a code-switched dataset). The results demonstrate that the proposed method yields improvements on all three tasks: XABSA, distillation XABSA, and MABSA. For reproducibility, our code for this paper is available at https://github.com/GKLMIP/CL-XABSA.
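To make the core idea concrete, the following is a minimal sketch (not the paper's actual implementation) of a supervised contrastive objective over token embeddings: tokens sharing a label, regardless of which language they come from, are treated as positives and pulled together, while all other tokens serve as negatives. The function name, the NumPy setting, and the temperature value are illustrative assumptions.

```python
import numpy as np

def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Hypothetical sketch of a TL-CTE-style loss: token embeddings with the
    same label (possibly from different languages) are pulled together.

    embeddings: (n, d) array of token embeddings from source + target sentences
    labels:     length-n list of token labels (e.g. BIO sentiment tags)
    """
    # L2-normalize so the dot product is cosine similarity
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T / temperature  # pairwise similarity logits

    n = len(labels)
    loss, count = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        # log of the softmax denominator over all tokens except the anchor
        others = np.delete(sim[i], i)
        log_denom = np.log(np.exp(others).sum())
        for j in positives:
            # -log p(positive j | anchor i); small when same-label pairs are close
            loss += -(sim[i, j] - log_denom)
            count += 1
    return loss / count
```

In CL-XABSA this kind of objective is combined with the standard ABSA tagging loss, so that the contrastive term regularizes the shared multilingual embedding space while the task loss drives label prediction.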