Predicting the political polarity of news headlines is a challenging task that becomes even more challenging in a multilingual setting with low-resource languages. To address this, we propose a learning framework that utilises inferential commonsense knowledge via a Translate-Retrieve-Translate strategy. First, we acquire inferential knowledge in the target language through translation and retrieval. We then employ an attention mechanism to emphasise important inferences. Finally, we integrate the attended inferences into a multilingual pre-trained language model for the bias prediction task. To evaluate the effectiveness of our framework, we present a dataset of over 62.6K multilingual news headlines in five European languages annotated with their respective political polarities. We evaluate several state-of-the-art multilingual pre-trained language models, since their performance tends to vary across languages (low/high resource). Evaluation results demonstrate that our proposed framework is effective regardless of the model employed. Overall, the best-performing model trained on headlines alone achieves 0.90 accuracy and F1, and a 0.83 Jaccard score. With the attended knowledge in our framework, the same model shows an increase of 2.2% in accuracy and F1, and 3.6% in Jaccard score. Extending our experiments to individual languages reveals that the models we analyse perform significantly worse on Slovenian than on the other languages in our dataset. To investigate this, we assess the effect of translation quality on prediction performance; the results indicate that the disparity is most likely due to poor translation quality. We release our dataset and scripts for future research at: https://github.com/Swati17293/KG-Multi-Bias. Our framework has the potential to benefit journalists, social scientists, news producers, and consumers.
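To make the knowledge-integration step concrete, the following is a minimal sketch (not the authors' released code) of how retrieved inferences could be attended over and fused with a headline encoding for polarity prediction. The class name, tensor shapes, and the three-class label space are hypothetical; the headline and inference embeddings are assumed to come from a multilingual pre-trained encoder after the Translate-Retrieve-Translate step.

```python
# Minimal sketch, assuming headline and inference embeddings are produced by a
# multilingual pre-trained encoder after the Translate-Retrieve-Translate step.
# All names here are illustrative placeholders, not the authors' released code.
import torch
import torch.nn as nn


class AttendedKnowledgeClassifier(nn.Module):
    def __init__(self, hidden_size: int, num_labels: int):
        super().__init__()
        self.attn = nn.Linear(hidden_size, 1)                 # scores each retrieved inference
        self.classifier = nn.Linear(2 * hidden_size, num_labels)

    def forward(self, headline_emb: torch.Tensor, inference_embs: torch.Tensor) -> torch.Tensor:
        # headline_emb: (batch, hidden); inference_embs: (batch, n_inferences, hidden)
        scores = self.attn(inference_embs).squeeze(-1)        # (batch, n_inferences)
        weights = torch.softmax(scores, dim=-1).unsqueeze(-1) # attention weights over inferences
        attended = (weights * inference_embs).sum(dim=1)      # attention-weighted knowledge vector
        fused = torch.cat([headline_emb, attended], dim=-1)   # fuse headline + attended knowledge
        return self.classifier(fused)                         # political-polarity logits


# Usage with dummy embeddings: batch of 4 headlines, 10 retrieved inferences each,
# hidden size 768; 3 polarity classes assumed purely for illustration.
model = AttendedKnowledgeClassifier(hidden_size=768, num_labels=3)
logits = model(torch.randn(4, 768), torch.randn(4, 10, 768))
```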