With the advent of multilingual models such as mBART, mT5, and IndicBART, summarization in low-resource Indian languages is receiving considerable attention nowadays. However, the number of available datasets remains small. In this work, we (Team HakunaMatata) study how these multilingual models perform at summarization on datasets whose source and target texts are in Indian languages. We experiment with the IndicBART and mT5 models and report ROUGE-1, ROUGE-2, ROUGE-3, and ROUGE-4 scores as performance metrics.
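The following is a minimal sketch of the evaluation loop the abstract describes: generating a summary with a seq2seq checkpoint and scoring it with ROUGE-1 through ROUGE-4. The checkpoint names (google/mt5-small, and analogously ai4bharat/IndicBART), the placeholder inputs, and the generation settings are assumptions for illustration; the paper's actual fine-tuned models and hyperparameters are not specified in the abstract, and a pretrained-only mT5 would need fine-tuning before it produces meaningful summaries.

```python
# Sketch: summarize one document and compute ROUGE-1/2/3/4.
# Model name and generation settings are illustrative assumptions,
# not the authors' exact setup.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from rouge_score import rouge_scorer

tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("google/mt5-small")

article = "..."    # placeholder: an Indian-language source document
reference = "..."  # placeholder: its gold summary

# Generate a candidate summary (a fine-tuned checkpoint is assumed
# for meaningful output).
inputs = tokenizer(article, return_tensors="pt",
                   truncation=True, max_length=512)
output_ids = model.generate(**inputs, max_length=64, num_beams=4)
candidate = tokenizer.decode(output_ids[0], skip_special_tokens=True)

# rouge_score accepts "rougeN" for arbitrary N, so ROUGE-3 and
# ROUGE-4 work directly alongside ROUGE-1 and ROUGE-2.
scorer = rouge_scorer.RougeScorer(
    ["rouge1", "rouge2", "rouge3", "rouge4"], use_stemmer=False
)
scores = scorer.score(reference, candidate)
for name, s in scores.items():
    print(f"{name}: F1={s.fmeasure:.4f}")
```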