Despite the progress made in recent years on natural language understanding (NLU) challenges, most of it remains concentrated on resource-rich languages like English. This work focuses on Persian, one of the most widely spoken languages in the world, for which few NLU datasets are available. High-quality evaluation datasets are a necessity for reliably assessing progress across NLU tasks and domains. We introduce ParsiNLU, the first benchmark for the Persian language that covers a range of high-level tasks -- Reading Comprehension, Textual Entailment, etc. These datasets were collected in a multitude of ways, often involving manual annotation by native speakers, yielding over 14.5k new instances across 6 distinct NLU tasks. In addition, we present the first results of state-of-the-art monolingual and multilingual pre-trained language models on this benchmark and compare them with human performance, providing valuable insights into our ability to tackle natural language understanding challenges in Persian. We hope ParsiNLU fosters further research and advances in Persian language understanding.