The Arabic Citation Index (ARCI) was launched in 2020. This article provides an overview of the scientific literature contained in this new database and explores its possible use in research evaluation. As of May 2022, ARCI had indexed 138,283 scientific publications published between 2015 and 2020. I characterise ARCI's coverage using the metadata of the indexed publications. First, I investigate the distribution of the indexed literature along several dimensions (research domain, country, language, open access status). Articles account for 99% of the documents indexed in ARCI. The Arts & Humanities and Social Sciences fields have the highest concentration of publications. Most indexed journals are published in Egypt, Algeria, Iraq, Jordan, and Saudi Arabia. About 8% of publications in ARCI are written in languages other than Arabic. Second, I use an unsupervised machine learning model, LDA (Latent Dirichlet Allocation), and the text mining algorithm of VOSviewer to uncover the main topics in ARCI. These methods provide a better understanding of ARCI's thematic structure. Next, I discuss how ARCI can complement global standards in the context of more inclusive research evaluation. Finally, after discussing the findings of this study, I suggest a few research opportunities.