Predicting the future performance of academic journals can benefit a variety of stakeholders, including editorial staff, publishers, indexing services, researchers, university administrators and granting agencies. Using historical data on journal performance, this task can be framed as a machine learning regression problem. In this work, we study two such regression tasks: 1) predicting the number of citations a journal will receive during the next calendar year, and 2) predicting the Elsevier CiteScore a journal will be assigned for the next calendar year. To address these tasks, we first create a dataset of historical bibliometric data for journals indexed in Scopus. We then propose neural network models, trained on this dataset, to predict the future performance of journals. To this end, we perform feature selection and model configuration for a Multi-Layer Perceptron and a Long Short-Term Memory (LSTM) network. Through experimental comparisons with heuristic prediction baselines and classical machine learning models, we demonstrate that our proposed models achieve superior performance in predicting future citation and CiteScore values.
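As a rough illustration of the regression framing described above, the sketch below turns one journal's yearly citation history into (past values, next-year value) pairs and scores a simple heuristic baseline. The window length, the sample values, the choice of a "repeat last year" baseline, and the helper names `make_windows`, `persistence_baseline`, and `mean_absolute_error` are all assumptions made for illustration; the abstract does not specify which features, baselines, or error metrics were actually used.

```python
from typing import Callable, List, Tuple

Pair = Tuple[List[float], float]


def make_windows(series: List[float], window: int) -> List[Pair]:
    """Turn a yearly series into (past `window` values, next-year value) pairs."""
    return [
        (series[i:i + window], series[i + window])
        for i in range(len(series) - window)
    ]


def persistence_baseline(features: List[float]) -> float:
    """Assumed heuristic: predict next year's value as the latest observed value."""
    return features[-1]


def mean_absolute_error(pairs: List[Pair], predict: Callable[[List[float]], float]) -> float:
    """Average absolute gap between predictions and the true next-year values."""
    errors = [abs(predict(x) - y) for x, y in pairs]
    return sum(errors) / len(errors)


# Hypothetical yearly citation counts for one journal (illustrative values only).
citations = [100.0, 120.0, 150.0, 170.0, 200.0]

pairs = make_windows(citations, window=3)
mae = mean_absolute_error(pairs, persistence_baseline)
print(pairs)
print(mae)
```

Any learned model, such as a Multi-Layer Perceptron or an LSTM, could be passed in place of `persistence_baseline` as the `predict` callable, so that it is scored on the same pairs as the heuristic.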