Research articles are being shared in increasing numbers across multiple online platforms. Although the scholarly impact of these articles has been widely studied, their online interest, measured by how long they continue to be shared online, remains unclear. Knowing how long a research article is mentioned online could be valuable to researchers. In this paper, we analyzed multiple social media platforms on which users share and/or discuss scholarly articles. We grouped papers with publication dates ranging from 1920 to 2016 into three clusters based on their number of yearly online mentions. Using the online social media metrics for each of these three clusters, we built machine learning models to predict the long-term online interest in research articles. We addressed the prediction task with two approaches: regression and classification. For regression, the Multi-Layer Perceptron model performed best; for classification, the tree-based models outperformed the other models. We found that old articles are most evident in the contexts of economics and industry (i.e., patents). In contrast, recently published articles are most evident on research platforms (i.e., Mendeley), followed by social media platforms (i.e., Twitter).
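The clustering step can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm, so the one-dimensional k-means below is an assumption for illustration only, not the authors' method. The function name `kmeans_1d` and the `yearly_mentions` values are hypothetical.

```python
from statistics import mean


def kmeans_1d(values, k=3, max_iter=100):
    """Cluster 1-D values with Lloyd's algorithm; returns (centroids, labels)."""
    distinct = sorted(set(values))
    if k < 2 or len(distinct) < k:
        raise ValueError("need k >= 2 and at least k distinct values")
    # Deterministic initialisation (an assumption of this sketch):
    # spread the starting centroids evenly over the sorted distinct values.
    centroids = [float(distinct[i * (len(distinct) - 1) // (k - 1)]) for i in range(k)]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda j: abs(v - centroids[j])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            updated.append(mean(members) if members else centroids[j])
        if updated == centroids:
            break  # converged: no centroid moved
        centroids = updated
    return centroids, labels


# Hypothetical data: number of yearly online mentions per paper.
yearly_mentions = [1, 2, 2, 3, 40, 45, 50, 300, 320, 350]
centroids, labels = kmeans_1d(yearly_mentions)
print(labels)
```

With three clusters, each paper receives a label that could then serve as the target for a classifier, while the raw mention counts could serve as the target for a regressor. Which features and targets the authors actually used is not stated in the abstract.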