Citation Analysis of Computer Systems Papers

Citation analysis is used extensively in the bibliometrics literature to assess the impact of individual works, researchers, institutions, and even entire fields of study. In this paper, we analyze citations in one large and influential field within computer science, namely computer systems. Using citation data from a cross-sectional sample of 2,088 papers in 50 systems conferences from 2017, we examine four research questions: the overall distribution of systems citations; their evolution over time; the differences between citation databases (Google Scholar and Scopus) for systems papers; and the characteristics of self-citations in the field. We find that only 1.5% of papers remain uncited after five years, while 12.8% accrued at least 100 citations; both statistics compare favorably to many other scientific fields. The most cited subfields and conference areas within systems were security, databases, and computer architecture. Most papers received their first citation within a year of publication, and the median citation count continued to grow at an almost linear rate over five years, with only a few papers peaking before that. We also find that early citations could be linked to papers with a freely available preprint, or may be primarily composed of self-citations. The ratio of self-citations to total citations starts relatively high for most papers but appears to stabilize by 12 to 18 months, at which point highly cited papers revert to predominantly external citations. Past self-citation count (taken from each paper's reference list) appears to bear little, if any, relationship to the future self-citation count of each paper. The choice of citation database also makes little difference in relative citation comparisons, despite marked differences in absolute counts.
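As a rough illustration of the summary statistics quoted above (share of uncited papers, share with at least 100 citations, median citation count, and the self-citation ratio), a minimal Python sketch follows. The function names `citation_summary` and `self_citation_ratio`, the `threshold` parameter, and the toy counts are assumptions made for illustration; they are not the authors' code or data, and the returned ratio of 0.0 for an uncited paper is just one possible convention.

```python
from statistics import median


def citation_summary(counts, threshold=100):
    """Summarize per-paper citation counts (hypothetical helper).

    counts: list of non-negative integers, one per paper.
    threshold: citation count at or above which a paper is "highly cited".
    """
    n = len(counts)
    if n == 0:
        raise ValueError("counts must not be empty")
    return {
        # Fraction of papers with zero citations.
        "uncited_share": sum(1 for c in counts if c == 0) / n,
        # Fraction of papers with at least `threshold` citations.
        "highly_cited_share": sum(1 for c in counts if c >= threshold) / n,
        # Median citation count across all papers.
        "median": median(counts),
    }


def self_citation_ratio(self_citations, total_citations):
    """Fraction of a paper's citations that are self-citations.

    Returns 0.0 for an uncited paper (an assumed convention).
    """
    if total_citations == 0:
        return 0.0
    return self_citations / total_citations


if __name__ == "__main__":
    # Toy data for illustration only; not taken from the paper.
    toy_counts = [0, 3, 12, 150, 40, 100, 7, 0]
    print(citation_summary(toy_counts))
    print(self_citation_ratio(3, 12))
```

Applied to the study's data, the same calculation would be run once per citation database and per time snapshot, which is what allows the relative comparisons described in the abstract.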
