The number of scientific publications continues to rise exponentially, especially in Computer Science (CS). However, current solutions for analyzing those publications restrict access behind a paywall, offer no features for visual analysis, limit access to their data, focus only on niches or sub-fields, and/or are not flexible and modular enough to be transferred to other datasets. In this thesis, we conduct a scientometric analysis to uncover the implicit patterns hidden in CS metadata and to determine the state of CS research. Specifically, we investigate trends in the quantity, impact, and topics of authors, venues, document types (conferences vs. journals), and fields of study (compared to, e.g., medicine). To achieve this, we introduce the CS-Insights system, an interactive web application for analyzing CS publications with various dashboards, filters, and visualizations. The data underlying this system is the DBLP Discovery Dataset (D3), which contains metadata for 5 million CS publications. Both D3 and CS-Insights are open-access, and CS-Insights can be easily adapted to other datasets in the future. The most interesting findings of our scientometric analysis are that i) there has been a stark increase in publications, authors, and venues in the last two decades, ii) many authors only recently joined the field, iii) the most cited authors and venues focus on computer vision and pattern recognition, while the most productive prefer engineering-related topics, iv) researchers' preference for publishing in conferences over journals is dwindling, v) on average, journal articles receive twice as many citations as conference papers, but the contrast is much smaller for the most cited conferences and journals, and vi) journals also receive more citations in all other investigated fields of study, while only CS and engineering publish more in conferences than in journals.