Hierarchical Classification of Research Fields in the "Web of Science" Using Deep Learning

The scholarly publication space is growing steadily, not only in volume but also in complexity, because of collaboration among researchers within and across fields. This paper presents a hierarchical classification system that uses the abstract of a scholarly publication to assign it automatically to a three-tier hierarchical label set (discipline, field, subfield). The system gives a holistic view of how research activities at these tiers depend on one another, in terms of knowledge production through articles and impact through citations. The classification system (44 disciplines, 738 fields, 1,501 subfields) uses, and can cope with, 160 million abstract snippets from Microsoft Academic Graph (version 2018-05-17). It relies on batch training in a modularized and distributed fashion to address and assess interdisciplinary and inter-field classifications. We also explore multi-class classification in both single-label and multi-label settings. In total, we conducted 3,140 experiments. Across all models (Convolutional Neural Networks, Recurrent Neural Networks, Transformers), classification accuracy exceeds 90% in 77.84% of the single-label classifications and 78.83% of the multi-label classifications. We examine the advantages of our classification in three respects: it aligns research texts and output with disciplines more closely, it classifies them adequately in an automated way, and it captures the degree of interdisciplinarity in a publication, which enables downstream analytics such as field interdisciplinarity. The system (a set of pretrained models) can serve as the backbone of an interactive system for indexing scientific publications.
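The three-tier, modular design described in the abstract can be illustrated with a minimal sketch. It assumes one independent scorer per node of the hierarchy, which is one plausible reading of "modularized" and is not confirmed by the abstract. The keyword scorer is a toy stand-in for the trained CNN, RNN, or Transformer models, and every label name, keyword list, and function name below is hypothetical. Single-label mode keeps the top-scoring label at each tier; multi-label mode keeps every label whose score reaches a threshold.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A scorer maps an abstract to {label: score in [0, 1]}.
Scorer = Callable[[str], Dict[str, float]]
Path = Tuple[str, ...]


def keyword_scorer(keywords: Dict[str, List[str]]) -> Scorer:
    """Toy stand-in for a trained model: fraction of a label's keywords found."""
    def score(text: str) -> Dict[str, float]:
        tokens = set(text.lower().split())
        return {label: sum(w in tokens for w in words) / len(words)
                for label, words in keywords.items()}
    return score


def classify(text: str, root: Scorer, children: Dict[Path, Scorer],
             threshold: Optional[float] = None) -> List[Path]:
    """Walk the hierarchy top-down and return the label paths reached.

    threshold=None  -> single-label: keep the argmax at each tier.
    threshold=float -> multi-label: keep every label scoring >= threshold.
    A path ends when no child scorer is registered for it.
    """
    paths: List[Path] = []
    frontier: List[Tuple[Path, Scorer]] = [((), root)]
    while frontier:
        path, scorer = frontier.pop()
        scores = scorer(text)
        if threshold is None:
            chosen = [max(scores, key=scores.get)]
        else:
            chosen = [label for label, s in scores.items() if s >= threshold]
        for label in chosen:
            new_path = path + (label,)
            if new_path in children:
                frontier.append((new_path, children[new_path]))
            else:
                paths.append(new_path)
    return sorted(paths)


# Hypothetical miniature hierarchy: discipline -> field -> subfield.
root = keyword_scorer({"Computer Science": ["neural", "algorithm"],
                       "Biology": ["protein", "cell"]})
children: Dict[Path, Scorer] = {
    ("Computer Science",): keyword_scorer(
        {"Machine Learning": ["neural", "training"],
         "Databases": ["query", "index"]}),
    ("Computer Science", "Machine Learning"): keyword_scorer(
        {"Deep Learning": ["neural", "layers"],
         "Kernel Methods": ["kernel", "margin"]}),
    ("Biology",): keyword_scorer(
        {"Bioinformatics": ["protein", "sequence"],
         "Cell Biology": ["cell", "membrane"]}),
    ("Biology", "Bioinformatics"): keyword_scorer(
        {"Structure Prediction": ["protein", "folding"],
         "Genomics": ["genome", "gene"]}),
}

text = "a neural algorithm with many layers predicts protein folding from sequence"
single = classify(text, root, children)
multi = classify(text, root, children, threshold=0.5)
```

For this toy abstract, single-label mode returns one discipline-field-subfield path, while multi-label mode returns one path under each of the two disciplines. The number of distinct top-tier labels in the multi-label result is a crude indicator of the interdisciplinarity the abstract mentions.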
