Funding agencies largely rely on topic matching between domain experts and research proposals to assign proposal reviewers. As proposals become increasingly interdisciplinary, it is challenging to profile the interdisciplinary nature of a proposal and, thereafter, to find expert reviewers with an appropriate set of expertise. An essential step in solving this challenge is to accurately model and classify the interdisciplinary labels of a proposal. Existing methodological and application-related literature, such as text classification and proposal classification, is insufficient for jointly addressing the three key issues unique to interdisciplinary proposal data: 1) the hierarchical structure of a proposal's discipline labels, from coarse-grained to fine-grained, e.g., from information science to AI to fundamentals of AI; 2) the heterogeneous semantics of the main textual parts, which play different roles in a proposal; 3) the imbalance in the number of proposals between non-interdisciplinary and interdisciplinary research. Can we simultaneously address these three issues in understanding a proposal's interdisciplinary nature? In response to this question, we propose a hierarchical mixup multi-label classification framework, which we call H-MixUp. H-MixUp leverages a transformer-based semantic information extractor and a GCN-based interdisciplinary knowledge extractor to address the first and second issues. To address the third issue, H-MixUp develops a fused training method that combines Word-level MixUp, Word-level CutMix, Manifold MixUp, and Document-level MixUp.
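For readers unfamiliar with the augmentation family named above, the following is a minimal sketch of generic mixup and a token-level CutMix variant applied to multi-hot label vectors. It is not the H-MixUp implementation: the function names `mixup` and `cutmix_span`, the `alpha` default, and the choice of a single contiguous replaced span are all assumptions made for illustration.

```python
import random


def mixup(x_a, y_a, x_b, y_b, alpha=0.4, rng=None):
    """Generic mixup: interpolate two feature vectors and their label vectors.

    x_a, x_b: equal-length lists of floats (e.g. embeddings or hidden states).
    y_a, y_b: equal-length multi-hot label lists.
    alpha:    Beta distribution parameter (illustrative default, an assumption).
    """
    if rng is None:
        rng = random.Random()
    lam = rng.betavariate(alpha, alpha)  # mixing coefficient in [0, 1]
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x_mix, y_mix, lam


def cutmix_span(tokens_a, tokens_b, lam, rng=None):
    """Token-level CutMix sketch: overwrite one contiguous span of A with B.

    Roughly a (1 - lam) fraction of the shared length is taken from B.
    Returns the mixed sequence and the effective share of A that remains,
    which can be used to weight the two label vectors.
    """
    if rng is None:
        rng = random.Random()
    n = min(len(tokens_a), len(tokens_b))
    cut_len = round((1.0 - lam) * n)
    start = rng.randint(0, n - cut_len)
    mixed = (
        tokens_a[:start]
        + tokens_b[start:start + cut_len]
        + tokens_a[start + cut_len:]
    )
    lam_eff = 1.0 - cut_len / len(tokens_a)
    return mixed, lam_eff
```

Applying `mixup` to input word embeddings, to intermediate hidden states, or to pooled document representations would correspond loosely to the word-level, manifold, and document-level variants listed in the abstract; how H-MixUp actually fuses them is not specified here.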