Representation learning is the first step in automating tasks such as research paper recommendation, classification, and retrieval. Given the accelerating rate of research publication and the recognised benefits of interdisciplinary research, systems that help researchers discover and understand relevant work from beyond their immediate school of knowledge are vital. This work explores different methods of research paper representation (document embedding) to identify those capable of preserving the interdisciplinary implications of research papers in their embeddings. In addition to evaluating state-of-the-art document embedding methods on an interdisciplinary citation prediction task, we propose a novel Graph Neural Network (GNN) architecture designed to preserve the key interdisciplinary implications of research articles in citation network node embeddings. Our proposed method outperforms other GNN-based methods in interdisciplinary citation prediction without compromising overall citation prediction performance.