With the number of academic articles growing exponentially, discovering and citing comprehensive and appropriate resources has become a non-trivial task. Conventional citation recommendation methods suffer from severe information loss. For example, they do not consider the section of the paper that the user is writing and for which a citation is needed, the relatedness between the words in the local context (the text span that describes a citation), or the importance of each word in the local context. These shortcomings make such methods insufficient for recommending adequate citations for academic manuscripts. In this study, we propose a novel embedding-based neural network called the "dual attention model for citation recommendation" (DACR) to recommend citations during manuscript preparation. Our method uses embeddings of three kinds of semantic information: words in the local context, structural contexts, and the section on which the user is working. The neural network is designed to maximize the similarity between the embedding of the three inputs (local context words, section, and structural contexts) and the target citation appearing in the context. The core of the neural network is composed of self-attention and additive attention, where the former aims to capture the relatedness between the contextual words and structural contexts, and the latter aims to learn their importance. Experiments on real-world datasets demonstrate the effectiveness of the proposed approach.
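To make the dual-attention idea concrete, the following is a minimal, illustrative sketch in plain Python and not the DACR implementation. It assumes that the three inputs are already available as fixed-size embedding vectors, uses scaled dot-product self-attention with identity projections as a stand-in for the self-attention layer, and scores candidates by dot product. All function names, dimensions, candidate labels, and the random parameters are hypothetical, and training (maximizing the similarity to the target citation) is omitted.

```python
import math
import random

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def self_attention(X):
    """Scaled dot-product self-attention with identity projections
    (a simplification; a trained model would learn Q/K/V projections)."""
    d = len(X[0])
    out = []
    for q in X:
        weights = softmax([dot(q, k) / math.sqrt(d) for k in X])
        out.append([sum(w * v[j] for w, v in zip(weights, X)) for j in range(d)])
    return out

def additive_attention(H, W, b, q):
    """Additive attention pooling: score_i = q . tanh(W h_i + b)."""
    scores = []
    for h in H:
        hidden = [math.tanh(dot(row, h) + bi) for row, bi in zip(W, b)]
        scores.append(dot(q, hidden))
    alpha = softmax(scores)  # importance weight of each input vector
    d = len(H[0])
    pooled = [sum(a * h[j] for a, h in zip(alpha, H)) for j in range(d)]
    return pooled, alpha

def rank_citations(context_vec, candidates):
    """Rank candidate citation embeddings by dot-product similarity."""
    scores = {name: dot(context_vec, vec) for name, vec in candidates.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Toy data: every vector below is random and purely illustrative.
random.seed(0)
d = 4

def rand_vec(n):
    return [random.uniform(-1.0, 1.0) for _ in range(n)]

word_embs = [rand_vec(d) for _ in range(5)]        # local context words
structural_embs = [rand_vec(d) for _ in range(2)]  # structural contexts
section_emb = rand_vec(d)                          # section being written

X = word_embs + structural_embs + [section_emb]
H = self_attention(X)                              # relatedness between inputs

W = [rand_vec(d) for _ in range(d)]                # untrained parameters
b = rand_vec(d)
q = rand_vec(d)
pooled, alpha = additive_attention(H, W, b, q)     # importance of each input

candidates = {name: rand_vec(d) for name in ("paper_a", "paper_b", "paper_c")}
ranking = rank_citations(pooled, candidates)
print(ranking)
```

In this sketch the self-attention step mixes each input vector with the vectors it is most related to, and the additive-attention step collapses the result into a single context vector through learned importance weights. Because the parameters here are random, the printed ranking carries no meaning; it only shows the data flow.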