Scientific digital libraries play a critical role in the development and dissemination of scientific literature. Despite dedicated search engines, retrieving relevant publications from the ever-growing body of scientific literature remains challenging and time-consuming. Indexing scientific articles is difficult: current models rely solely on a small portion of each article (the title and abstract) and on author-assigned keyphrases when available. This results in frustratingly limited access to scientific knowledge. The goal of the DELICES project is to address this limitation by exploiting semantic relations between scientific articles to both improve and enrich indexing. To this end, we will rely on the latest advances in semantic representations both to increase the relevance of keyphrases extracted from the documents and to extend indexing to new terms borrowed from semantically similar documents.