S2AND: A Benchmark and Evaluation System for Author Name Disambiguation

Author Name Disambiguation (AND) is the task of resolving which author mentions in a bibliographic database refer to the same real-world person, and is a critical ingredient of digital library applications such as search and citation analysis. While many AND algorithms have been proposed, comparing them is difficult because they often employ distinct features and are evaluated on different datasets. In response to this challenge, we present S2AND, a unified benchmark dataset for AND on scholarly papers, as well as an open-source reference model implementation. Our dataset harmonizes eight disparate AND datasets into a uniform format, with a single rich feature set drawn from the Semantic Scholar (S2) database. Our evaluation suite for S2AND reports performance split by facets such as publication year and number of papers, allowing researchers to track both global performance and measures of fairness across facet values. Our experiments show that because previous datasets tend to cover idiosyncratic and biased slices of the literature, algorithms trained to perform well on one of them may generalize poorly to others. By contrast, we show how training on a union of datasets in S2AND results in more robust models that perform well even on datasets unseen in training. The resulting AND model also substantially improves over the production algorithm in S2, reducing error by over 50% in terms of B^3 F1. We release our unified dataset, model code, trained models, and evaluation suite to the research community. https://github.com/allenai/S2AND/
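
For readers unfamiliar with the B^3 F1 metric mentioned in the abstract, the following is a minimal sketch of the standard B-cubed definition for clusterings. It is not taken from the S2AND codebase; the function name `b_cubed` and the dictionary-based input format are assumptions made for illustration. For each author mention, precision is the fraction of its predicted cluster that shares its true author, and recall is the fraction of its true author's mentions that land in its predicted cluster; both are averaged over all mentions.

```python
from collections import defaultdict


def b_cubed(pred, gold):
    """Compute B-cubed precision, recall, and F1.

    pred, gold: dicts mapping each mention id to a cluster id.
    Both dicts are assumed (for this sketch) to cover the same mentions.
    """
    if set(pred) != set(gold):
        raise ValueError("pred and gold must cover the same mentions")

    # Invert the mention -> cluster maps into cluster -> set of mentions.
    pred_clusters = defaultdict(set)
    gold_clusters = defaultdict(set)
    for mention, cluster in pred.items():
        pred_clusters[cluster].add(mention)
    for mention, cluster in gold.items():
        gold_clusters[cluster].add(mention)

    precision_sum = 0.0
    recall_sum = 0.0
    for mention in gold:
        p = pred_clusters[pred[mention]]
        g = gold_clusters[gold[mention]]
        overlap = len(p & g)  # always >= 1, since the mention is in both
        precision_sum += overlap / len(p)
        recall_sum += overlap / len(g)

    n = len(gold)
    precision = precision_sum / n
    recall = recall_sum / n
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example: mentions a, b, c belong to one author; d, e to another.
gold = {"a": "author1", "b": "author1", "c": "author1",
        "d": "author2", "e": "author2"}
# The hypothetical prediction wrongly groups c with d and e.
pred = {"a": "X", "b": "X", "c": "Y", "d": "Y", "e": "Y"}

precision, recall, f1 = b_cubed(pred, gold)
print(f"P={precision:.4f} R={recall:.4f} F1={f1:.4f}")
```

A faceted report of the kind the abstract describes could, under the same assumptions, be obtained by averaging the per-mention precision and recall terms within each facet bucket (for example, by publication year) instead of over all mentions.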
