Author name disambiguation is an important and challenging task in bibliometrics. We propose an approach that relies on an external source of information to select and validate clusters of publications identified by an unsupervised author name disambiguation method. Applied to a random sample of Italian scholars, the approach shows encouraging results, with overall precision, recall, and F-measure each above 96%. It can serve as a starting point for a large-scale census of publication portfolios for bibliometric analyses at the level of individual researchers.
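To make the evaluation measures concrete, the following is a minimal sketch of how a cluster might be selected against an external publication list and then scored. It is an illustration under stated assumptions, not the method of the paper: the function names, the overlap-based selection rule, and the toy publication identifiers are all hypothetical.

```python
from typing import Dict, Set, Tuple


def select_cluster(clusters: Dict[str, Set[str]], external_ids: Set[str]) -> str:
    """Return the name of the cluster sharing the most publications with
    the external source (hypothetical selection rule, assumed here)."""
    return max(clusters, key=lambda name: len(clusters[name] & external_ids))


def precision_recall_f(retrieved: Set[str], relevant: Set[str]) -> Tuple[float, float, float]:
    """Standard set-based precision, recall, and F-measure (harmonic mean)."""
    true_pos = len(retrieved & relevant)
    precision = true_pos / len(retrieved) if retrieved else 0.0
    recall = true_pos / len(relevant) if relevant else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


# Toy data (invented for illustration only).
clusters = {
    "c1": {"p1", "p2", "p3", "p4"},  # output of an unsupervised method
    "c2": {"p5", "p6"},
}
external_ids = {"p1", "p2"}             # partial list from the external source
true_portfolio = {"p1", "p2", "p3", "p9"}  # manually verified ground truth

chosen = select_cluster(clusters, external_ids)
precision, recall, f_measure = precision_recall_f(clusters[chosen], true_portfolio)
print(chosen, precision, recall, f_measure)
```

In this toy case the cluster `c1` is selected, and three of its four publications belong to the true portfolio of four, so precision, recall, and F-measure all equal 0.75.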