Finding the right expert plays a crucial role in driving successful collaborations and in accelerating high-quality research, development, and innovation. However, the rapid growth of scientific publications and digital expertise data makes identifying the right experts a challenging problem. Existing approaches for finding experts on a given topic can be categorised into information retrieval techniques based on vector space models, document language models, and graph-based models. In this paper, we propose $\textit{ExpFinder}$, a new ensemble model for expert finding that integrates a novel $N$-gram vector space model, denoted as $n$VSM, and a graph-based model, denoted as $\textit{$\mu$CO-HITS}$, which is a proposed variation of the CO-HITS algorithm. The key idea of $n$VSM is to exploit a recent inverse document frequency weighting method for $N$-gram words, and $\textit{ExpFinder}$ incorporates $n$VSM into $\textit{$\mu$CO-HITS}$ to achieve expert finding. We comprehensively evaluate $\textit{ExpFinder}$ on four datasets from academic domains in comparison with six different expert finding models. The evaluation results show that $\textit{ExpFinder}$ is a highly effective model for expert finding, substantially outperforming all the compared models by 19% to 160.2%.
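To make the ensemble idea concrete, the following is a minimal Python sketch of the general pattern: score documents against an $N$-gram topic, then propagate those scores to experts over an expert-document authorship graph. It is an illustration under stated assumptions, not the $n$VSM or $\textit{$\mu$CO-HITS}$ formulation, neither of which is defined in this abstract. Plain smoothed TF-IDF stands in for $n$VSM, a simple CO-HITS-style update with a mixing weight `lam` stands in for $\textit{$\mu$CO-HITS}$, and all function names, parameters, and the toy data are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the contiguous n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def topic_doc_scores(docs, topic):
    """Score each document for an N-gram topic with plain smoothed TF-IDF.

    Assumption: this is a stand-in for nVSM, not the paper's weighting.
    """
    topic = " ".join(topic.lower().split())
    n = len(topic.split())
    counts = [Counter(ngrams(doc.lower().split(), n)) for doc in docs]
    df = sum(1 for c in counts if c[topic] > 0)
    idf = math.log((1 + len(docs)) / (1 + df)) + 1.0
    return [c[topic] * idf for c in counts]


def rank_experts(doc_scores, edges, n_experts, lam=0.5, iters=30):
    """Propagate document scores to experts over (expert, doc) edges.

    Assumption: a simple CO-HITS-style update, not the paper's muCO-HITS.
    Experts start with no prior score; documents keep a share (1 - lam)
    of their initial topic score at every iteration.
    """
    n_docs = len(doc_scores)
    docs_of = [[] for _ in range(n_experts)]
    experts_of = [[] for _ in range(n_docs)]
    for expert, doc in edges:
        docs_of[expert].append(doc)
        experts_of[doc].append(expert)

    d_cur = list(doc_scores)
    e_cur = [0.0] * n_experts
    for _ in range(iters):
        # Each document splits its score evenly among its authors.
        e_cur = [
            lam * sum(d_cur[d] / len(experts_of[d]) for d in docs_of[x])
            for x in range(n_experts)
        ]
        # Each expert splits their score evenly among their documents.
        d_cur = [
            (1 - lam) * doc_scores[d]
            + lam * sum(e_cur[x] / len(docs_of[x]) for x in experts_of[d])
            for d in range(n_docs)
        ]
    return e_cur


# Toy data (hypothetical): expert 0 wrote docs 0 and 1, expert 2 co-wrote
# doc 1, and expert 1 wrote the off-topic doc 2.
docs = [
    "graph neural network for expert finding",
    "expert finding with language models",
    "deep learning for image classification",
]
edges = [(0, 0), (0, 1), (2, 1), (1, 2)]
scores = rank_experts(topic_doc_scores(docs, "expert finding"), edges, 3)
ranking = sorted(range(3), key=lambda x: scores[x], reverse=True)
print(ranking)
```

In this toy setup the expert with two on-topic documents ranks first and the expert with only an off-topic document receives no score, which is the qualitative behaviour an ensemble of content scoring and graph propagation is meant to produce.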