Citation recommendation systems have attracted much academic interest, resulting in many studies and implementations. These systems help authors automatically generate proper citations by suggesting relevant references based on the text they have written. However, the methods used in citation recommendation differ across various studies and implementations. Some approaches focus on the overall content of papers, while others consider the context of the citation text. Additionally, the datasets used in these studies include different aspects of papers, such as metadata, citation context, or even the full text of the paper in various formats and structures. The diversity in models, datasets, and evaluation metrics makes it challenging to assess and compare citation recommendation methods effectively. To address this issue, a standardized dataset and evaluation metrics are needed to evaluate these models consistently. Therefore, we propose developing a benchmark specifically designed to analyze and compare citation recommendation models. This benchmark will evaluate the performance of models on different features of the citation context and provide a comprehensive evaluation of the models across all these tasks, presenting the results in a standardized way. By creating a benchmark with standardized evaluation metrics, researchers and practitioners in the field of citation recommendation will have a common platform to assess and compare different models. This will enable meaningful comparisons and help identify promising approaches for further research and development in the field.
翻译:暂无翻译