Over the past decade, tremendous progress has been made in developing new RecSys methods. However, a fundamental problem for the RecSys research community remains the lack of applied datasets and benchmarks with well-defined evaluation rules and metrics for testing these novel approaches. In this article, we present TTRS, the Tinkoff Transactions Recommender System benchmark. This financial transaction benchmark contains over 2 million interactions between almost 10,000 users and more than 1,000 merchant brands over 14 months. To the best of our knowledge, this is the first publicly available financial transactions dataset. To make it easier to use in applications, we provide a complete description of the data collection pipeline, the preprocessing steps, and the resulting dataset statistics. We also present a comprehensive comparison of currently popular RecSys methods on the next-period recommendation task and analyze their performance in detail across various metrics and datasets.