This study presents a data-driven approach for empirically tuning and validating rating systems, with a focus on the Elo system. Well-known rating frameworks such as Elo, Glicko, and TrueSkill rely on parameters that are usually chosen from probabilistic assumptions or convention rather than from game-specific data. To address this, we propose a methodology that learns optimal parameter values by maximizing the predictive accuracy of match outcomes. The parameter-tuning framework is general: with a suitable modification of the parameter space, it can be extended to any rating system, including multiplayer setups. Applying the tuned rating system to real and simulated gameplay data demonstrates that the data-driven rating system is suitable for modeling player performance.
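As a rough illustration of tuning a rating parameter against match-outcome prediction, the sketch below replays a match history through a standard Elo update and picks the K-factor from a small grid. It is not the authors' procedure. The function names (`expected_score`, `run_elo`, `tune_k`, `simulate`), the grid values, the 1500 starting rating, the 400-point scale, and the use of mean log-loss as the measure of predictive quality are all assumptions made for this example. The data are simulated from the Elo win-probability model itself.

```python
import math
import random


def expected_score(r_a, r_b, scale=400.0):
    """Standard Elo win probability of player a against player b."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / scale))


def run_elo(matches, k, initial=1500.0, scale=400.0):
    """Replay matches in order and return the mean log-loss of the
    pre-match predictions (lower is better). Hypothetical helper."""
    ratings = {}
    total = 0.0
    for a, b, outcome in matches:  # outcome: 1.0 if a won, 0.0 if b won
        r_a = ratings.get(a, initial)
        r_b = ratings.get(b, initial)
        p = expected_score(r_a, r_b, scale)
        # Clip only for the logarithm, to avoid log(0).
        q = min(max(p, 1e-12), 1.0 - 1e-12)
        total -= outcome * math.log(q) + (1.0 - outcome) * math.log(1.0 - q)
        # Elo update: winner gains what the loser gives up.
        delta = k * (outcome - p)
        ratings[a] = r_a + delta
        ratings[b] = r_b - delta
    return total / len(matches)


def tune_k(matches, grid):
    """Pick the K-factor in `grid` with the lowest mean log-loss."""
    return min(grid, key=lambda k: run_elo(matches, k))


def simulate(n_players=20, n_matches=2000, seed=0):
    """Simulated match history; true skills are an assumption of this sketch."""
    rng = random.Random(seed)
    skill = [rng.gauss(1500.0, 200.0) for _ in range(n_players)]
    matches = []
    for _ in range(n_matches):
        a, b = rng.sample(range(n_players), 2)
        a_wins = rng.random() < expected_score(skill[a], skill[b])
        matches.append((a, b, 1.0 if a_wins else 0.0))
    return matches


matches = simulate()
grid = [0, 4, 8, 16, 24, 32, 48, 64, 128]
best_k = tune_k(matches, grid)
print("selected K:", best_k)
```

Including `K = 0` in the grid gives a baseline in which every prediction stays at 0.5, so any selected K does at least as well as an untrained system on this measure. Tuning further parameters (for example the scale or the initial rating) would turn the grid into a multi-dimensional search, and a held-out split would be needed to validate the chosen values.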