Real-world competitive games, such as chess, Go, or StarCraft II, rely on Elo models to measure the strength of their players. Since these games are not fully transitive, using Elo implicitly assumes that they have a strong transitive component that can be correctly identified and extracted. In this study, we investigate the challenge of identifying the strength of the transitive component in games. First, we show that Elo models can fail to extract this transitive component, even in elementary transitive games. Then, based on this observation, we propose an extension of the Elo score: a disc ranking system that assigns each player two scores, which we refer to as skill and consistency. Finally, we validate the approach empirically on payoff matrices from real-world games played by bots and humans.