Even when machine learning systems surpass human ability in a domain, there are many reasons why AI systems that capture human-like behavior would be desirable: people may want to learn from such systems, may need to collaborate with them, or may expect them to serve as partners in an extended interaction. Motivated by this goal of human-like AI systems, predicting human actions -- as opposed to predicting optimal actions -- has become an increasingly useful task. We extend this line of work by developing highly accurate personalized models of human behavior in the context of chess. Chess is a rich domain for exploring these questions because it combines several appealing features: AI systems have achieved superhuman performance yet still interact closely with human chess players, both as opponents and as preparation tools, and there is an enormous amount of recorded data on individual players. Starting with an open-source version of AlphaZero trained on a population of human players, we demonstrate that we can significantly improve prediction of a particular player's moves by applying a series of fine-tuning adjustments. Furthermore, we can accurately perform stylometry -- predicting who made a given set of actions -- indicating that our personalized models capture human decision-making at an individual level.