Preference learning along multiple criteria: A game-theoretic perspective

The literature on ranking from ordinal data is vast, and there are several ways to aggregate overall preferences from pairwise comparisons between objects. In particular, it is well known that any Nash equilibrium of the zero-sum game induced by the preference matrix defines a natural solution concept (a winning distribution over objects) known as a von Neumann winner. Many real-world problems, however, are inevitably multi-criteria, with different pairwise preferences governing the different criteria. In this work, we generalize the notion of a von Neumann winner to the multi-criteria setting by taking inspiration from Blackwell's approachability. Our framework allows for non-linear aggregation of preferences across criteria, and generalizes the linearization-based approach from multi-objective optimization. From a theoretical standpoint, we show that the Blackwell winner of a multi-criteria problem instance can be computed as the solution to a convex optimization problem. Furthermore, given random samples of pairwise comparisons, we show that a simple plug-in estimator achieves near-optimal minimax sample complexity. Finally, we showcase the practical utility of our framework in a user study on autonomous driving, where we find that the Blackwell winner outperforms the von Neumann winner for the overall preferences.
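As a rough illustration of the single-criterion starting point only, the sketch below approximates a von Neumann winner for a small preference matrix. It is not the paper's method and does not compute a Blackwell winner. It assumes a hypothetical 3-object matrix `P` where `P[i][j]` is the probability that object `i` is preferred to object `j` (so `P[i][j] + P[j][i] = 1` and the diagonal is 0.5), and it uses multiplicative-weights (Hedge) self-play on the induced zero-sum game, a standard technique swapped in here for simplicity. The function names are illustrative.

```python
import math


def win_probabilities(P, pi):
    """Probability that an object drawn from pi is preferred to each fixed object j."""
    n = len(P)
    return [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]


def approx_von_neumann_winner(P, steps=20000):
    """Approximate a von Neumann winner by Hedge self-play (illustrative sketch).

    The zero-sum game has payoff A[i][j] = P[i][j] - 0.5, which is
    skew-symmetric, so its value is 0. The time-averaged Hedge strategy is
    an approximate equilibrium: it is preferred to every object with
    probability at least 1/2 minus a small error that shrinks with `steps`.
    """
    n = len(P)
    eta = math.sqrt(8.0 * math.log(n) / steps)  # standard Hedge step size
    gains = [0.0] * n   # cumulative payoff of each object so far
    avg = [0.0] * n     # running average of the played distributions
    for _ in range(steps):
        # Softmax of cumulative gains, shifted by the max for numerical stability.
        m = max(gains)
        w = [math.exp(eta * (g - m)) for g in gains]
        z = sum(w)
        p = [x / z for x in w]
        # Payoff of each pure object against the current distribution p.
        payoff = [sum((P[i][j] - 0.5) * p[j] for j in range(n)) for i in range(n)]
        for i in range(n):
            avg[i] += p[i] / steps
            gains[i] += payoff[i]
    return avg


# Hypothetical cyclic preferences: 0 beats 1, 1 beats 2, 2 beats 0.
P = [
    [0.5, 0.9, 0.3],
    [0.1, 0.5, 0.6],
    [0.7, 0.4, 0.5],
]
pi = approx_von_neumann_winner(P)
print("distribution:", [round(x, 3) for x in pi])
print("win probabilities:", [round(x, 3) for x in win_probabilities(P, pi)])
```

No single object wins in this cyclic example, which is why the solution concept is a distribution over objects rather than one object. For an exact answer, the same equilibrium could instead be obtained from a linear program.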
