Given partially observed pairwise comparison data generated by the Bradley-Terry-Luce (BTL) model, we study the problem of top-$k$ ranking, that is, optimally identifying the set of top-$k$ players. We derive the minimax rate with respect to a normalized Hamming loss. This provides the first result in the literature that characterizes the partial recovery error in terms of the proportion of mistakes for top-$k$ ranking. We also derive the optimal signal-to-noise ratio condition for the exact recovery of the top-$k$ set. The maximum likelihood estimator (MLE) is shown to achieve both optimal partial recovery and optimal exact recovery. On the other hand, we show that another popular algorithm, the spectral method, is in general sub-optimal. Our results complement the recent work by Chen et al. (2019), which shows that both the MLE and the spectral method achieve the optimal sample complexity for exact recovery. It turns out that the leading constants of the sample complexity differ between the two algorithms. Another contribution that may be of independent interest is the analysis of the MLE without any penalty or regularization for the BTL model. This closes an important gap between theory and practice in the literature on ranking.
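To make the setting concrete, the following is a minimal Python sketch of the pipeline the abstract describes: simulate partially observed BTL comparisons, fit the MLE without penalty or regularization, and score the estimated top-$k$ set by a normalized Hamming loss. Several details are assumptions that the abstract does not state: each pair is observed independently with probability `p` and then compared `L` times, the MLE is computed with the classical MM (Zermelo) iteration rather than any procedure specific to the paper, and the loss is taken to be the size of the symmetric difference between the estimated and true top-$k$ sets divided by $2k$. The function names are hypothetical.

```python
import numpy as np


def simulate_btl(theta, p, L, rng):
    """Simulate BTL comparisons; wins[i, j] counts how often i beat j.

    Assumption: each pair is observed with probability p, and an observed
    pair is compared L times independently.
    """
    n = len(theta)
    wins = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                # BTL model: P(i beats j) = sigmoid(theta_i - theta_j)
                prob = 1.0 / (1.0 + np.exp(-(theta[i] - theta[j])))
                w = rng.binomial(L, prob)
                wins[i, j] = w
                wins[j, i] = L - w
    return wins


def btl_mle_mm(wins, iters=500):
    """Unregularized BTL MLE via the MM (Zermelo) iteration.

    The MLE exists only when the comparison graph is strongly connected;
    this sketch does not check that condition.
    """
    n = wins.shape[0]
    games = wins + wins.T          # games[i, j] = comparisons between i and j
    total_wins = wins.sum(axis=1)  # total wins of each player
    w = np.ones(n)                 # w_i = exp(theta_i), up to scaling
    for _ in range(iters):
        denom = (games / (w[:, None] + w[None, :])).sum(axis=1)
        w = total_wins / denom
        w /= w.sum()               # fix the scale (the model is shift-invariant)
    return np.log(w)


def top_k_set(scores, k):
    """Indices of the k largest scores."""
    return set(np.argsort(-np.asarray(scores))[:k].tolist())


def normalized_hamming_loss(est_set, true_set, k):
    """Assumed normalization: |symmetric difference| / (2k), in [0, 1]."""
    return len(set(est_set) ^ set(true_set)) / (2 * k)


rng = np.random.default_rng(0)
n, k = 40, 10
# Hypothetical skills: the first k players are stronger by a gap of 1.
theta = np.concatenate([np.ones(k), np.zeros(n - k)])
wins = simulate_btl(theta, p=0.5, L=30, rng=rng)
theta_hat = btl_mle_mm(wins)
loss = normalized_hamming_loss(top_k_set(theta_hat, k), range(k), k)
print("normalized Hamming loss:", loss)
```

Under this normalization a loss of 0 corresponds to exact recovery of the top-$k$ set, while intermediate values measure the proportion of mistakes, which is the partial recovery regime the minimax rate addresses.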