A tournament graph is a complete directed graph, which can be used to model a round-robin tournament between $n$ players. In this paper, we address the problem of finding a champion of the tournament, also known as Copeland winner, which is a player that wins the highest number of matches. In detail, we aim to investigate algorithms that find the champion by playing a low number of matches. Solving this problem allows us to speed up several Information Retrieval and Recommender System applications, including question answering, conversational search, etc. Indeed, these applications often search for the champion inducing a round-robin tournament among the players by employing a machine learning model to estimate who wins each pairwise comparison. Our contribution, thus, allows finding the champion by performing a low number of model inferences. We prove that any deterministic or randomized algorithm finding a champion with constant success probability requires $\Omega(\ell n)$ comparisons, where $\ell$ is the number of matches lost by the champion. We then present an asymptotically-optimal deterministic algorithm matching this lower bound without knowing $\ell$, and we extend our analysis to three variants of the problem. Lastly, we conduct a comprehensive experimental assessment of the proposed algorithms on a question answering task on public data. Results show that our proposed algorithms speed up the retrieval of the champion up to $13\times$ with respect to the state-of-the-art algorithm that perform the full tournament.
翻译:锦标赛图是一种完全有向图,用于模拟 $n$ 个参赛者之间的循环赛。本文研究在锦标赛图中寻找冠军的问题,即 Copeland 获胜者;他是赢得最多比赛的选手。具体而言,我们旨在研究通过进行少量比赛就能找到冠军的算法。解决这个问题可以加速多个信息检索和推荐系统应用,包括问答、对话搜索等。实际上,这些应用程序经常通过机器学习模型来估计每个配对比赛的胜负,从而使得能够在选手之间引入循环赛,并搜索冠军。我们的贡献是通过进行少量模型推断即可找到冠军。我们证明了任何确定性或随机化算法都需要 $\Omega (\ell n)$ 的比较次数,才能以恒定的成功概率找到冠军,其中 $\ell$ 是冠军输掉的比赛局数。然后,我们提出了一种渐近最优的确定性算法,它不需要知道 $\ell$,并在三个问题变量下扩展了我们的分析。最后,在公共数据上进行了广泛的实验评估。结果表明,相对于执行完整的锦标赛的最新算法,我们提出的算法将冠军的检索加速了多达 $13\times$。