Assessing competitive balance in the English Premier League for over forty seasons using a stochastic block model

Competitive balance is a desirable feature of any professional sports league. It encapsulates the notion that the outcome of games is unpredictable, in contrast to an imbalanced league in which the outcomes of some games are more predictable than others, for example when an apparently strong team plays a weak team. In this paper, we develop a model-based clustering approach to assess the balance between teams in a league. We propose a novel Bayesian model that represents the results of a football season as a dense network, with nodes identified by teams and categorical edges representing the outcome of each game. The resulting stochastic block model facilitates the probabilistic clustering of teams to assess whether there are competitive imbalances in a league. A key question is then how to assess the uncertainty around the number of clusters, or blocks, and consequently around the estimated partition, that is, the allocation of teams to blocks. To do this, we develop an MCMC algorithm that allows joint estimation of the number of blocks and the allocation of teams to blocks. We apply our model to each season of the English Premier League from 1978/79 to 2019/20. A key finding of this analysis is evidence of a structural change from a reasonably balanced league to a two-tier league, which occurred around the early 2000s.
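
To make the network representation concrete, the following minimal Python sketch simulates one season from a stochastic block model with categorical edges. It is an illustration only, not the authors' model or their MCMC inference: the team names, the two-block allocation, the outcome probabilities in `theta`, and the function name `simulate_season` are all hypothetical, and we assume that every ordered (home, away) pair of teams plays exactly once.

```python
import random

# Categorical edge labels: the possible outcomes of a single game.
OUTCOMES = ("home_win", "draw", "away_win")


def simulate_season(blocks, theta, seed=0):
    """Simulate one result for every ordered (home, away) pair of teams.

    blocks: dict mapping team name -> block label (hypothetical allocation)
    theta:  dict mapping (home_block, away_block) -> probabilities over OUTCOMES
    """
    rng = random.Random(seed)
    results = {}
    for home in blocks:
        for away in blocks:
            if home == away:
                continue  # a team does not play itself
            # The outcome distribution depends only on the two teams' blocks.
            probs = theta[(blocks[home], blocks[away])]
            results[(home, away)] = rng.choices(OUTCOMES, weights=probs, k=1)[0]
    return results


# Hypothetical two-tier league: block 0 is the stronger tier (illustrative values).
blocks = {"A": 0, "B": 0, "C": 1, "D": 1, "E": 1, "F": 1}
theta = {
    (0, 0): (0.45, 0.30, 0.25),
    (0, 1): (0.70, 0.20, 0.10),
    (1, 0): (0.20, 0.25, 0.55),
    (1, 1): (0.45, 0.30, 0.25),
}

season = simulate_season(blocks, theta)
```

In a perfectly balanced league all rows of `theta` would coincide, so a single block would suffice; a two-tier league corresponds to between-block rows that differ markedly from the within-block rows. Inferring the number of blocks and the allocation from observed results is the harder inverse problem that the abstract addresses with MCMC.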
