Forecasting the number of Olympic medals for each nation is highly relevant for different stakeholders: Ex ante, sports betting companies can determine the odds while sponsors and media companies can allocate their resources to promising teams. Ex post, sports politicians and managers can benchmark the performance of their teams and evaluate the drivers of success. To significantly increase the accuracy of Olympic medal forecasts, we apply machine learning, more specifically a two-staged Random Forest, which for the first time outperforms the more traditional naïve forecast for the three previous Olympics held between 2008 and 2016. Regarding the Tokyo 2020 Games in 2021, our model suggests that the United States will lead the Olympic medal table, winning 120 medals, followed by China (87) and Great Britain (74). Intriguingly, we predict that the current COVID-19 pandemic will not significantly alter the medal count, both because all countries suffer from the pandemic to some extent (data inherent) and because historical data points on comparable diseases are limited (model inherent).