Frequentist statistical methods, such as hypothesis testing, are standard practice in papers that provide benchmark comparisons. Unfortunately, these methods are often misused, for example by not checking the assumptions of the statistical tests or by not controlling the family-wise error rate in multiple group comparisons, among several other problems. Bayesian Data Analysis (BDA) addresses many of these shortcomings, but it is not widely used in the analysis of empirical data in the evolutionary computing community. This paper makes three main contributions. First, we motivate the need for Bayesian data analysis and provide an overview of the topic. Second, we discuss the practical aspects of BDA needed to ensure that our models are valid and the results transparent. Finally, we provide five statistical models that can be used to answer multiple research questions. The online appendix provides a step-by-step guide to performing the analysis of the models discussed in this paper, including the code for the statistical models, the data transformations, and the discussed tables and figures.