Benchmarking is a key aspect of research into optimization algorithms, and as such the way in which the most popular benchmark suites are designed implicitly guides some parts of algorithm design. One of these suites is the black-box optimization benchmarking (BBOB) suite of 24 single-objective noiseless functions, which has been a standard for over a decade. Within this problem suite, different instances of a single problem can be created, which is beneficial for testing the stability and invariance of algorithms under transformations. In this paper, we investigate the BBOB instance creation protocol by considering a set of 500 instances for each BBOB problem. Using exploratory landscape analysis, we show that the distribution of landscape features across BBOB instances is highly diverse for a large set of problems. In addition, we run a set of eight algorithms across these 500 instances and investigate in which cases statistically significant differences in performance occur. We argue that, while the transformations applied in BBOB instances do indeed seem to preserve the high-level properties of the functions, their differences in practice should not be overlooked, particularly when treating the problems as box-constrained instead of unconstrained.
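To make the instance-creation idea concrete, the following minimal sketch illustrates the general mechanism by which problem instances differ: shifting the optimum location, offsetting the optimal value, and rotating the search space. This is a simplified toy illustration in numpy, not the official BBOB instance generator, and the names `make_instance` and `sphere` are hypothetical.

```python
import numpy as np

def sphere(z):
    """Base function: simple sphere, optimum at the origin with value 0."""
    return float(np.dot(z, z))

def make_instance(base_fn, dim, seed):
    """Create one toy 'instance' of base_fn by shift, value offset, and rotation."""
    rng = np.random.default_rng(seed)
    x_opt = rng.uniform(-4, 4, dim)           # shifted optimum location
    f_opt = rng.uniform(-100, 100)            # shifted optimal objective value
    # Random orthogonal matrix via QR decomposition of a Gaussian matrix.
    rotation, _ = np.linalg.qr(rng.standard_normal((dim, dim)))

    def instance(x):
        return base_fn(rotation @ (np.asarray(x) - x_opt)) + f_opt

    return instance, x_opt, f_opt

# Two instances of the same base problem share its high-level structure,
# but differ in optimum location, value offset, and orientation.
inst1, *_ = make_instance(sphere, dim=5, seed=1)
inst2, *_ = make_instance(sphere, dim=5, seed=2)
print(inst1(np.zeros(5)), inst2(np.zeros(5)))
```

Under such transformations the global structure of the function is preserved, yet, as the paper's experiments suggest, the resulting instances can still differ noticeably in landscape features and algorithm performance once box constraints are imposed.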