Non-parametric tests can determine the better of two stochastic optimization algorithms when benchmarking results are ordinal, such as the final fitness values of multiple trials. For many benchmarks, however, a trial can also terminate once it reaches a pre-specified target value. When only some trials reach the target value, two variables characterize a trial's outcome: the time it takes to reach the target value (if it does) and its final fitness value. This paper describes a simple way to impose a linear order on this two-variable trial data set so that traditional non-parametric methods can determine the better algorithm when neither dominates. We illustrate the method with the Mann-Whitney U-test. A simulation demonstrates that U-scores are much more effective than dominance at identifying the better of two algorithms. We test U-scores by having them determine the winners of the CEC 2022 Special Session and Competition on Real-Parameter Numerical Optimization.
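The abstract does not spell out the ordering rule, so the following Python sketch is only one plausible reading, not the paper's definitive procedure. It assumes minimization, that every trial that reaches the target outranks every trial that does not, that successful trials are ranked by the number of function evaluations used, and that unsuccessful trials are ranked by final fitness value. The names `Trial`, `order_key`, and `u_score`, and the sample data, are hypothetical and chosen for illustration.

```python
from typing import NamedTuple, Sequence, Tuple


class Trial(NamedTuple):
    reached_target: bool  # did the trial hit the pre-specified target value?
    evaluations: int      # function evaluations used (the "time" variable)
    final_fitness: float  # final fitness value (assumed: lower is better)


def order_key(trial: Trial) -> Tuple[int, float]:
    """Map a two-variable outcome to a single sortable key.

    Assumed ordering (not stated in the abstract): successful trials come
    first and are ranked by evaluations; failed trials follow and are
    ranked by final fitness. A smaller key means a better trial.
    """
    if trial.reached_target:
        return (0, trial.evaluations)
    return (1, trial.final_fitness)


def u_score(a: Sequence[Trial], b: Sequence[Trial]) -> float:
    """Mann-Whitney U statistic for sample a against sample b.

    Counts the pairs in which a's trial beats b's trial; ties count 0.5.
    """
    u = 0.0
    for x in a:
        key_x = order_key(x)
        for y in b:
            key_y = order_key(y)
            if key_x < key_y:
                u += 1.0
            elif key_x == key_y:
                u += 0.5
    return u


# Made-up trial data for two hypothetical algorithms.
alg_a = [Trial(True, 1200, 0.0), Trial(True, 1500, 0.0), Trial(False, 5000, 0.3)]
alg_b = [Trial(True, 1400, 0.0), Trial(False, 5000, 0.1), Trial(False, 5000, 0.7)]

u_a = u_score(alg_a, alg_b)
u_b = u_score(alg_b, alg_a)
print(u_a, u_b)
```

Under these assumptions the two U-scores always sum to the number of trial pairs, so the algorithm with the larger score wins more pairwise comparisons. Whether that difference is statistically significant would still need to be checked against the U-test's critical values.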