Estimation-of-distribution algorithms (EDAs) are optimization algorithms that learn a distribution on the search space from which good solutions can be sampled easily. A key parameter of most EDAs is the sample size (population size). If the population size is too small, the update of the probabilistic model builds on few samples, leading to the undesired effect of genetic drift. Too large population sizes avoid genetic drift, but slow down the process. Building on a recent quantitative analysis of how the population size leads to genetic drift, we design a smart-restart mechanism for EDAs. By stopping runs when the risk for genetic drift is high, it automatically runs the EDA in good parameter regimes. Via a mathematical runtime analysis, we prove a general performance guarantee for this smart-restart scheme. This in particular shows that in many situations where the optimal (problem-specific) parameter values are known, the restart scheme automatically finds these, leading to the asymptotically optimal performance. We also conduct an extensive experimental analysis. On four classic benchmark problems, we clearly observe the critical influence of the population size on the performance, and we find that the smart-restart scheme leads to a performance close to the one obtainable with optimal parameter values. Our results also show that previous theory-based suggestions for the optimal population size can be far from the optimal ones, leading to a performance clearly inferior to the one obtained via the smart-restart scheme. We also conduct experiments with PBIL (cross-entropy algorithm) on two combinatorial optimization problems from the literature, the max-cut problem and the bipartition problem. Again, we observe that the smart-restart mechanism finds much better values for the population size than those suggested in the literature, leading to a much better performance.
翻译:估计分布算法(EDAs)是一种优化算法,它从搜索空间中学习分布,从中可以容易地采样到优秀的解决方案。大多数EDAs的关键参数是样本大小(种群大小)。如果种群大小过小,则概率模型的更新基于少量样本,导致遗传漂移现象。种群大小过大则可以避免遗传漂移,但会减缓进程。在最近对种群大小如何导致遗传漂移进行的定量分析的基础上,我们设计了一种EDAs的智能重启机制。当遗传漂移风险较高时,通过停止运行,它可以自动在良好的参数区域运行EDA。通过数学运行时间分析,我们证明了智能重启方案的普适性能保证。这特别表明,在许多已知最佳(问题特定)参数值的情况下,重启方案可以自动找到这些值,从而达到渐近最优的性能。我们还进行了广泛的实验分析。在四个经典基准问题上,我们清楚地观察到种群大小对性能的关键影响,并发现智能重启机制可以实现接近最优参数值的性能。我们的结果还表明,以前基于理论的关于最佳种群大小的建议可能远非最佳,导致性能明显低于通过智能重启方案获得的性能。我们还在文献中的两个组合优化问题(最大割问题和二分割问题)上使用PBIL(交叉熵算法)进行实验。同样,我们观察到智能重启机制可以找到比文献中建议的还要好得多的种群大小值,从而得到更好的性能。