One of the key difficulties in using estimation-of-distribution algorithms is choosing the population size(s) appropriately: Too small values lead to genetic drift, which can cause enormous difficulties. In the regime with no genetic drift, however, often the runtime is roughly proportional to the population size, which renders large population sizes inefficient. Based on a recent quantitative analysis which population sizes lead to genetic drift, we propose a parameter-less version of the compact genetic algorithm that automatically finds a suitable population size without spending too much time in situations unfavorable due to genetic drift. We prove a mathematical runtime guarantee for this algorithm and conduct an extensive experimental analysis on four classic benchmark problems both without and with additive centered Gaussian posterior noise. The former shows that under a natural assumption, our algorithm has a performance very similar to the one obtainable from the best problem-specific population size. The latter confirms that missing the right population size in the original cGA can be detrimental and that previous theory-based suggestions for the population size can be far away from the right values; it also shows that our algorithm as well as a previously proposed parameter-less variant of the cGA based on parallel runs avoid such pitfalls. Comparing the two parameter-less approaches, ours profits from its ability to abort runs which are likely to be stuck in a genetic drift situation.
翻译:使用分布算法的关键困难之一是适当选择人口规模:太小的数值会导致基因漂移,这会造成巨大的困难。但是,在没有基因漂移的制度中,运行时间往往与人口规模大致成正比,这使得大量人口规模效率低下。根据最近对人口规模导致基因漂移的定量分析,我们提议了一个没有参数的缩略基因算法版本,自动发现合适的人口规模,而不会在基因漂移导致的不利情况下花费太多时间。我们证明这种算法是一个数学运行时间的保证,对四种典型的基准问题进行广泛的实验分析,既没有,也没有添加了高山海后海象噪音。前者表明,在自然假设下,我们的算法的性能与从最佳问题特定人口规模中获得的功能非常相似。后者证实,原始基因组的正确人口规模可能有害,而先前对人口规模提出的理论性建议可能远远远离正确的价值;它还表明,我们的算法以及先前提出的两个没有参数的变异种,即从原先的基因变种法可能与我们的基因变变法相平行地运行,从而避免了我们的基因变轨迹。