We present an analysis of the loss of population-level test coverage induced by different down-sampling strategies when combined with lexicase selection. We study recorded populations from the first generation of genetic programming runs, as well as entirely synthetic populations. Our findings verify the hypothesis that informed down-sampling maintains population-level test coverage better than random down-sampling. Additionally, we show that both forms of down-sampling cause greater test coverage loss than standard lexicase selection with no down-sampling. However, we find that informed down-sampling can further reduce its test coverage loss when given more information about the population. We also recommend wider adoption of the static population analyses we present in this work.
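To make the quantity under study concrete, the following is a minimal Python sketch of how population-level test coverage loss might be measured on a static population. It is an illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, test outcomes are assumed to be binary pass/fail, coverage is taken to be the number of test cases passed by at least one individual, and informed down-sampling is approximated by a farthest-first traversal over Hamming distances between test-case pass vectors, which is one common formulation rather than something this abstract specifies.

```python
import random


def coverage(passes, individuals):
    """Number of test cases passed by at least one of the given individuals.
    passes[i][c] is True when individual i passes test case c (assumed binary)."""
    n_cases = len(passes[0])
    return sum(any(passes[i][c] for i in individuals) for c in range(n_cases))


def lexicase_select(passes, cases, rng):
    """Select one individual index by lexicase selection over `cases`."""
    pool = list(range(len(passes)))
    order = list(cases)
    rng.shuffle(order)
    for c in order:
        passing = [i for i in pool if passes[i][c]]
        if passing:  # if nobody in the pool passes, everyone ties and stays
            pool = passing
        if len(pool) == 1:
            break
    return rng.choice(pool)


def random_down_sample(passes, k, rng):
    """Uniformly random subset of k test-case indices."""
    return rng.sample(range(len(passes[0])), k)


def informed_down_sample(passes, k, rng):
    """Assumed formulation: farthest-first traversal over test cases, using the
    Hamming distance between their pass vectors across the population."""
    n_cases = len(passes[0])

    def dist(a, b):
        return sum(row[a] != row[b] for row in passes)

    chosen = [rng.randrange(n_cases)]
    while len(chosen) < k:
        remaining = [c for c in range(n_cases) if c not in chosen]
        rng.shuffle(remaining)  # random tie-breaking among equally far cases
        chosen.append(max(remaining, key=lambda c: min(dist(c, s) for s in chosen)))
    return chosen


def coverage_loss(passes, cases, n_parents, rng):
    """Coverage of the whole population minus coverage of the selected parents."""
    parents = {lexicase_select(passes, cases, rng) for _ in range(n_parents)}
    return coverage(passes, range(len(passes))) - coverage(passes, parents)


# Hypothetical synthetic population: 50 individuals, 20 cases, sparse passes.
rng = random.Random(0)
population = [[rng.random() < 0.1 for _ in range(20)] for _ in range(50)]
all_cases = list(range(20))
print("no down-sampling:", coverage_loss(population, all_cases, 50, rng))
print("random:", coverage_loss(population, random_down_sample(population, 5, rng), 50, rng))
print("informed:", coverage_loss(population, informed_down_sample(population, 5, rng), 50, rng))
```

A single run like this is noisy; comparing strategies meaningfully would require averaging the loss over many sampled populations and selection events.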