Genetic Programming (GP) often uses large training sets and requires all individuals to be evaluated on all training cases during selection. Random down-sampled lexicase selection evaluates individuals on only a random subset of the training cases, allowing more individuals to be explored for the same number of program executions. However, creating a down-sample randomly might exclude important cases from the down-sample for a number of generations, while cases that measure the same behavior (synonymous cases) may be overused despite their redundancy. In this work, we introduce Informed Down-Sampled Lexicase Selection. This method leverages population statistics to build down-samples that contain more distinct and therefore informative training cases. Through an empirical investigation across two different GP systems (PushGP and Grammar-Guided GP), we find that informed down-sampling significantly outperforms random down-sampling on a set of contemporary program synthesis benchmark problems. Through an analysis of the created down-samples, we find that important training cases are consistently included in the down-sample across independent evolutionary runs and systems. We hypothesize that this improvement can be attributed to the ability of Informed Down-Sampled Lexicase Selection to maintain more specialist individuals over the course of evolution, while also benefiting from reduced per-evaluation costs.
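To make the baseline concrete, the following is a minimal sketch of lexicase selection combined with random down-sampling, as described above. All function and variable names are illustrative (not from the paper's implementation), and the error matrix is assumed to store each individual's error on each training case:

```python
import random

def random_down_sample(num_cases, rate):
    """Random down-sampling: keep a fraction `rate` of the training cases.
    Note that informative cases may be left out of the sample by chance."""
    k = max(1, int(num_cases * rate))
    return random.sample(range(num_cases), k)

def lexicase_select(population, errors, case_indices):
    """Lexicase selection restricted to the cases in `case_indices`.
    `errors[i][c]` is individual i's error on training case c.
    Cases are considered in random order; at each case, only candidates
    with the best (lowest) error on that case survive."""
    candidates = list(range(len(population)))
    cases = list(case_indices)
    random.shuffle(cases)
    for c in cases:
        best = min(errors[i][c] for i in candidates)
        candidates = [i for i in candidates if errors[i][c] == best]
        if len(candidates) == 1:
            break
    return population[random.choice(candidates)]
```

Under down-sampling, only the sampled cases need to be evaluated each generation, which is the source of the per-evaluation savings; informed down-sampling replaces `random_down_sample` with a sampler that uses population statistics to prefer distinct (non-synonymous) cases.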