We develop BenchPress, the first ML benchmark generator for compilers that is steerable within feature space representations of source code. BenchPress synthesizes compiling functions by adding new code in any part of an empty or existing sequence while jointly observing its left and right context, achieving an excellent compilation rate. BenchPress steers benchmark generation towards desired target features that have been impossible for state-of-the-art synthesizers (or indeed humans) to reach. It targets the features of Rodinia benchmarks in 3 different feature spaces better than (a) CLgen, a state-of-the-art ML synthesizer, (b) the CLSmith fuzzer, (c) the SRCIROR mutator, or even (d) human-written code from GitHub. BenchPress is the first generator to search the feature space with active learning in order to generate benchmarks that will improve a downstream task. We show that Grewe et al.'s CPU vs GPU heuristic model obtains a higher speedup when trained on BenchPress's benchmarks than when trained on benchmarks from other techniques. BenchPress is a powerful code generator: its generated samples compile at a rate of 86%, compared to CLgen's 2.33%. Starting from an empty fixed input, BenchPress produces 10x more unique, compiling OpenCL benchmarks than CLgen, and these benchmarks are significantly larger and more feature-diverse.
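To make the idea of steering towards target features concrete, the following is a minimal sketch of one way candidate benchmarks could be ranked against a target point in a feature space. It is an illustration under stated assumptions, not BenchPress's implementation: the helper names `distance` and `steer`, the three-dimensional feature vectors, the kernel names, and the choice of Euclidean distance are all hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def steer(candidates, target, keep=2):
    """Return the names of the `keep` candidates closest to `target`.

    `candidates` maps a (hypothetical) benchmark name to its feature vector.
    """
    ranked = sorted(candidates, key=lambda name: distance(candidates[name], target))
    return ranked[:keep]


# Hypothetical feature vectors: (compute ops, memory ops, branches).
candidates = {
    "kernel_a": (10, 4, 1),
    "kernel_b": (52, 30, 6),
    "kernel_c": (48, 25, 5),
}
target = (50, 28, 6)  # assumed features of a target benchmark

closest = steer(candidates, target)
print(closest)
```

In a generate-and-select loop, the retained candidates would serve as the starting sequences for the next round of generation, so that the population moves gradually towards the target features.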