Evolutionary Grammar-Based Fuzzing

A fuzzer feeds randomly generated inputs to a target program to expose erroneous behavior. To detect defects efficiently, generated inputs should conform to the structure of the input format, so grammars can be used to generate syntactically correct inputs. In this context, fuzzing can be guided by probabilities attached to competing rules in the grammar, leading to the idea of probabilistic grammar-based fuzzing. However, how to best assign probabilities to individual grammar rules so as to effectively expose erroneous behavior in a particular system under test is an open research question. In this paper, we present EvoGFuzz, an evolutionary grammar-based fuzzing approach that optimizes these probabilities so that the generated test inputs are more likely to trigger exceptional behavior. The evaluation shows the effectiveness of EvoGFuzz in detecting defects compared to probabilistic grammar-based fuzzing (the baseline). Applied to ten real-world applications with common input formats (JSON, JavaScript, or CSS3), EvoGFuzz achieved a significantly higher median line coverage than the baseline for all subjects, by up to 48%. Moreover, EvoGFuzz exposed 11 unique defects, five of which were not detected by the baseline.
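To make the idea of evolving rule probabilities concrete, the following is a minimal Python sketch of probabilistic grammar-based generation combined with a simple evolutionary loop. It is not EvoGFuzz's implementation: the toy grammar, the `toy_target` stand-in for a system under test, the failure-rate fitness, and the Gaussian mutation with elitist selection are all assumptions chosen purely for illustration.

```python
import random

# Toy grammar for nested, JSON-like arrays of digits (illustrative assumption).
GRAMMAR = {
    "<value>": [["<digit>"], ["[", "<items>", "]"]],
    "<items>": [["<value>"], ["<value>", ",", "<items>"]],
    "<digit>": [["0"], ["1"], ["7"]],
}


def uniform_probs(grammar):
    """Give every alternative of each rule the same probability."""
    return {nt: [1.0 / len(alts)] * len(alts) for nt, alts in grammar.items()}


def generate(grammar, probs, rng, symbol="<value>", depth=0, max_depth=8):
    """Expand `symbol`, picking alternatives according to `probs`."""
    if symbol not in grammar:
        return symbol  # terminal symbol
    alts = grammar[symbol]
    if depth >= max_depth:
        choice = alts[0]  # first alternative is non-recursive: forces termination
    else:
        choice = rng.choices(alts, weights=probs[symbol], k=1)[0]
    return "".join(
        generate(grammar, probs, rng, s, depth + 1, max_depth) for s in choice
    )


def toy_target(text):
    """Hypothetical system under test: fails on nesting deeper than 3."""
    depth = 0
    for ch in text:
        if ch == "[":
            depth += 1
            if depth > 3:
                raise ValueError("nesting too deep")
        elif ch == "]":
            depth -= 1


def fitness(probs, rng, samples=30):
    """Fraction of generated inputs that make the toy target raise."""
    failures = 0
    for _ in range(samples):
        try:
            toy_target(generate(GRAMMAR, probs, rng))
        except ValueError:
            failures += 1
    return failures / samples


def mutate(probs, rng, sigma=0.2):
    """Perturb each probability with Gaussian noise, then renormalize."""
    mutated = {}
    for nt, ps in probs.items():
        noisy = [max(1e-3, p + rng.gauss(0.0, sigma)) for p in ps]
        total = sum(noisy)
        mutated[nt] = [p / total for p in noisy]
    return mutated


def evolve(generations=10, population=8, seed=0):
    """Keep the fitter half of each generation and refill it with mutants."""
    rng = random.Random(seed)
    pool = [uniform_probs(GRAMMAR)]
    pool += [mutate(pool[0], rng) for _ in range(population - 1)]
    best = pool[0]
    for _ in range(generations):
        scored = sorted(pool, key=lambda p: fitness(p, rng), reverse=True)
        elite = scored[: max(1, population // 2)]
        best = elite[0]
        pool = elite + [
            mutate(rng.choice(elite), rng) for _ in range(population - len(elite))
        ]
    return best
```

Calling `evolve()` returns a probability assignment for every rule in `GRAMMAR`; passing it to `generate` then samples inputs under the evolved distribution. In this toy setting the search tends to shift weight toward the recursive alternatives, since those are the ones that reach the deep nesting the hypothetical target rejects. A real fuzzer would replace `toy_target` and `fitness` with execution of the actual program and a feedback signal of its own choosing.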
