Stateful Greybox Fuzzing

Many protocol implementations are reactive systems, in which the protocol process interacts continuously with other processes and the environment. If a bug can be exposed only in a certain state, a fuzzer must supply a specific sequence of events as input that takes the protocol into that state before the bug manifests. We call these "stateful" bugs. When testing a protocol implementation, we usually have no detailed formal specification of the protocol to rely upon, and without knowledge of the protocol it is inherently difficult for a fuzzer to discover such stateful bugs. A key challenge, then, is to cover the state space without an explicit specification of the protocol. In this work, we posit that manual annotations for state identification can be avoided in stateful protocol fuzzing. Specifically, we rely on a programmatic intuition: the state variables used in protocol implementations often appear as enum-type variables whose values (the state names) come from named constants. In our analysis of the top 50 most widely used open-source protocol implementations, we found that every implementation uses state variables that are assigned named constants (with easy-to-comprehend names such as INIT and READY) to represent the current state. We propose to automatically identify such state variables and to track the sequence of values assigned to them during fuzzing, producing a "map" of the explored state space. Our experiments confirm that our stateful fuzzer discovers stateful bugs twice as fast as the baseline greybox fuzzer that we extended. Starting from the initial state, our fuzzer exercises an order of magnitude more state/transition sequences and covers code twice as fast as the baseline fuzzer. Our fuzzer found several zero-day bugs in prominent protocol implementations, and 8 CVEs have been assigned.
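To make the idea of tracking a state variable concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy server with a single enum-typed state variable; the names `State`, `StateMap`, `ToyServer`, and `run`, the `CLOSED` state, and the `HELLO`/`BYE` messages are all hypothetical. Every assignment to the state variable is logged, and an input counts as interesting when its sequence of state values, or any transition in it, has not been seen before.

```python
from enum import Enum


class State(Enum):
    # Hypothetical protocol states; INIT and READY echo the names in the text.
    INIT = 0
    READY = 1
    CLOSED = 2


class StateMap:
    """Records which state-value sequences and transitions were observed."""

    def __init__(self):
        self.transitions = set()  # (from_state, to_state) pairs
        self.sequences = set()    # full sequences of assigned values

    def record(self, trace):
        """Add one execution trace; return True if it revealed anything new."""
        trace = tuple(trace)
        edges = set(zip(trace, trace[1:]))
        is_new = trace not in self.sequences or not edges <= self.transitions
        self.sequences.add(trace)
        self.transitions |= edges
        return is_new


class ToyServer:
    """Toy stand-in for a protocol implementation with one state variable."""

    def __init__(self):
        self.trace = []
        self._set_state(State.INIT)

    def _set_state(self, new_state):
        # Assumption: a real fuzzer would inject this logging hook through
        # instrumentation rather than rely on a hand-written setter.
        self.state = new_state
        self.trace.append(new_state)

    def handle(self, message):
        if self.state is State.INIT and message == "HELLO":
            self._set_state(State.READY)
        elif self.state is State.READY and message == "BYE":
            self._set_state(State.CLOSED)


def run(messages):
    """Feed a message sequence to a fresh server and return its state trace."""
    server = ToyServer()
    for message in messages:
        server.handle(message)
    return server.trace


state_map = StateMap()
first = state_map.record(run(["HELLO"]))          # new sequence INIT -> READY
repeat = state_map.record(run(["HELLO"]))         # nothing new
deeper = state_map.record(run(["HELLO", "BYE"]))  # reaches CLOSED
```

In this sketch, a fuzzer loop would keep the inputs for which `record` returns `True` as seeds, so that later mutations start from message sequences that already reach deeper states.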
