Fuzzing is one of the most effective approaches to finding software flaws. However, applying it to microcontroller firmware poses many challenges. For example, rehosting-based solutions cannot accurately model peripheral behaviors and thus cannot be used to fuzz the corresponding driver code. In this work, we present $\mu$AFL, a hardware-in-the-loop approach to fuzzing microcontroller firmware. It leverages debugging tools already used in embedded system development to construct an AFL-compatible fuzzing framework. Specifically, we use the debug dongle to bridge the fuzzing environment on the PC and the target firmware on the microcontroller device. To collect code coverage information without costly code instrumentation, $\mu$AFL relies on the ARM ETM hardware debugging feature, which transparently collects the instruction trace and streams the results to the PC. However, the raw ETM data is opaque, and recovering the actual instruction flow from it requires enormous computing resources. We therefore propose an alternative representation of code coverage, which retains the same path sensitivity as the original AFL algorithm but works directly on the raw ETM data without matching it against disassembled instructions. To further reduce the workload, we use the DWT hardware feature to selectively collect runtime information of interest. We evaluated $\mu$AFL on two real evaluation boards from two major vendors: NXP and STMicroelectronics. With our prototype, we discovered ten zero-day bugs in the driver code shipped with the SDK of STMicroelectronics and three zero-day bugs in the SDK of NXP. Eight CVEs have been assigned to these bugs. Considering the wide adoption of vendor SDKs in real products, our results are alarming.
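To make the coverage idea concrete, the following is a minimal sketch of AFL-style edge hashing applied to opaque trace tokens instead of disassembled basic blocks. It is not the $\mu$AFL implementation: the token format, the CRC32-based `token_id`, and the function names are all assumptions chosen for illustration, and real ETM packet parsing is omitted entirely. Only the `cur ^ prev` indexing with a right shift of the previous ID follows the scheme AFL documents for its bitmap.

```python
import zlib

MAP_SIZE = 1 << 16  # AFL's default bitmap size (64 KiB)


def token_id(token: bytes) -> int:
    """Map an opaque trace token to an ID in [0, MAP_SIZE).

    Assumption: CRC32 stands in for whatever hash a real tool would use.
    """
    return zlib.crc32(token) % MAP_SIZE


def update_bitmap(tokens, bitmap=None):
    """Fold a sequence of trace tokens into an AFL-style hit-count bitmap.

    Each consecutive pair of tokens is treated as an "edge". The previous
    ID is shifted right by one so that A->B and B->A map to different
    slots, which is what keeps the representation order sensitive.
    """
    if bitmap is None:
        bitmap = bytearray(MAP_SIZE)
    prev = 0
    for tok in tokens:
        cur = token_id(tok)
        idx = cur ^ prev
        bitmap[idx] = (bitmap[idx] + 1) & 0xFF  # 8-bit wrapping counter
        prev = cur >> 1
    return bitmap


def covered(bitmap):
    """Return the set of bitmap slots that were hit at least once."""
    return {i for i, hits in enumerate(bitmap) if hits}


if __name__ == "__main__":
    # Hypothetical tokens; in practice these would be chunks of raw trace data.
    trace = [b"pkt-A", b"pkt-B", b"pkt-C"]
    print(len(covered(update_bitmap(trace))), "slots hit")
```

The point of the sketch is that nothing in `update_bitmap` needs to know which instruction a token corresponds to; it only needs tokens that are stable across runs of the same path.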