Processor designs evolve through iterative modification and the reuse of well-established designs. This reuse, however, also carries similar vulnerabilities across multiple processors. As processors grow more complex with each iteration, efficiently detecting vulnerabilities in modern processors is critical. Inspired by software fuzzing, hardware fuzzing has recently proven effective at detecting processor vulnerabilities. Yet, to the best of our knowledge, existing processor fuzzers fuzz each design in isolation: they cannot draw on known vulnerabilities in prior processors to steer fuzzing toward similar or new variants of those vulnerabilities. To address this gap, we present ReFuzz, an adaptive fuzzing framework that uses a contextual bandit to reuse highly effective tests from prior processors when fuzzing a processor-under-test (PUT) within a given ISA. By intelligently mutating tests that trigger vulnerabilities in prior processors, ReFuzz effectively detects similar and new variants of vulnerabilities in PUTs. ReFuzz uncovered three new security vulnerabilities and two new functional bugs. It detected one vulnerability by reusing a test that triggers a known vulnerability in a prior processor. One functional bug exists across three processors that share design modules; the second has two variants. In addition, by reusing highly effective tests, ReFuzz reaches coverage more efficiently, achieving an average 511.23x coverage speedup and up to 9.33% more total coverage than existing fuzzers.
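To make the idea of bandit-guided test reuse concrete, the following is a minimal illustrative sketch, not ReFuzz's implementation. The abstract does not specify the bandit algorithm, the context features, or the reward, so everything here is an assumption: a tabular epsilon-greedy policy, a single hypothetical context label (`put_low_coverage`), three hypothetical seed tests (`seed_a`, `seed_b`, `seed_c`) standing in for tests inherited from prior processors, and a simulated reward standing in for newly covered points on the PUT.

```python
import random
from collections import defaultdict


class EpsilonGreedyContextualBandit:
    """Tabular contextual bandit: one running-mean reward per (context, arm).

    Assumption: epsilon-greedy is used purely for illustration; the source
    does not say which contextual-bandit algorithm ReFuzz uses.
    """

    def __init__(self, arms, epsilon=0.1, seed=0):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)   # (context, arm) -> number of pulls
        self.means = defaultdict(float)  # (context, arm) -> mean observed reward

    def select(self, context):
        # Try every arm once in this context before trusting the estimates.
        for arm in self.arms:
            if self.counts[(context, arm)] == 0:
                return arm
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        return max(self.arms, key=lambda arm: self.means[(context, arm)])

    def update(self, context, arm, reward):
        key = (context, arm)
        self.counts[key] += 1
        # Incremental update of the running mean.
        self.means[key] += (reward - self.means[key]) / self.counts[key]


def simulated_new_coverage(arm, rng):
    """Hypothetical stand-in for mutating a seed test and running it on the PUT.

    Returns a noisy, non-negative count of newly covered points.
    """
    expected = {"seed_a": 1.0, "seed_b": 5.0, "seed_c": 2.0}
    return max(0.0, rng.gauss(expected[arm], 0.5))


def run_campaign(rounds=500, seed=0):
    rng = random.Random(seed)
    bandit = EpsilonGreedyContextualBandit(
        ["seed_a", "seed_b", "seed_c"], epsilon=0.1, seed=seed
    )
    context = "put_low_coverage"  # hypothetical context label
    picks = defaultdict(int)
    for _ in range(rounds):
        arm = bandit.select(context)
        reward = simulated_new_coverage(arm, rng)
        bandit.update(context, arm, reward)
        picks[arm] += 1
    return bandit, dict(picks)


if __name__ == "__main__":
    bandit, picks = run_campaign()
    print(picks)
```

In this toy setting the policy concentrates its budget on the seed test that yields the most new coverage while still occasionally sampling the others. A real fuzzer would replace the simulated reward with coverage feedback from RTL simulation and would use more than one context.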