CPUs are becoming more complex with every generation, at both the logical and the physical levels. This potentially leads to more logic bugs and electrical defects being overlooked during testing, which can cause data corruption or other undesirable effects when these CPUs are used in production. These ever-present problems may also have simply become more evident as more CPUs are operated and monitored by large cloud providers. If the RTL ("source code") of a CPU were available, we could apply greybox fuzzing to the CPU model almost as we do to any other software [arXiv:2102.02308]. However, our targets are general-purpose x86_64 CPUs produced by third parties, for which we do not have the RTL design, so in our case the CPU implementations are opaque. Moreover, we are more interested in electrical defects than in logic bugs. We present SiliFuzz, a work-in-progress system that finds CPU defects by fuzzing software proxies, such as CPU simulators or disassemblers, and then executing the accumulated test inputs (known as the corpus) on actual CPUs on a large scale. The major difference between this work and traditional software fuzzing is that a software bug fixed once is fixed for all installations of the software, whereas for CPU defects we have to test every individual core repeatedly over its lifetime because of wear and tear. In this paper we also analyze four groups of CPU defects that SiliFuzz has uncovered and describe patterns shared by other SiliFuzz findings.