When a software transformation or software security task needs to analyze a given program binary, the first step is often disassembly. Because many modern disassemblers have become highly accurate on many binaries, we believe reliable disassembler benchmarking requires standardizing both the set of binaries used and the disassembly ground truth for those binaries. This paper presents (i) a first version of our work-in-progress disassembly benchmark suite, which comprises 879 binaries from diverse projects compiled with multiple compilers and optimization settings, and (ii) a novel disassembly ground truth generator that leverages "listing files", which are broadly supported by Clang, GCC, ICC, and MSVC. In addition, it presents our evaluation of four prominent open-source disassemblers using this benchmark suite and a custom evaluation system. Our entire system and all generated data are maintained openly on GitHub to encourage community adoption.