Binary code similarity detection (BCSD) measures the similarity between two pieces of binary executable code. Recently, learning-based BCSD methods have achieved great success, outperforming traditional BCSD methods in both detection accuracy and efficiency. However, few studies have examined the adversarial vulnerability of learning-based BCSD methods, and this vulnerability poses hazards in security-related applications. To evaluate their adversarial robustness, this paper designs an efficient black-box adversarial code generation algorithm named FuncFooler. FuncFooler constrains the adversarial code 1) to leave the program's control flow graph (CFG) unchanged, and 2) to preserve the program's semantics. Specifically, FuncFooler successively 1) determines vulnerable candidates in the malicious code, 2) chooses adversarial instructions from the benign code and inserts them, and 3) corrects the semantic side effects of the adversarial code to meet the constraints. Empirically, FuncFooler successfully attacks three learning-based BCSD models, SAFE, Asm2Vec, and jTrans, which calls into question whether learning-based BCSD is desirable.
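The three steps can be illustrated with a toy sketch. This is not the authors' implementation. The opcode-count cosine scorer stands in for a real BCSD model. The `CONTROL_FLOW` filter, the push/pop wrapper used as the "correction" step, and all function names are assumptions made for illustration. The wrapper only handles instructions whose first operand is a 64-bit register.

```python
from collections import Counter
from math import sqrt

# Hypothetical filter: opcodes that would alter the CFG are never inserted.
CONTROL_FLOW = {"jmp", "je", "jne", "call", "ret"}

def similarity(a, b):
    """Toy black-box scorer (assumption): cosine similarity of opcode counts."""
    ca = Counter(i.split()[0] for i in a)
    cb = Counter(i.split()[0] for i in b)
    dot = sum(ca[k] * cb[k] for k in ca)
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0

def rank_positions(func, reference, score):
    """Step 1: rank positions by the similarity drop when that instruction is masked."""
    base = score(func, reference)
    drops = [(base - score(func[:i] + func[i + 1:], reference), i)
             for i in range(len(func))]
    return [i for _, i in sorted(drops, reverse=True)]

def wrap_preserving(instr):
    """Step 3 (simplified assumption): save and restore flags and the written
    register so the inserted instruction has no net effect on program state."""
    reg = instr.split()[1].rstrip(",")
    return ["pushfq", f"push {reg}", instr, f"pop {reg}", "popfq"]

def attack(target, reference, benign, score=similarity, threshold=0.5, budget=3):
    """Greedy query-only loop: insert wrapped benign instructions until the
    score falls below `threshold` or the insertion budget is used up."""
    adv = list(target)
    candidates = [i for i in benign if i.split()[0] not in CONTROL_FLOW]
    for _ in range(budget):
        current = score(adv, reference)
        if current < threshold or not candidates:
            break
        pos = rank_positions(adv, reference, score)[0]
        # Step 2: pick the benign instruction that lowers the score the most.
        best = min((adv[:pos] + wrap_preserving(c) + adv[pos:] for c in candidates),
                   key=lambda f: score(f, reference))
        if score(best, reference) >= current:
            break
        adv = best
    return adv

target = ["mov rax, 1", "add rax, 2", "ret"]
benign = ["xor rbx, rbx", "lea rcx, [rcx+4]", "jmp L1"]
adv = attack(target, target, benign)
print(similarity(target, target), similarity(adv, target))
```

Because only straight-line instructions are inserted inside an existing basic block, the toy keeps the block structure intact. A real attack would query an actual model such as SAFE, Asm2Vec, or jTrans in place of `similarity`.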