Automatically detecting software vulnerabilities in source code is an important problem that has attracted much attention. In particular, deep learning-based vulnerability detectors, or DL-based detectors, are attractive because they do not need human experts to define features or patterns of vulnerabilities. However, the robustness of such detectors is unclear. In this paper, we initiate the study of this question by demonstrating that DL-based detectors are not robust against simple code transformations, which we call attacks because they may be leveraged for malicious purposes. As a first step towards making DL-based detectors robust against such attacks, we propose an innovative framework, dubbed ZigZag, which is centered on (i) decoupling feature learning and classifier learning and (ii) using a ZigZag-style strategy to iteratively refine them until they converge to robust features and robust classifiers. Experimental results show that the ZigZag framework can substantially improve the robustness of DL-based detectors.