Despite extensive testing and correctness certification of their functional semantics, a number of compiler optimizations have been shown to violate security guarantees implemented in source code. While prior work has shed light on how such optimizations may introduce semantic security weaknesses into programs, there remains a significant knowledge gap concerning the impacts of compiler optimizations on non-semantic properties with security implications. In particular, little is currently known about how code generation and optimization decisions made by the compiler affect the availability and utility of reusable code segments, called gadgets, that are required for implementing code reuse attack methods such as return-oriented programming. In this paper, we bridge this gap through a study of the impacts of compiler optimization on code reuse gadget populations. We analyze and compare 1,187 variants of 20 different benchmark programs built with two production compilers (GCC and Clang) to determine how compiler optimization behaviors affect the code reuse gadget sets present in program variants with respect to both quantitative and qualitative metrics. Our study exposes an important and unexpected problem: compiler optimizations introduce new gadgets at a high rate and produce optimized code containing gadget sets that are generally more useful to an attacker than those in unoptimized code. Using differential binary analysis, we identify several undesirable optimization behaviors at the root of this phenomenon. In turn, we propose and evaluate several strategies to mitigate these behaviors. In particular, we show that post-production binary recompilation passes can effectively mitigate these behaviors with negligible performance impacts, resulting in optimized code with significantly smaller and less useful gadget sets.