Tools focused on cryptographic API misuse often detect the most basic expressions of the vulnerable use, and are unable to detect non-trivial variants. The question of whether tools should be designed to detect such variants can only be answered if we know how developers use and misuse cryptographic APIs in the wild, and in particular, what the unnatural usage of such APIs looks like. This paper presents the first large-scale study that characterizes unnatural crypto-API usage through a qualitative analysis of 5,704 representative API invocations. We develop an intuitive complexity metric to stratify 140,431 crypto-API invocations obtained from 20,508 Android applications, allowing us to sample 5,704 invocations that are representative of all strata, with each stratum consisting of invocations with similar complexity/naturalness. We qualitatively analyze the 5,704 sampled invocations using manual reverse engineering, through an in-depth investigation that involves the development of minimal examples and exploration of native code. Our study results in two detailed taxonomies of unnatural crypto-API misuse, along with 17 key findings that show the presence of highly unusual misuse, evasive code, and the inability of popular tools to reason about even mildly unconventional usage. Our findings lead to four key takeaways that inform future work focused on detecting unnatural crypto-API misuse.
翻译:暂无翻译