Kotlin is a relatively new programming language from JetBrains: its development started in 2010, and version 1.0 was released in early 2016. The Kotlin compiler, while steadily maturing, still crashes from time to time on trickier input programs, not least because of the complexity of its features and their interactions. This makes it a good target for fuzzing, and even basic forms of fuzzing can find a significant number of Kotlin compiler crashes. Fuzzing has a problem, however, that is closely related to the cause of the crashes: generating a random, non-trivial, and semantically valid Kotlin program is hard. In this paper, we present type-centric compiler fuzzing in the form of type-centric enumeration, an approach inspired by skeletal program enumeration and based on a combination of generative and mutation-based fuzzing, which addresses this problem by focusing on program types. After creating a skeleton program, we fill its typed holes with fragments of a suitable type, which are created via generation and enhanced by semantics-aware mutation. We implemented this approach in our Kotlin compiler fuzzing framework, Backend Bug Finder (BBF), and performed an extensive evaluation, both testing the real-world feasibility of our approach and comparing it to other compiler fuzzing techniques. The results show that our approach is significantly better than other fuzzing approaches at generating semantically valid Kotlin programs, while at the same time creating more interesting crash-inducing inputs. We found more than 50 previously unknown compiler crashes, of which 18 were considered important by the compiler team after triage.
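To make the idea of filling typed holes concrete, the following is a minimal sketch of the enumeration step only. It is not BBF's implementation: the skeleton, the hole names (`h0`, `h1`), and the fragment pool are all hypothetical, the hole types are assumed to be known in advance, and the semantics-aware mutation stage is omitted. The script simply builds Kotlin source text by substituting every type-compatible fragment into every hole.

```python
from itertools import product

# Hypothetical skeleton: a Kotlin program whose typed holes are written as {name}.
# Doubled braces are literal braces in the generated Kotlin source.
SKELETON = """fun main() {{
    val a: Int = {h0}
    val b: String = {h1}
    println(b.length + a)
}}"""

# Assumption: the type required by each hole is already known.
HOLE_TYPES = {"h0": "Int", "h1": "String"}

# Hypothetical fragment pool, indexed by the Kotlin type of each expression.
FRAGMENTS = {
    "Int": ["0", "listOf(1, 2).size", "\"abc\".length"],
    "String": ["\"\"", "1.toString()"],
}


def enumerate_programs(skeleton, hole_types, fragments):
    """Yield each program obtained by filling every hole with a fragment of its type."""
    names = sorted(hole_types)
    choices = [fragments[hole_types[name]] for name in names]
    for combo in product(*choices):
        yield skeleton.format(**dict(zip(names, combo)))


programs = list(enumerate_programs(SKELETON, HOLE_TYPES, FRAGMENTS))

if __name__ == "__main__":
    for program in programs:
        print(program)
        print("-" * 20)
```

Because a hole only ever receives a fragment of its own type, each generated program stays well-typed by construction in this toy setting; a real fuzzer would then pass each program to the compiler and record any crashes.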