Software bloat is code that is packaged in an application but is neither used nor necessary to run it. Bloat is a problem for software security, performance, and maintenance. In this paper, we introduce a novel technique to debloat Java bytecode through dynamic analysis, which we call trace-based debloat. We have developed JDBL, a tool that automates both the collection of accurate execution traces and the debloating process. Given a Java project and a workload, JDBL generates a debloated version of the project that is syntactically correct and preserves the original behavior, modulo the workload. We evaluate JDBL by debloating 395 open-source Java libraries, for a total of 10M+ lines of code. Our results indicate that JDBL succeeds in debloating 62.2% of the classes and 20.5% of the dependencies in the studied libraries. We also present the first experiment that assesses the quality of debloated libraries with respect to 1,066 clients of these libraries. We show that 957/1,001 (95.6%) of the clients successfully compile, and 229/283 (80.9%) of the clients can successfully run their test suite, after the drastic code removal in their libraries.
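The core idea of trace-based debloat can be illustrated with a minimal sketch. This is not JDBL's implementation: it only assumes that a workload run yields a set of class names observed during execution, and it treats every packaged class absent from that set as bloat. The names `debloat`, `DebloatResult`, and the example class names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DebloatResult:
    kept: frozenset      # packaged classes observed in the execution trace
    removed: frozenset   # packaged classes never observed (candidate bloat)

    @property
    def removed_ratio(self) -> float:
        """Fraction of packaged classes that were removed."""
        total = len(self.kept) + len(self.removed)
        return len(self.removed) / total if total else 0.0


def debloat(packaged_classes, traced_classes) -> DebloatResult:
    """Split packaged classes into kept/removed based on a trace.

    Assumption: `traced_classes` is whatever the workload exercised.
    Entries that are not part of the package (for example JDK classes)
    are ignored, since they cannot be removed from the project.
    """
    packaged = frozenset(packaged_classes)
    kept = packaged & frozenset(traced_classes)
    return DebloatResult(kept=kept, removed=packaged - kept)


if __name__ == "__main__":
    # Hypothetical project contents and trace, for illustration only.
    packaged = {"app.Main", "app.Parser", "app.LegacyExporter", "lib.Json"}
    traced = {"app.Main", "app.Parser", "lib.Json", "java.lang.String"}
    result = debloat(packaged, traced)
    print(sorted(result.removed))
    print(f"{result.removed_ratio:.1%}")
```

The sketch makes the "modulo the workload" caveat concrete: a class that the workload never exercises is removed even if some other input would need it, so the debloated artifact is only as complete as the trace it was derived from.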