Modern software development reuses code by importing libraries as dependencies. A software project includes 36 dependencies on average, 80% of which are transitive, that is, dependencies of dependencies. Recent research indicates that only 24.9% of these dependencies are required at runtime, and even within those, many program constructs remain unused, adding unnecessary code to the project. This has led to the development of debloating tools, which remove unnecessary dependencies and program constructs. Such tools must balance precision (eliminating unused constructs) against soundness (preserving all required constructs). To systematically evaluate this trade-off, we developed Deblometer, a micro-benchmark of 59 test cases designed to assess how well debloating tools support various Java language features. Each test case includes a manually curated ground truth that specifies the necessary and bloated classes, methods, and fields, enabling precise measurement of soundness and precision. Using Deblometer, we evaluated three popular Java debloating tools: Deptrim, JShrink, and ProGuard. Our evaluation reveals that all three tools remove required program constructs, which changes program semantics or causes execution crashes. In particular, dynamic class loading introduces unsoundness in all evaluated tools. Our comparison shows that Deptrim retains more bloated constructs, while ProGuard removes more required constructs. JShrink's soundness is significantly affected by its limited support for annotations, which leads to corrupted debloated artifacts. These soundness issues highlight the need to improve debloating tools so that debloated software remains stable and reliable.
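To make the role of the ground truth concrete, the following is a minimal sketch of how soundness and precision could be scored for a single test case. The exact metric definitions are not given in this abstract, so the formulas below are assumptions: soundness is taken as the fraction of required constructs that survive debloating, and precision as the fraction of bloated constructs that are removed. The class name `DebloatScore` and the construct identifiers are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

public class DebloatScore {

    // Assumed definition: share of required constructs still present after debloating.
    // A value below 1.0 means the tool removed something the program needs.
    public static double soundness(Set<String> required, Set<String> retained) {
        if (required.isEmpty()) {
            return 1.0;
        }
        Set<String> kept = new HashSet<>(required);
        kept.retainAll(retained);
        return (double) kept.size() / required.size();
    }

    // Assumed definition: share of bloated constructs that the tool removed.
    // A value below 1.0 means some bloat was left in the artifact.
    public static double precision(Set<String> bloated, Set<String> retained) {
        if (bloated.isEmpty()) {
            return 1.0;
        }
        Set<String> removed = new HashSet<>(bloated);
        removed.removeAll(retained);
        return (double) removed.size() / bloated.size();
    }

    public static void main(String[] args) {
        // Hypothetical ground truth for one test case (classes and methods).
        Set<String> required = Set.of("app.Main", "lib.Parser", "lib.Parser.parse()");
        Set<String> bloated = Set.of("lib.Unused", "lib.Parser.debug()");

        // Hypothetical constructs found in the debloated artifact.
        Set<String> retained = Set.of("app.Main", "lib.Parser", "lib.Unused");

        System.out.println("soundness = " + soundness(required, retained));
        System.out.println("precision = " + precision(bloated, retained));
    }
}
```

In this made-up example the tool drops one of three required constructs and keeps one of two bloated ones, so it is neither fully sound nor fully precise. Scoring classes, methods, and fields separately would follow the same pattern with one set per construct kind.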
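Dynamic class loading is hard for debloaters because the class to load is named by a string that may only be known at runtime. The sketch below illustrates the general pattern; it is not taken from the Deblometer test cases, and the names `DynamicLoadingDemo` and `Plugin` are hypothetical.

```java
public class DynamicLoadingDemo {

    // Never referenced by name in code, only through a runtime string.
    public static class Plugin {
        @Override
        public String toString() {
            return "plugin loaded";
        }
    }

    // Builds the binary class name at runtime and instantiates it reflectively.
    // Checked reflection errors are wrapped so callers see a single failure type.
    public static Object load(String simpleName) {
        String binaryName = DynamicLoadingDemo.class.getName() + "$" + simpleName;
        try {
            return Class.forName(binaryName).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not load " + binaryName, e);
        }
    }

    public static void main(String[] args) {
        // The class name could equally come from a config file or user input.
        String name = args.length > 0 ? args[0] : "Plugin";
        System.out.println(load(name));
    }
}
```

An analysis that only follows direct references may find no use of `Plugin` and classify it as bloat. If the class is then removed, `Class.forName` throws `ClassNotFoundException` at runtime, which is the kind of execution crash the evaluation describes.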