Software bloat is code that is packaged in an application but is not actually needed to run it. Software bloat is a problem for security, performance, and maintenance. In this paper, we introduce a novel debloating technique, which we call coverage-based debloating. We implement the technique for a single language: Java bytecode. We leverage a combination of state-of-the-art Java bytecode coverage tools to precisely capture which parts of a project and its dependencies are used when it runs a specific workload. Then, we automatically remove the parts that are not covered to generate a debloated version of the project. We successfully debloat 211 library versions from a dataset of 94 unique open-source Java libraries. The debloated versions are syntactically correct and preserve their original behavior with respect to the workload. Our results indicate that 68.3% of the libraries' bytecode and 20.3% of their total dependencies can be removed through coverage-based debloating. For the first time in the literature on software debloating, we assess the utility of debloated libraries with respect to the client applications that reuse them. We select 988 client projects that either reference the debloated library directly in their source code or whose test suite covers at least one class of the libraries that we debloat. Our results show that 81.5% of the clients that have at least one test using the library compile successfully and pass their test suite when the original library is replaced by its debloated version.
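The core idea, keeping only what a workload covers and removing the rest, can be illustrated with a minimal sketch. It is not the tool described here. It assumes that coverage has already been collected as a set of class names and that removal happens at class granularity only. The class `CoverageDebloatSketch`, its methods, and the sample class names are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only: class-level, in-memory coverage-based filtering. */
final class CoverageDebloatSketch {

    /** Keeps only the classes whose names appear in the covered set (assumed input). */
    static Map<String, byte[]> debloat(Map<String, byte[]> classes, Set<String> covered) {
        Map<String, byte[]> kept = new LinkedHashMap<>();
        for (Map.Entry<String, byte[]> entry : classes.entrySet()) {
            if (covered.contains(entry.getKey())) {
                kept.put(entry.getKey(), entry.getValue());
            }
        }
        return kept;
    }

    /** Sums the bytecode size, in bytes, of all classes in the map. */
    static long size(Map<String, byte[]> classes) {
        long total = 0;
        for (byte[] bytecode : classes.values()) {
            total += bytecode.length;
        }
        return total;
    }

    /** Returns the fraction of bytes removed by debloating, between 0 and 1. */
    static double removedFraction(Map<String, byte[]> original, Map<String, byte[]> debloated) {
        long before = size(original);
        return before == 0 ? 0.0 : 1.0 - (double) size(debloated) / before;
    }

    public static void main(String[] args) {
        // Hypothetical library: class names mapped to placeholder bytecode arrays.
        Map<String, byte[]> classes = new LinkedHashMap<>();
        classes.put("lib/Parser", new byte[600]);
        classes.put("lib/Printer", new byte[300]);
        classes.put("lib/LegacyCodec", new byte[100]);

        // Hypothetical coverage result for one workload.
        Set<String> covered = Set.of("lib/Parser", "lib/LegacyCodec");

        Map<String, byte[]> kept = debloat(classes, covered);
        System.out.println("kept: " + kept.keySet());
        System.out.println("removed fraction: " + removedFraction(classes, kept));
    }
}
```

The result is only as trustworthy as the workload: a class that the workload never exercises is removed even if some client needs it, which is why evaluating debloated libraries against client test suites matters.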