Software development typically relies on multiple Third-Party Libraries (TPLs). However, irrelevant libraries present in a software application's distributable often lead to excessive consumption of resources such as CPU cycles, memory, and mobile-device battery. Therefore, identifying and removing the unused TPLs present in an application is desirable. We present a rapid, storage-efficient, obfuscation-resilient method to detect irrelevant TPLs in Java and Python applications. The novel aspects of our approach are: i) computing a vector representation of a .class file using a model that we call Lib2Vec, which is trained using the Paragraph Vector Algorithm; ii) converting a .class file to a normalized form via semantics-preserving transformations before using it to train the Lib2Vec models; and iii) an eXtra Library Detector (XtraLibD), developed and tested with 27 different language-specific Lib2Vec models. These models were trained using different parameters and >30,000 .class and >478,000 .py files taken from >100 different Java libraries and 43,711 Python packages available at MavenCentral.com and Pypi.com, respectively. XtraLibD achieves an accuracy of 99.48% with an F1 score of 0.968 and outperforms the existing tools, viz., LibScout, LiteRadar, and LibD, with accuracy improvements of 74.5%, 30.33%, and 14.1%, respectively. Compared with LibD, XtraLibD achieves a response time improvement of 61.37% and a storage reduction of 87.93% (99.85% over JIngredient). Our program artifacts are available at https://www.doi.org/10.5281/zenodo.5179747.
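To make the detection idea concrete, the following minimal sketch illustrates how a vector computed for a file from an application could be matched against stored vectors of known library files. It is not the XtraLibD implementation: the use of cosine similarity, the threshold of 0.9, the three-dimensional vectors, and the names `cosine_similarity`, `best_match`, and `REFERENCE` are all assumptions made for illustration. In practice the vectors would come from a trained Lib2Vec model rather than being written by hand.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_u * norm_v)

def best_match(query, reference, threshold=0.9):
    """Return the name of the most similar reference vector,
    or None if no similarity reaches the (hypothetical) threshold."""
    best_name, best_score = None, threshold
    for name, vec in reference.items():
        score = cosine_similarity(query, vec)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name

# Hypothetical reference vectors; real ones would be produced by a
# trained model, and these names and values are invented.
REFERENCE = {
    "lib_a/Parser": [0.9, 0.1, 0.0],
    "lib_b/Logger": [0.0, 0.2, 0.95],
}

if __name__ == "__main__":
    print(best_match([0.88, 0.12, 0.01], REFERENCE))
    print(best_match([0.5, 0.5, 0.5], REFERENCE))
```

Storing one fixed-length vector per library file, rather than the file itself, is what would make such a lookup compact; whether XtraLibD uses this exact similarity measure is not stated in the abstract.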