As a rich source of data, Call Graphs are used in various applications, including security vulnerability detection. Although multiple studies show that Call Graphs can drastically improve the accuracy of analysis, existing ecosystem-scale tools such as Dependabot do not use them and work at the package level instead. Using Call Graphs in ecosystem-wide use cases is impractical because Call Graph generators do not scale. Call Graph generation is usually treated as a "full program analysis", which results in large Call Graphs and expensive computation. This pragmatic approach does not work at ecosystem scale, because the number of ways in which a particular artifact can be combined with others into a full program explodes. It is therefore necessary to make the analysis incremental. Existing studies cover different types of incremental program analysis, but none of them focuses on Call Graph generation for an entire ecosystem. In this paper, we propose an incremental implementation of the Class Hierarchy Analysis (CHA) algorithm that generates Call Graphs on demand by stitching together partial Call Graphs that were previously extracted for individual libraries. Our preliminary evaluation results show that the proposed approach scales well and outperforms OPAL, the most scalable existing framework.
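To make the stitching idea concrete, the following is a minimal sketch and not the implementation proposed in the paper. It assumes a hypothetical `PartialCallGraph` record that stores, per package, the declared class hierarchy, the methods each class declares, and unresolved call sites. The function `stitch` merges these cached records and resolves every call site with a simplified CHA rule: a call on a receiver type may reach the method implementation visible from any of its subtypes. Interfaces, overloading, and versioning are deliberately ignored, and the hierarchy is assumed to be acyclic.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class PartialCallGraph:
    """Per-package data, extracted once and cached (hypothetical format)."""
    parents: Dict[str, Optional[str]]        # class -> superclass (None for roots)
    methods: Dict[str, Set[str]]             # class -> names of methods it declares
    call_sites: List[Tuple[str, str, str]]   # (caller, receiver type, method name)


def stitch(partials: List[PartialCallGraph]) -> Set[Tuple[str, str]]:
    """Merge cached partial graphs and resolve call sites with simplified CHA."""
    parents: Dict[str, Optional[str]] = {}
    methods: Dict[str, Set[str]] = {}
    call_sites: List[Tuple[str, str, str]] = []
    for p in partials:
        parents.update(p.parents)
        for cls, declared in p.methods.items():
            methods.setdefault(cls, set()).update(declared)
        call_sites.extend(p.call_sites)

    # Invert the superclass relation so subtypes can be enumerated.
    children: Dict[str, List[str]] = {}
    for cls, parent in parents.items():
        if parent is not None:
            children.setdefault(parent, []).append(cls)

    def subtypes(cls: str) -> Set[str]:
        """Return cls and all of its transitive subclasses."""
        stack, seen = [cls], set()
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(children.get(current, []))
        return seen

    def lookup(cls: Optional[str], name: str) -> Optional[str]:
        """Walk up the hierarchy to the nearest class declaring the method."""
        while cls is not None:
            if name in methods.get(cls, set()):
                return f"{cls}.{name}"
            cls = parents.get(cls)
        return None

    edges: Set[Tuple[str, str]] = set()
    for caller, receiver, name in call_sites:
        for sub in subtypes(receiver):
            target = lookup(sub, name)
            if target is not None:
                edges.add((caller, target))
    return edges


# Example: a library and an application that extends one of its classes.
lib = PartialCallGraph(
    parents={"lib.Shape": None, "lib.Circle": "lib.Shape"},
    methods={"lib.Shape": {"area"}, "lib.Circle": {"area"}},
    call_sites=[],
)
app = PartialCallGraph(
    parents={"app.Square": "lib.Shape", "app.Main": None},
    methods={"app.Square": {"area"}, "app.Main": {"run"}},
    call_sites=[("app.Main.run", "lib.Shape", "area")],
)
edges = stitch([lib, app])
```

In this sketch the per-package records never change once extracted, so only the merge and resolution steps run for each requested combination of packages. Here the single call in `app.Main.run` resolves to the `area` implementations in `lib.Shape`, `lib.Circle`, and `app.Square`.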