Most static program analyses depend on Call Graphs (CGs), including reachability of security vulnerabilities. Static CGs ensure soundness through over-approximation, which results in inflated sizes and imprecision. Recent research has employed machine learning (ML) models to prune false edges and enhance CG precision. However, these models require real-world programs with high test coverage to generalize effectively and the inference is expensive. In this paper, we present OriginPruner, a novel call graph pruning technique that leverages the method origin, which is where a method signature is first introduced within a class hierarchy. By incorporating insights from a localness analysis that investigated the scope of method interactions into our approach, OriginPruner confidently identifies and prunes edges related to these origin methods. Our key findings reveal that (1) dominant origin methods, such as Iterator.next, significantly impact CG sizes; (2) derivatives of these origin methods are primarily local, enabling safe pruning without affecting downstream inter-procedural analyses; (3) OriginPruner achieves a significant reduction in CG size while maintaining the soundness of CGs for security applications like vulnerability propagation analysis; and (4) OriginPruner introduces minimal computational overhead. These findings underscore the potential of leveraging domain knowledge about the type system for more effective CG pruning, offering a promising direction for future work in static program analysis.
翻译:暂无翻译