Lack of experience, inadequate documentation, and sub-optimal API design frequently cause developers to make mistakes when reusing third-party implementations. Such API misuses can result in unintended behavior, performance losses, or software crashes. Current research therefore aims to detect such misuses automatically by comparing the way a developer used an API with previously inferred patterns of correct API usage. While research has made significant progress, these techniques have not yet been adopted in practice. In part, this is due to the lack of a process that integrates seamlessly with software development workflows. In particular, existing approaches do not consider how to collect relevant source code samples from which to infer patterns. An inadequate collection can cause API usage pattern miners to infer irrelevant patterns, which leads to false alarms instead of true API misuses being found. In this paper, we target this problem (a) by providing a method that increases the likelihood of finding relevant and true-positive patterns for a given set of code changes while remaining agnostic to the concrete static, intra-procedural mining technique, and (b) by introducing a concept for just-in-time API misuse detection, which analyzes changes at the time of commit. Specifically, we introduce different lightweight code search and filtering strategies and evaluate them on two real-world API misuse datasets to determine their usefulness in finding relevant intra-procedural API usage patterns. Our main results are that (1) commit-based search with subsequent filtering effectively decreases the amount of code to be analyzed, (2) method-level filtering in particular is superior to file-level filtering, (3) project-internal and project-external code search find solutions for different types of misuses and are thus complementary, (4) incorporating prior knowledge of the misused [...]