The immense amounts of source code provide ample challenges and opportunities during software development. To handle the size of code bases, developers commonly search for code, e.g., when trying to find where a particular feature is implemented or when looking for code examples to reuse. To support developers in finding relevant code, various code search engines have been proposed. This article surveys 30 years of research on code search, giving a comprehensive overview of challenges and techniques that address them. We discuss the kinds of queries that code search engines support, how to preprocess and expand queries, different techniques for indexing and retrieving code, and ways to rank and prune search results. Moreover, we describe empirical studies of code search in practice. Based on the discussion of prior work, we conclude the article with an outline of challenges and opportunities to be addressed in the future.