A Systematic Literature Review of Automated Query Reformulations in Source Code Search

Software developers often fix critical bugs to ensure the reliability of their software. They might also need to add new features at regular intervals to stay competitive in the market. These bugs and features are reported as change requests (i.e., technical documents written by software users). Developers consult these documents to implement the required changes in the software code. As a part of change implementation, they often choose a few important keywords from a change request as an ad hoc query. They then execute the query with a code search engine (e.g., Lucene) and attempt to find the exact locations within the software code that need to be changed. Unfortunately, even experienced developers often fail to choose the right queries. As a consequence, developers often have difficulty detecting the appropriate locations within the code and spend the majority of their time on trial and error. Many studies have attempted to support developers in constructing queries by automatically reformulating their ad hoc queries. In this systematic literature review, we carefully select 70 primary studies on query reformulations from 2,970 candidate studies, perform an in-depth qualitative analysis using the Grounded Theory approach, and then answer six important research questions. Our investigation reports several major findings. First, to date, eight major methodologies (e.g., term weighting, query-term co-occurrence analysis, thesaurus lookup) have been adopted in query reformulation. Second, the existing studies suffer from several major limitations (e.g., lack of generalizability, vocabulary mismatch problem, weak evaluation, the extra burden on the developers) that might prevent their wide adoption. Finally, we discuss several open issues in search query reformulations and suggest multiple future research opportunities.
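
To make the term-weighting idea mentioned above concrete, the following is a minimal sketch of one common weighting scheme (TF-IDF) used to pick query keywords from a change request. It is an illustration only and not a method taken from any of the reviewed studies; the function names `tokenize` and `tfidf_query`, the smoothed IDF formula, and the toy corpus are all assumptions introduced here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text, split on non-letters, and drop tokens shorter than 3 chars."""
    return [t for t in re.split(r"[^a-z]+", text.lower()) if len(t) > 2]


def tfidf_query(change_request, corpus, k=3):
    """Return the k terms of the change request with the highest TF-IDF weight.

    `corpus` is a hypothetical list of document texts (e.g., source files
    flattened to plain text) used only to estimate how common each term is.
    """
    docs = [set(tokenize(d)) for d in corpus]
    n = len(docs)
    tf = Counter(tokenize(change_request))

    def weight(term):
        # Document frequency: number of corpus documents containing the term.
        df = sum(1 for d in docs if term in d)
        # Smoothed IDF (an assumption of this sketch) so unseen terms stay finite.
        idf = math.log((n + 1) / (df + 1)) + 1
        return tf[term] * idf

    # Highest weight first; ties are broken alphabetically for determinism.
    return sorted(tf, key=lambda t: (-weight(t), t))[:k]


if __name__ == "__main__":
    corpus = [
        "the parser reads the config file",
        "the editor shows the file tree",
        "the network client retries the request",
    ]
    request = ("Editor crashes when the parser reads an empty config file; "
               "the parser throws")
    print(tfidf_query(request, corpus))
```

In this sketch, terms that occur often in the change request but rarely across the corpus rise to the top, while ubiquitous words such as "the" receive low weight. A real reformulation technique would typically add stop-word removal, identifier splitting, and stemming before weighting.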

