SWIM: Synthesizing What I Mean

Modern programming frameworks come with large libraries covering tasks as diverse as matching regular expressions, parsing XML files, and sending email. Programmers often use search engines such as Google and Bing to learn about existing APIs. In this paper, we describe SWIM, a tool that suggests code snippets in response to API-related natural language queries such as "generate md5 hash code". We translate user queries into the APIs of interest using clickthrough data from the Bing search engine. Then, based on patterns learned from open-source code repositories, we synthesize idiomatic code that demonstrates the use of these APIs. We introduce \emph{structured call sequences} to capture API-usage patterns. Structured call sequences are a generalized form of method call sequences, with if-branches and while-loops to represent conditional and repeated API usage patterns; they are simple to extract and amenable to synthesis. We evaluated SWIM on 30 common C# API-related queries received by Bing. For 70% of the queries, the first suggested snippet was a relevant solution, and a relevant solution was present in the top 10 results for all benchmarked queries. The online portion of the workflow is also responsive, taking an average of 1.5 seconds per snippet.
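To make the notion of a structured call sequence concrete, here is a minimal Python sketch of one possible representation: a list of method calls in which if-branches and while-loops can be nested. The class names (`Call`, `If`, `While`), the `render` helper, and the file-reading example are illustrative assumptions, not SWIM's actual data model or synthesis procedure. The C# expressions appear only as strings.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Call:
    """A single API call or expression, stored as text (hypothetical model)."""
    expr: str


@dataclass
class If:
    """Conditional usage: the calls in `body` happen only when `cond` holds."""
    cond: Call
    body: List["Node"] = field(default_factory=list)


@dataclass
class While:
    """Repeated usage: the calls in `body` repeat while `cond` holds."""
    cond: Call
    body: List["Node"] = field(default_factory=list)


Node = Union[Call, If, While]


def render(seq, indent=0):
    """Turn a structured call sequence into lines of C#-like code."""
    pad = "    " * indent
    lines = []
    for node in seq:
        if isinstance(node, Call):
            lines.append(f"{pad}{node.expr};")
        else:
            keyword = "if" if isinstance(node, If) else "while"
            lines.append(f"{pad}{keyword} ({node.cond.expr}) {{")
            lines.extend(render(node.body, indent + 1))
            lines.append(f"{pad}}}")
    return lines


# Illustrative pattern: read a file line by line, then close the reader.
example = [
    Call("var reader = new StreamReader(path)"),
    While(Call("!reader.EndOfStream"), [Call("reader.ReadLine()")]),
    Call("reader.Close()"),
]

print("\n".join(render(example)))
```

A plain method call sequence is the special case in which the list contains only `Call` nodes. The nested `While` node is what lets the pattern record that `ReadLine` is called repeatedly rather than once.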
