Semantic code search is the task of retrieving a code snippet given a textual description of its functionality. Recent work has focused on similarity metrics between neural embeddings of text and code. However, current language models are known to struggle with longer, compositional text and with multi-step reasoning. To overcome this limitation, we propose supplementing the query sentence with a layout of its semantic structure. The semantic layout is used to break down the final reasoning decision into a series of lower-level decisions. We implement this idea with a Neural Module Network architecture. We compare our model, NS3 (Neuro-Symbolic Semantic Search), to a number of baselines, including state-of-the-art semantic code retrieval methods, and evaluate on two datasets: CodeSearchNet and Code Search and Question Answering. We demonstrate that our approach results in more precise code retrieval, and we study the effectiveness of our modular design when handling compositional queries.
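To make the contrast concrete, the following toy sketch compares a single holistic similarity score with a score assembled from several lower-level decisions, one per part of the query. It is an illustration under strong assumptions, not the NS3 model: the bag-of-words `embed` function stands in for a neural encoder, the query layout is written by hand rather than parsed, the product rule for combining sub-scores is one arbitrary choice, and all function names (`embed`, `cosine`, `holistic_score`, `layout_score`, `rank`) are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': lowercase word counts (a stand-in for a neural encoder)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def holistic_score(query, code):
    """Baseline: one similarity decision over the whole query."""
    return cosine(embed(query), embed(code))


def layout_score(sub_queries, code):
    """Compositional variant: one decision per query part, combined by product,
    so a snippet must match every part to score above zero (assumed rule)."""
    score = 1.0
    for part in sub_queries:
        score *= holistic_score(part, code)
    return score


def rank(snippets, scorer):
    """Return snippets sorted from best to worst under the given scorer."""
    return sorted(snippets, key=scorer, reverse=True)


# Hypothetical corpus, query, and hand-written layout of the query.
snippets = [
    "def load_json(path): return json.load(open(path))",
    "def sort_list(items): return sorted(items)",
]
query = "load json file from path"
layout = ["load json", "from path"]

best_holistic = rank(snippets, lambda c: holistic_score(query, c))[0]
best_layout = rank(snippets, lambda c: layout_score(layout, c))[0]
```

In this sketch both scorers pick the JSON-loading snippet. The product in `layout_score` only illustrates how a final decision can be factored into smaller ones; a real system would learn both the sub-scorers and how to combine them.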