The goal of natural language semantic code search is to retrieve a semantically relevant code snippet from a fixed set of candidates using a natural language query. Existing approaches are neither effective nor efficient enough for a practical semantic code search system. In this paper, we propose an efficient and accurate semantic code search framework with cascaded fast and slow models: a fast transformer encoder model is learned to optimize a scalable index for fast retrieval, followed by a slow classification-based re-ranking model that improves the quality of the top-K results from the fast retrieval. To further reduce the high memory cost of deploying two separate models in practice, we propose to jointly train the fast and slow models on a single transformer encoder with shared parameters. The proposed cascaded approach is not only efficient and scalable, but also achieves state-of-the-art results, with an average mean reciprocal rank (MRR) of 0.7795 across six programming languages on the CodeSearchNet benchmark, compared to the previous state-of-the-art result of 0.713 MRR.
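The cascaded fast-then-slow pipeline described above can be illustrated with a minimal toy sketch. Here `fast_encode` and `slow_score` are hypothetical stand-ins for the paper's fast transformer encoder and slow classification-based re-ranker, not the actual models; only the two-stage retrieve-then-rerank control flow reflects the abstract.

```python
import numpy as np

def fast_encode(texts):
    # Hypothetical stand-in for the fast transformer encoder:
    # maps each text to a fixed-size unit embedding (toy hash-seeded vectors).
    vecs = []
    for t in texts:
        rng = np.random.default_rng(abs(hash(t)) % (2**32))
        v = rng.standard_normal(8)
        vecs.append(v / np.linalg.norm(v))
    return np.stack(vecs)

def slow_score(query, snippet):
    # Hypothetical stand-in for the slow classification-based re-ranker:
    # jointly scores a (query, snippet) pair; here, simple token overlap.
    q, s = set(query.split()), set(snippet.split())
    return len(q & s) / max(len(q | s), 1)

def cascaded_search(query, corpus, k=3, top_n=1):
    # Stage 1 (fast): dense retrieval of top-k candidates by dot product.
    # In practice corpus embeddings are precomputed into a scalable index.
    corpus_emb = fast_encode(corpus)
    q_emb = fast_encode([query])[0]
    top_k = np.argsort(corpus_emb @ q_emb)[::-1][:k]
    # Stage 2 (slow): re-rank only those k candidates with the expensive model.
    reranked = sorted(top_k, key=lambda i: slow_score(query, corpus[i]),
                      reverse=True)
    return [corpus[i] for i in reranked[:top_n]]
```

The key efficiency property is that the expensive pairwise model runs only on the k candidates surviving the fast stage, so total cost stays near that of index lookup while re-ranking recovers most of the accuracy of a full pairwise scan.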