评估《代码克隆检测法》检测的很少的射击和矛盾学习方法 (Evaluating few shot and Contrastive learning Methods for Code Clone Detection) - 专知论文

会员服务 ·

0

代码 · MAML · MoDELS · contrastive · 学成 ·

2022 年 4 月 15 日

Evaluating few shot and Contrastive learning Methods for Code Clone Detection

翻译：评估《代码克隆检测法》检测的很少的射击和矛盾学习方法

Mohamad Khajezade,Fatemeh Hendijani Fard,Mohamed S. Shehata

Context: Code Clone Detection (CCD) is a software engineering task that is used for plagiarism detection, code search, and code comprehension. Recently, deep learning-based models have achieved an F1 score (a metric used to assess classifiers) of $\sim$95\% on the CodeXGLUE benchmark. These models require many training data, mainly fine-tuned on Java or C++ datasets. However, no previous study evaluates the generalizability of these models where a limited amount of annotated data is available. Objective: The main objective of this research is to assess the ability of the CCD models as well as few shot learning algorithms for unseen programming problems and new languages (i.e., the model is not trained on these problems/languages). Method: We assess the generalizability of the state of the art models for CCD in few shot settings (i.e., only a few samples are available for fine-tuning) by setting three scenarios: i) unseen problems, ii) unseen languages, iii) combination of new languages and new problems. We choose three datasets of BigCloneBench, POJ-104, and CodeNet and Java, C++, and Ruby languages. Then, we employ Model Agnostic Meta-learning (MAML), where the model learns a meta-learner capable of extracting transferable knowledge from the train set; so that the model can be fine-tuned using a few samples. Finally, we combine contrastive learning with MAML to further study whether it can improve the results of MAML.

翻译：背景:代码克隆检测(CCD)是一项软件工程任务,用于进行斑象检测、代码搜索和代码理解。最近,深层次的学习模型在代码XGLUE基准基准上取得了F1分(用于评估分类的尺度)$sim$95 ⁇ 。这些模型需要许多培训数据,主要是在爪哇或C+++数据集上进行微调。然而,以往没有一项研究评估这些模型在可获得附加说明的数据数量有限的情况下的通用性。目标:本研究的主要目标是评估CCD模型的能力,以及很少为隐蔽的编程问题和新语言(即,该模型没有就这些问题/语言进行培训)。方法:我们评估CCD在几处拍摄环境中的艺术模型的可概括性(即,只有少量的样本可以进行微调),方法是设定三种模式:(一) 看不见的问题,二) 看不见的语言,三) 新的语言和新的问题组合。我们选择了三种BICMET模型(即MMANet的精细数据集, MAMLA-MLMLA的学习结果,我们使用MLA-MLA的精选的模型学习。

0

相关内容

代码（Code）是专知网的一个重要知识资料文档板块，旨在整理收录论文源代码、复现代码，经典工程代码等，便于用户查阅下载使用。

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

开源书：PyTorch深度学习起步

开源书：PyTorch深度学习起步

专知会员服务

51+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

IEEE TII Call For Papers

IEEE TII Call For Papers

CCF多媒体专委会

3+阅读 · 2022年3月24日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

中国图象图形学学会CSIG

0+阅读 · 2021年11月3日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

有限域上指数和与量子码的研究

国家自然科学基金

0+阅读 · 2014年12月31日

Anderson型多酸的不对称修饰及可控组装研究

国家自然科学基金

1+阅读 · 2014年12月31日

目标跟踪中的时空上下文建模方法研究

国家自然科学基金

2+阅读 · 2013年12月31日

多控磁性核-壳微球Fe3O4@MOFs/GO的构筑及载药性能的研究

国家自然科学基金

0+阅读 · 2013年12月31日

机器翻译中大规模异类特征的迁移学习

国家自然科学基金

2+阅读 · 2013年12月31日

基于一致性数据融合的磨机负荷自适应软测量研究

国家自然科学基金

0+阅读 · 2013年12月31日

DEC1、DEC2对人乳腺癌细胞衰老的调控作用及其作用机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

利用小鼠模型研究lrrc10与desmin在心肌肥大发生中的协同调控机制

国家自然科学基金

0+阅读 · 2012年12月31日

MEMRI追踪骨髓源神经干细胞移植后与宿主脑组织功能重建实验研究

国家自然科学基金

0+阅读 · 2011年12月31日

多文种文档图像识别的多层次马尔可夫随机场模型研究

国家自然科学基金

1+阅读 · 2008年12月31日

CORE: Consistent Representation Learning for Face Forgery Detection

CORE: Consistent Representation Learning for Face Forgery Detection

Arxiv

0+阅读 · 2022年6月6日

Extreme Compression for Pre-trained Transformers Made Simple and Efficient

Arxiv

0+阅读 · 2022年6月4日

Prompt Distribution Learning

Arxiv

14+阅读 · 2022年5月6日

Few-shot Learning with Noisy Labels

Arxiv

13+阅读 · 2022年4月12日

ContrastMask: Contrastive Learning to Segment Every Thing

Arxiv

15+阅读 · 2022年3月18日

A Survey of Deep Learning for Low-Shot Object Detection

Arxiv

21+阅读 · 2021年12月6日

Few-shot Learning for Multi-label Intent Detection

Arxiv

21+阅读 · 2020年10月11日

Learning from Few Samples: A Survey

Learning from Few Samples: A Survey

Arxiv

77+阅读 · 2020年7月30日

A Simple Framework for Contrastive Learning of Visual Representations

Arxiv

21+阅读 · 2020年2月13日

Weakly Supervised One-Shot Detection with Attention Siamese Networks

Arxiv

14+阅读 · 2018年1月12日

VIP会员

文章信息

相关主题

相关VIP内容

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

开源书：PyTorch深度学习起步

开源书：PyTorch深度学习起步

专知会员服务

51+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【博士论文】扩展可扩展会话推荐的边界

别想太多：高效 R1 风格大型推理模型综述

【ACMMM2025】EvoVLMA: 进化式视觉-语言模型自适应

智能体网络：用AI智能体编织下一代网络

相关资讯

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

IEEE TII Call For Papers

IEEE TII Call For Papers

CCF多媒体专委会

3+阅读 · 2022年3月24日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

中国图象图形学学会CSIG

0+阅读 · 2021年11月3日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

相关论文

CORE: Consistent Representation Learning for Face Forgery Detection

CORE: Consistent Representation Learning for Face Forgery Detection

Arxiv

0+阅读 · 2022年6月6日

Extreme Compression for Pre-trained Transformers Made Simple and Efficient

Arxiv

0+阅读 · 2022年6月4日

Prompt Distribution Learning

Arxiv

14+阅读 · 2022年5月6日

Few-shot Learning with Noisy Labels

Arxiv

13+阅读 · 2022年4月12日

ContrastMask: Contrastive Learning to Segment Every Thing

Arxiv

15+阅读 · 2022年3月18日

A Survey of Deep Learning for Low-Shot Object Detection

Arxiv

21+阅读 · 2021年12月6日

Few-shot Learning for Multi-label Intent Detection

Arxiv

21+阅读 · 2020年10月11日

Learning from Few Samples: A Survey

Learning from Few Samples: A Survey

Arxiv

77+阅读 · 2020年7月30日

A Simple Framework for Contrastive Learning of Visual Representations

Arxiv

21+阅读 · 2020年2月13日

Weakly Supervised One-Shot Detection with Attention Siamese Networks

Arxiv

14+阅读 · 2018年1月12日

相关基金

有限域上指数和与量子码的研究

国家自然科学基金

0+阅读 · 2014年12月31日

Anderson型多酸的不对称修饰及可控组装研究

国家自然科学基金

1+阅读 · 2014年12月31日

目标跟踪中的时空上下文建模方法研究

国家自然科学基金

2+阅读 · 2013年12月31日

多控磁性核-壳微球Fe3O4@MOFs/GO的构筑及载药性能的研究

国家自然科学基金

0+阅读 · 2013年12月31日

机器翻译中大规模异类特征的迁移学习

国家自然科学基金

2+阅读 · 2013年12月31日

基于一致性数据融合的磨机负荷自适应软测量研究

国家自然科学基金

0+阅读 · 2013年12月31日

DEC1、DEC2对人乳腺癌细胞衰老的调控作用及其作用机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

利用小鼠模型研究lrrc10与desmin在心肌肥大发生中的协同调控机制

国家自然科学基金

0+阅读 · 2012年12月31日

MEMRI追踪骨髓源神经干细胞移植后与宿主脑组织功能重建实验研究

国家自然科学基金

0+阅读 · 2011年12月31日

多文种文档图像识别的多层次马尔可夫随机场模型研究

国家自然科学基金

1+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员