Prompt-tuned Code Language Model as a Neural Knowledge Base for Type Inference in Statically-Typed Partial Code


August 26, 2022


Qing Huang, Zhiqiang Yuan, Zhenchang Xing, Xiwei Xu, Liming Zhu, Qinghua Lu

From arXiv. The paper has been accepted by ASE 2022.

Partial code usually involves non-fully-qualified type names (non-FQNs) and undeclared receiving objects. Resolving the FQNs of these non-FQN types and undeclared receiving objects (referred to as type inference) is the prerequisite to effective search and reuse of partial code. Existing dictionary-lookup based methods build a symbolic knowledge base of API names and code contexts, which involve significant compilation overhead and are sensitive to unseen API names and code context variations. In this paper, we formulate type inference as a cloze-style fill-in-blank language task. Built on source code naturalness, our approach fine-tunes a code masked language model (MLM) as a neural knowledge base of code elements with a novel "pre-train, prompt and predict" paradigm from raw source code. Our approach is lightweight and has minimum requirements on code compilation. Unlike existing symbolic name and context matching for type inference, our prompt-tuned code MLM packs FQN syntax and usage in its parameters and supports fuzzy neural type inference. We systematically evaluate our approach on a large amount of source code from GitHub and Stack Overflow. Our results confirm the effectiveness of our approach design and the practicality for partial code type inference. As the first of its kind, our neural type inference method opens the door to many innovative ways of using partial code.
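To make the cloze-style formulation concrete, the following is a minimal sketch of how a non-FQN type in partial code could be turned into a fill-in-the-blank prompt. It is an illustration only, not the authors' implementation: the `<mask>` token, the fixed number of mask slots, and the helper names `build_cloze_prompt` and `fill_prompt` are all assumptions introduced here. In the approach described above, a prompt-tuned code MLM would predict the masked tokens; in this sketch the predictions are supplied by hand.

```python
import re
from typing import Sequence

# Assumed mask token (RoBERTa-style); the paper's actual prompt template may differ.
MASK = "<mask>"


def build_cloze_prompt(code: str, simple_name: str, n_masks: int = 2) -> str:
    """Turn the first whole-word use of a simple type name into a cloze blank.

    The blank has the shape <mask>.<mask>.SimpleName, so the masked slots
    stand for the package prefix of the fully qualified name (FQN).
    """
    blank = ".".join([MASK] * n_masks) + "." + simple_name
    pattern = r"\b" + re.escape(simple_name) + r"\b"
    # A callable replacement avoids any escape processing of the blank string.
    return re.sub(pattern, lambda _match: blank, code, count=1)


def fill_prompt(prompt: str, tokens: Sequence[str]) -> str:
    """Fill the mask slots left to right with predicted tokens."""
    for token in tokens:
        prompt = prompt.replace(MASK, token, 1)
    return prompt


# Hypothetical partial Java snippet with a non-FQN type name.
partial_code = "List<String> names = new ArrayList<>();"
prompt = build_cloze_prompt(partial_code, "List", n_masks=2)
print(prompt)

# Hand-supplied stand-in for what a masked language model would predict.
print(fill_prompt(prompt, ["java", "util"]))
```

Because the blank is filled by a language model rather than looked up in a dictionary of API names, the same prompt shape would still apply to type names or code contexts that a symbolic knowledge base has not seen, which is the fuzzy-matching property the abstract emphasizes.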

