Jigsaw: Large Language Models meet Program Synthesis

December 6, 2021

Naman Jain, Skanda Vaidyanath, Arun Iyer, Nagarajan Natarajan, Suresh Parthasarathy, Sriram Rajamani, Rahul Sharma

From arXiv; accepted to ICSE'22

Large pre-trained language models such as GPT-3, Codex, and Google's language model are now capable of generating code from natural language specifications of programmer intent. We view these developments with a mixture of optimism and caution. On the optimistic side, such large language models have the potential to improve productivity by providing an automated AI pair programmer for every programmer in the world. On the cautionary side, since these large language models do not understand program semantics, they offer no guarantees about the quality of the suggested code. In this paper, we present an approach to augment these large language models with post-processing steps based on program analysis and synthesis techniques that understand the syntax and semantics of programs. Further, we show that such techniques can make use of user feedback and improve with usage. We present our experiences from building and evaluating such a tool, Jigsaw, which targets synthesizing code that uses the Python Pandas API from multi-modal inputs. Our experience suggests that as these large language models evolve for synthesizing code from intent, Jigsaw has an important role to play in improving the accuracy of the systems.
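To make the idea of post-processing model output concrete, here is a minimal sketch of one possible check: discard candidates that fail to parse, then keep the first one that reproduces a user-supplied input/output example. This is not Jigsaw's implementation. The function names, the hard-coded candidate strings (standing in for language-model suggestions), and the use of plain Python rows instead of a Pandas DataFrame are all assumptions made for illustration.

```python
import ast
import copy


def is_valid_syntax(snippet: str) -> bool:
    """Syntactic check: does the candidate parse as Python?"""
    try:
        ast.parse(snippet)
        return True
    except SyntaxError:
        return False


def satisfies_example(snippet, inputs, expected, out_var="out"):
    """Semantic check: run the candidate on a copy of the example inputs
    and compare the variable `out_var` with the expected output.
    NOTE: exec() on untrusted code is unsafe; a real tool would sandbox it."""
    env = copy.deepcopy(inputs)
    try:
        exec(snippet, env)
    except Exception:
        return False
    return env.get(out_var) == expected


def select_candidate(candidates, inputs, expected):
    """Return the first candidate passing both checks, or None."""
    for snippet in candidates:
        if is_valid_syntax(snippet) and satisfies_example(snippet, inputs, expected):
            return snippet
    return None


# Hypothetical model suggestions for "keep the rows where age is above 30".
candidates = [
    "out = [r for r in rows if r['age'] > 30",   # does not parse
    "out = [r for r in rows if r['age'] < 30]",  # parses, wrong result
    "out = [r for r in rows if r['age'] > 30]",  # parses, matches the example
]
inputs = {"rows": [{"name": "a", "age": 25}, {"name": "b", "age": 41}]}
expected = [{"name": "b", "age": 41}]

chosen = select_candidate(candidates, inputs, expected)
print(chosen)
```

The input/output example plays the role of a second input modality alongside the natural language query: the text guides generation, while the example lets a program-level check reject suggestions that look plausible but compute the wrong thing.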
