The ability to solve problems is a hallmark of intelligence and has been an enduring goal in AI. AI systems that can create programs as solutions to problems, or assist developers in writing programs, can increase productivity and make programming more accessible. Recently, pre-trained large language models have shown impressive abilities in generating new code from natural language descriptions, repairing buggy code, translating code between languages, and retrieving relevant code segments. However, the evaluation of these models has often been performed in a scattered way: on only one or two specific tasks, in a few languages, at a partial granularity (e.g., function level), and in many cases without proper training data. Even more concerning, in most cases the generated code has been evaluated in terms of mere lexical overlap with a reference rather than actual execution, whereas the semantic similarity (or equivalence) of two code segments depends only on their ``execution similarity'', i.e., whether they produce the same output for a given input.
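As a minimal sketch of this distinction (the two snippets below are illustrative examples, not drawn from any particular benchmark), consider two implementations that share almost no tokens yet are semantically equivalent: an execution-based check accepts them, while a lexical-overlap metric would score them as dissimilar.
\begin{verbatim}
def sum_to_n_loop(n):
    # Iterative implementation.
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

def sum_to_n_formula(n):
    # Closed-form implementation with no lexical overlap
    # with the loop version beyond the function signature.
    return n * (n + 1) // 2

# Execution-based evaluation: identical outputs on the same
# inputs indicate semantic equivalence despite near-zero
# token-level overlap between the two programs.
assert all(sum_to_n_loop(n) == sum_to_n_formula(n)
           for n in range(100))
\end{verbatim}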