Bidirectional Encoder Representations from Transformers (BERT) was proposed for natural language processing (NLP) and has shown promising results. Recently, researchers have applied BERT to source-code representation learning and reported encouraging results on several downstream tasks. However, in this paper we show that current methods cannot effectively understand the logic of source code: the learned representations rely heavily on programmer-defined variable and function names. We design and implement a set of experiments to support this conjecture and provide insights for future work.
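To make the conjecture concrete, consider the pair of functions below (a hypothetical illustration, not taken from the experiments in this paper): they implement exactly the same logic, so a model that truly understands program logic should assign them near-identical representations, while a model that leans on identifier names may represent them very differently.

```python
def binary_search(sorted_items, target):
    """Descriptive identifiers hint at the algorithm."""
    lo, hi = 0, len(sorted_items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return mid
        if sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

def f(a, b):
    """The same logic with uninformative names."""
    v1, v2 = 0, len(a) - 1
    while v1 <= v2:
        v3 = (v1 + v2) // 2
        if a[v3] == b:
            return v3
        if a[v3] < b:
            v1 = v3 + 1
        else:
            v2 = v3 - 1
    return -1
```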