GitHub Copilot, an extension for the Visual Studio Code development environment powered by the large-scale language model Codex, makes automatic program synthesis available to software developers. This model has been studied extensively in the field of deep learning; however, a comparison to genetic programming, which is also known for its performance in automatic program synthesis, has not yet been carried out. In this paper, we evaluate GitHub Copilot on standard program synthesis benchmark problems and compare the achieved results with those from the genetic programming literature. In addition, we discuss the performance of both approaches. We find that the performance of the two approaches on the benchmark problems is quite similar; however, in comparison to GitHub Copilot, the program synthesis approaches based on genetic programming are not yet mature enough to support programmers in practical software development. Genetic programming usually needs a large number of expensive hand-labeled training cases and takes too much time to generate solutions. Furthermore, source code generated by genetic programming approaches is often bloated and difficult to understand. For future work on program synthesis with genetic programming, we suggest that researchers focus on improving execution time, readability, and usability.