Code style reflects choices about the textual representation of source code. This study explores, for the first time, whether code style can be used to identify good programmers, with the aim of improving the recruitment process in the software industry. Solutions from Google Code Jam were selected for the analysis. Cluster analysis was used to look for associations between good programmers and style clusters. In addition, supervised machine learning models were trained on stylistic features to predict good programmers. The results show that, although no association between good programmers and particular style clusters could be established, supervised learning models can predict good programmers.
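As an illustration of what "stylistic features" might look like in practice, the following is a minimal sketch that turns one source file into a small numeric feature record. The abstract does not list the features the study actually used, so the four features here, the `StyleFeatures` class, and the `extract_style_features` function are all hypothetical choices made for this example. It also assumes C-like source in which line comments start with `//`.

```python
from dataclasses import dataclass


@dataclass
class StyleFeatures:
    """Hypothetical stylistic features; not the feature set used in the study."""
    mean_line_length: float
    blank_line_ratio: float
    tab_indent_ratio: float
    comment_line_ratio: float


def extract_style_features(source: str) -> StyleFeatures:
    """Compute a few layout-level style features from raw source text."""
    lines = source.splitlines()
    if not lines:
        return StyleFeatures(0.0, 0.0, 0.0, 0.0)

    non_blank = [ln for ln in lines if ln.strip()]
    # Lines whose first character is whitespace count as indented.
    indented = [ln for ln in non_blank if ln[0] in " \t"]
    tab_indented = [ln for ln in indented if ln[0] == "\t"]
    # Assumption: C-like line comments that begin with "//".
    comments = [ln for ln in non_blank if ln.lstrip().startswith("//")]

    return StyleFeatures(
        mean_line_length=sum(len(ln) for ln in lines) / len(lines),
        blank_line_ratio=(len(lines) - len(non_blank)) / len(lines),
        tab_indent_ratio=len(tab_indented) / len(indented) if indented else 0.0,
        comment_line_ratio=len(comments) / len(non_blank) if non_blank else 0.0,
    )


if __name__ == "__main__":
    sample = "int main() {\n\t// entry\n\treturn 0;\n\n}\n"
    print(extract_style_features(sample))
```

Records of this kind, one per solution, could then be passed to a clustering algorithm or a supervised classifier of the reader's choice. Which algorithms and features the study itself used is not stated in the abstract.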