The appropriate use of design patterns is an important measure of software quality in object-oriented applications. Tools exist to detect design pattern usage in Java source files, and their detection mechanisms have been honed through supervised machine learning techniques that require large datasets of labelled files. However, manually labelling these files is tedious when the team of labellers is small, and prone to conflicting opinions between labellers when the team is large. We therefore present CodeLabeller, a web-based tool that aims to make labelling Java source files at scale more efficient by improving the data collection process throughout, and to improve the reliability of responses by requiring each labeller to attach a confidence rating to each response. We test CodeLabeller by constructing a corpus of over a thousand source files obtained from a large collection of open-source Java projects, and labelling each source file with its design patterns (if any) and a summary. This paper discusses the motivation behind the creation of CodeLabeller, demonstrates the tool and its UI, and describes its implementation, its benefits and, lastly, some ideas for future improvements. A demo version of CodeLabeller can be found at: https://codelabeller.org.