Identifiers in the source code of a software system play a vital, though often unnoticed, role in determining the quality of the system. Ambiguous and confusing identifier names not only lead developers to misunderstand the behavior of the code but also increase comprehension time, thereby causing a loss in productivity. Although correcting poor names through rename operations is a viable solution to this problem, renaming is itself an act of rework and is not immune to defect injection. In this study, we aim to understand the motivations that drive developers to name and rename identifiers and the decisions they make in choosing a name. Based on our results, we propose the development of a linguistic model that determines identifier names based on the behavior of the identifier. As a prerequisite to constructing the model, we conduct multiple studies to determine the features that should feed into it. In this paper, we discuss findings from our completed studies and justify the continuation of research on this topic through further studies.