Software bugs are an ever-present concern for developers, and patching them is costly and often requires complex operations. In contrast, introducing bugs can be effortless: even a simple mutation can easily break the Program Under Test (PUT). Existing research has largely considered these two opposed activities separately, either trying to automatically generate realistic patches to help developers, or trying to find realistic bugs to simulate and prevent future defects. Despite the fundamental differences between them, we hypothesise that patches and faults do not differ syntactically from each other when considered simply as code changes. To examine this hypothesis systematically, we investigate the relationship between patches and buggy commits, both manually and automatically generated, using clustering and pattern analysis. A large-scale empirical evaluation reveals that up to 70% of patches and faults can be clustered together based on the similarity between their lexical patterns; further, 44% of the code changes can be abstracted into identical change patterns. Moreover, we investigate whether code mutation tools can be used as Automated Program Repair (APR) tools, and APR tools as code mutation tools. In both cases, the inverted use of mutation and APR tools can perform surprisingly well, or even better, when compared to their original, intended uses. For example, 89% of the patches found by SequenceR, a deep-learning-based APR tool, can also be found by its inversion, i.e., a model trained with faults rather than patches. Similarly, a real-fault coupling study of mutants reveals that TBar, a template-based APR tool, can generate 14% and 3% more fault couplings than the traditional mutation tools PIT and Major, respectively, when used as a mutation tool.
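To make the idea of "identical change patterns" concrete, the following is a minimal sketch of how a patch and a fault might reduce to the same abstract pattern once identifiers and literals are masked. It is an illustration only, not the abstraction procedure used in the study: the tokenizer, the keyword list, and the names `abstract_tokens` and `change_pattern` are all assumptions introduced here.

```python
import re

# Hypothetical, simplified tokenizer for Java-like snippets (an assumption,
# not the tokenization used in the study).
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+(?:\.\d+)?|[<>=!]=|&&|\|\||\S")
KEYWORDS = {"if", "else", "for", "while", "return", "null", "true", "false"}


def abstract_tokens(code):
    """Mask identifiers as ID and numeric literals as NUM; keep everything else."""
    out = []
    for tok in TOKEN_RE.findall(code):
        if tok in KEYWORDS:
            out.append(tok)
        elif re.fullmatch(r"[A-Za-z_]\w*", tok):
            out.append("ID")
        elif re.fullmatch(r"\d+(?:\.\d+)?", tok):
            out.append("NUM")
        else:
            out.append(tok)
    return tuple(out)


def change_pattern(before, after):
    """Represent a code change as a pair of abstracted token sequences."""
    return (abstract_tokens(before), abstract_tokens(after))


# A made-up patch and a made-up fault that both relax `<` to `<=`.
patch = ("if (i < n)", "if (i <= n)")
fault = ("if (count < limit)", "if (count <= limit)")

same_pattern = change_pattern(*patch) == change_pattern(*fault)
print(same_pattern)
```

Under this masking, the two changes are indistinguishable as code changes even though one repairs a program and the other breaks it, which is the kind of syntactic overlap the hypothesis concerns.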