Automated program repair (APR) techniques have achieved conspicuous progress, and are now capable of producing genuinely correct fixes in scenarios that were well beyond their capabilities only a few years ago. Nevertheless, even when an APR technique can find a correct fix for a bug, it still runs the risk of ranking the fix lower than other patches that are plausible (they pass all available tests) but incorrect. This can seriously hurt the technique's practical effectiveness, as the user will have to peruse a larger number of patches before finding the correct one. This paper presents PrevaRank, a technique that ranks plausible patches produced by any APR technique according to their feature similarity with historical programmer-written fixes for similar bugs. PrevaRank implements simple heuristics, which help make it scalable and applicable to any APR tool that produces plausible patches. In our experimental evaluation, after training PrevaRank on the fix history of 81 open-source Java projects, we used it to rank patches produced by 8 Java APR tools on 168 Defects4J bugs. PrevaRank consistently improved the ranking of correct fixes: for example, it ranked a correct fix within the top-3 positions in 27% more cases than the original tools did. Other experimental results indicate that PrevaRank works robustly with a variety of APR tools and bugs, with negligible overhead.