We consider repair tasks: given a critic (e.g., a compiler) that assesses the quality of an input, the goal is to train a fixer that converts a bad example (e.g., code with syntax errors) into a good one (e.g., code with no syntax errors). Existing work creates training data consisting of (bad, good) pairs by corrupting good examples using heuristics (e.g., dropping tokens). However, fixers trained on this synthetically generated data do not extrapolate well to the real distribution of bad inputs. To bridge this gap, we propose a new training approach, Break-It-Fix-It (BIFI), which has two key ideas: (i) we use the critic to check the fixer's outputs on real bad inputs and add critic-approved (fixed) outputs to the training data, and (ii) we train a breaker to generate realistic bad code from good code. Based on these ideas, we iteratively update the breaker and the fixer while using them in conjunction to generate more paired data. We evaluate BIFI on two code repair datasets: GitHub-Python, a new dataset we introduce where the goal is to repair Python code with AST parse errors; and DeepFix, where the goal is to repair C code with compiler errors. BIFI outperforms existing methods, obtaining 90.5% repair accuracy on GitHub-Python (+28.5%) and 71.7% on DeepFix (+5.6%). Notably, BIFI does not require any labeled data; we hope it will be a strong starting point for unsupervised learning of various repair tasks.
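The two key ideas above can be sketched as one round of critic-verified data generation. This is a minimal illustration, not the paper's implementation: `fixer` and `breaker` stand in for the learned models (here passed as plain functions), and Python's `ast.parse` plays the role of the critic, as in the GitHub-Python setting.

```python
import ast


def critic(code: str) -> bool:
    """Critic for the GitHub-Python setting: True iff the code has no AST parse errors."""
    try:
        ast.parse(code)
        return True
    except SyntaxError:
        return False


def bifi_round(fixer, breaker, real_bad, good):
    """One BIFI round (sketch). Returns new (bad, good) training pairs.

    (i)  Run the fixer on real bad inputs; keep only pairs whose
         fixed output the critic accepts.
    (ii) Run the breaker on good code; keep only outputs the critic
         rejects, i.e., realistic bad code paired with its good source.
    """
    pairs = []
    for bad in real_bad:
        fixed = fixer(bad)
        if critic(fixed):
            pairs.append((bad, fixed))
    for g in good:
        broken = breaker(g)
        if not critic(broken):
            pairs.append((broken, g))
    return pairs


# Toy stand-ins for the learned models (hypothetical, for illustration only):
def toy_fixer(code):
    return code.replace("def f()\n", "def f():\n")


def toy_breaker(code):
    return code.replace(":", "", 1)


new_pairs = bifi_round(
    toy_fixer,
    toy_breaker,
    real_bad=["def f()\n    return 1\n"],
    good=["def g():\n    pass\n"],
)
# Both pairs pass the critic checks, so both are added to the training data.
```

In the full method, the fixer is then retrained on these critic-verified pairs and the breaker on the reversed pairs, and the round repeats; the critic filtering is what keeps the growing dataset clean without any labeled data.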