We consider repair tasks: given a critic (e.g., compiler) that assesses the quality of an input, the goal is to train a fixer that converts a bad example (e.g., code with syntax errors) into a good one (e.g., code with no errors). Existing works create training data consisting of (bad, good) pairs by corrupting good examples using heuristics (e.g., dropping tokens). However, fixers trained on this synthetically-generated data do not extrapolate well to the real distribution of bad inputs. To bridge this gap, we propose a new training approach, Break-It-Fix-It (BIFI), which has two key ideas: (i) we use the critic to check a fixer's output on real bad inputs and add good (fixed) outputs to the training data, and (ii) we train a breaker to generate realistic bad code from good code. Based on these ideas, we iteratively update the breaker and the fixer while using them in conjunction to generate more paired data. We evaluate BIFI on two code repair datasets: GitHub-Python, a new dataset we introduce where the goal is to repair Python code with AST parse errors; and DeepFix, where the goal is to repair C code with compiler errors. BIFI outperforms existing methods, obtaining 90.5% repair accuracy on GitHub-Python (+28.5%) and 71.7% on DeepFix (+5.6%). Notably, BIFI does not require any labeled data; we hope it will be a strong starting point for unsupervised learning of various repair tasks.
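The two key ideas above can be illustrated with a minimal sketch of one BIFI data-generation round. This is a hypothetical illustration, not the authors' implementation: `critic`, `fix`, and `brk` are placeholder callables standing in for the compiler/parser check, the learned fixer, and the learned breaker.

```python
def bifi_round(critic, fix, brk, real_bad, real_good):
    """One round of BIFI paired-data generation (illustrative sketch).

    critic(x) -> bool : True if x is good (e.g., the code parses/compiles)
    fix(x)    -> str  : the fixer's attempted repair of a bad input x
    brk(x)    -> str  : the breaker's corruption of a good input x
    Returns critic-verified (bad, good) training pairs.
    """
    pairs = []
    # (i) Run the fixer on real bad inputs; keep only the outputs the
    # critic verifies as good, paired with the original bad inputs.
    for bad in real_bad:
        fixed = fix(bad)
        if critic(fixed):
            pairs.append((bad, fixed))
    # (ii) Run the breaker on real good code; keep only outputs the critic
    # confirms are actually bad, paired with their clean originals.
    for good in real_good:
        broken = brk(good)
        if not critic(broken):
            pairs.append((broken, good))
    # In the full procedure, the fixer is retrained on these pairs, the
    # breaker on the reversed pairs, and the round repeats.
    return pairs
```

As a toy usage example, take strings as "programs" and let the critic demand a trailing semicolon: with `critic = lambda s: s.endswith(";")`, `fix = lambda s: s + ";"`, and `brk = lambda s: s.rstrip(";")`, calling `bifi_round(critic, fix, brk, ["x=1"], ["y=2;"])` yields the verified pairs `[("x=1", "x=1;"), ("y=2", "y=2;")]`.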