In Grammatical Error Correction (GEC), sequence labeling models enjoy faster inference than sequence-to-sequence models. However, inference in sequence labeling GEC models is an iterative process: sentences are passed through the model for multiple rounds of correction, so the model is exposed to sentences with progressively fewer errors at each round. Traditional GEC models learn from sentences with fixed error rates, and coupling this with the iterative correction process creates a mismatch between training and inference that degrades final performance. To address this mismatch, we propose a GAN-like sequence labeling model consisting of a grammatical error detector as the discriminator and a grammatical error labeler with Gumbel-Softmax sampling as the generator. Because the errors are sampled from real error distributions, they are more genuine than traditionally synthesized GEC errors, which alleviates the aforementioned mismatch and allows for better training. Results on several evaluation benchmarks demonstrate that our proposed approach is effective and improves over the previous state-of-the-art baseline.
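The key mechanism named in the abstract is Gumbel-Softmax sampling, which lets the generator draw discrete edit labels while still passing gradients back from the discriminator. The following is a minimal, illustrative sketch of that idea, not the authors' implementation: the module names, hidden size, label inventory, and toy encoder states are all assumptions made for the example; only torch.nn.functional.gumbel_softmax is a standard library call.

```python
# Minimal sketch (not the authors' code): differentiable label sampling with
# Gumbel-Softmax, so a generator's discrete edit labels can receive gradients
# from a discriminator. All names, sizes, and the label set are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_LABELS = 5   # assumed toy edit inventory, e.g. KEEP, DELETE, a few REPLACEs
HIDDEN = 64      # assumed encoder hidden size


class ErrorLabeler(nn.Module):
    """Generator: predicts a per-token distribution over edit labels."""
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(HIDDEN, NUM_LABELS)

    def forward(self, token_states, tau=1.0):
        logits = self.proj(token_states)                 # (batch, seq, labels)
        # hard=True yields one-hot samples in the forward pass while keeping
        # gradients via the straight-through estimator.
        return F.gumbel_softmax(logits, tau=tau, hard=True)


class ErrorDetector(nn.Module):
    """Discriminator: scores whether each token/sampled-label pair looks erroneous."""
    def __init__(self):
        super().__init__()
        self.score = nn.Linear(HIDDEN + NUM_LABELS, 1)

    def forward(self, token_states, label_samples):
        joint = torch.cat([token_states, label_samples], dim=-1)
        return torch.sigmoid(self.score(joint))


# Toy usage: gradients flow from the detector's score back into the labeler.
labeler, detector = ErrorLabeler(), ErrorDetector()
states = torch.randn(2, 10, HIDDEN)                      # stand-in encoder outputs
samples = labeler(states, tau=0.7)
loss = detector(states, samples).mean()
loss.backward()                                          # labeler.proj now has gradients
```

In this kind of setup, the hard one-hot samples keep the labeler's outputs discrete at inference time, while the straight-through gradients allow adversarial training against the detector.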