Focus Is What You Need For Chinese Grammatical Error Correction

Chinese Grammatical Error Correction (CGEC) aims to automatically detect and correct grammatical errors in Chinese text. Researchers have long regarded CGEC as a task with a certain degree of uncertainty: an ungrammatical sentence may often have multiple valid references. However, we argue that although this hypothesis is very reasonable, it is too demanding for the current mainstream models. In this paper, we first find that multiple references do not actually bring positive gains to model training. On the contrary, a CGEC model benefits when it can attend to a small but essential portion of the data during training. Furthermore, we propose a simple yet effective training strategy called OneTarget that improves the ability of CGEC models to focus and thereby improves CGEC performance. Extensive experiments and detailed analyses support both our finding and the effectiveness of the proposed method.
