ChatGPT, a large-scale language model based on the advanced GPT-3.5 architecture, has shown remarkable potential in various Natural Language Processing (NLP) tasks. However, there is currently a lack of comprehensive studies exploring its potential in the area of Grammatical Error Correction (GEC). To showcase its capabilities in GEC, we design zero-shot chain-of-thought (CoT) and few-shot CoT settings for ChatGPT using in-context learning. Our evaluation assesses ChatGPT's performance on five official test sets across three different languages, along with three document-level GEC test sets in English. Our experimental results and human evaluations demonstrate that ChatGPT has excellent error detection capabilities and corrects errors freely, producing highly fluent corrected sentences, likely owing to its tendency to over-correct rather than adhere to the principle of minimal edits. Additionally, its performance in non-English and low-resource settings highlights its potential for multilingual GEC tasks. However, further analysis of various error types at the document level shows that ChatGPT cannot effectively correct agreement, coreference, and tense errors across sentences, nor errors that cross sentence boundaries.
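To make the zero-shot CoT setting concrete, the sketch below shows how a single-sentence GEC query to ChatGPT could look. It assumes the OpenAI chat completions API; the prompt wording, the model name "gpt-3.5-turbo", and the helper function correct_sentence are illustrative assumptions rather than the paper's exact configuration.

```python
# Minimal sketch of a zero-shot chain-of-thought (CoT) GEC prompt for ChatGPT.
# The prompt text and model name are illustrative assumptions, not the
# paper's verbatim setup.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def correct_sentence(sentence: str) -> str:
    """Ask ChatGPT to correct grammatical errors in a single sentence."""
    prompt = (
        "Correct any grammatical errors in the following sentence. "
        "First think step by step about which words are erroneous, "
        "then output only the corrected sentence.\n\n"
        f"Sentence: {sentence}"
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # deterministic output for evaluation
    )
    return response.choices[0].message.content.strip()


print(correct_sentence("She go to school every days."))
```

A few-shot CoT variant would prepend several (erroneous sentence, reasoning, correction) demonstrations to the same prompt before the target sentence.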