ChatGPT shows remarkable capabilities for machine translation (MT). Several prior studies have shown that it achieves comparable results to commercial systems for high-resource languages, but lags behind in complex tasks, e.g, low-resource and distant-language-pairs translation. However, they usually adopt simple prompts which can not fully elicit the capability of ChatGPT. In this report, we aim to further mine ChatGPT's translation ability by revisiting several aspects: temperature, task information, and domain information, and correspondingly propose two (simple but effective) prompts: Task-Specific Prompts (TSP) and Domain-Specific Prompts (DSP). We show that: 1) The performance of ChatGPT depends largely on temperature, and a lower temperature usually can achieve better performance; 2) Emphasizing the task information further improves ChatGPT's performance, particularly in complex MT tasks; 3) Introducing domain information can elicit ChatGPT's generalization ability and improve its performance in the specific domain; 4) ChatGPT tends to generate hallucinations for non-English-centric MT tasks, which can be partially addressed by our proposed prompts but still need to be highlighted for the MT/NLP community. We also explore the effects of advanced in-context learning strategies and find a (negative but interesting) observation: the powerful chain-of-thought prompt leads to word-by-word translation behavior, thus bringing significant translation degradation.
翻译:在高级资源语言翻译方面,ChatGPT 展现了出色的机器翻译能力。多项研究表明,它在可比较的结果方面与商业系统相当,但在复杂任务中表现不佳,例如低资源和远距离语言对翻译。然而,以前的研究通常采用简单的提示词,不能完全引发 ChatGPT 的能力。本文旨在通过重新审视几个方面来进一步发掘 ChatGPT 的翻译能力:温度、任务信息和领域信息,并相应地提出两个(简单但有效的)提示:任务特定提示(TSP)和领域特定提示(DSP)。我们发现:1)ChatGPT 的性能很大程度上取决于温度,较低的温度通常可以实现更好的性能;2)强调任务信息可以进一步提高 ChatGPT 的性能,特别是在复杂的机器翻译任务中;3)引入领域信息可以引发 ChatGPT 的泛化能力,并提高其在特定领域中的性能;4)ChatGPT 倾向于在非英语中心的机器翻译任务中产生幻觉,这可以部分通过我们提出的提示来解决,但仍需要在机器翻译/自然语言处理社区中加以强调。我们还探讨了高级上下文学习策略的影响,并发现了一个(负面但有趣的)观察结果:强大的思维链式提示会导致逐字逐句的翻译行为,从而带来显著的翻译退化。