大型语言模型有效地利用文档级上下文进行文学翻译，但关键错误仍然存在 (Large language models effectively leverage document-level context for literary translation, but critical errors persist) - 专知论文

会员服务 ·

0

大型语言模型 · 注释（编程） · 语言模型 · 上下文 · 片段 ·

2023 年 4 月 7 日

Large language models effectively leverage document-level context for literary translation, but critical errors persist

翻译：大型语言模型有效地利用文档级上下文进行文学翻译，但关键错误仍然存在

Marzena Karpinska,Mohit Iyyer

from arxiv, preprint (30 pages)

Large language models (LLMs) are competitive with the state of the art on a wide range of sentence-level translation datasets. However, their ability to translate paragraphs and documents remains unexplored because evaluation in these settings is costly and difficult. We show through a rigorous human evaluation that asking the Gpt-3.5 (text-davinci-003) LLM to translate an entire literary paragraph (e.g., from a novel) at once results in higher-quality translations than standard sentence-by-sentence translation across 18 linguistically-diverse language pairs (e.g., translating into and out of Japanese, Polish, and English). Our evaluation, which took approximately 350 hours of effort for annotation and analysis, is conducted by hiring translators fluent in both the source and target language and asking them to provide both span-level error annotations as well as preference judgments of which system's translations are better. We observe that discourse-level LLM translators commit fewer mistranslations, grammar errors, and stylistic inconsistencies than sentence-level approaches. With that said, critical errors still abound, including occasional content omissions, and a human translator's intervention remains necessary to ensure that the author's voice remains intact. We publicly release our dataset and error annotations to spur future research on evaluation of document-level literary translation.

翻译：大型语言模型（LLMs）在广泛的句子级翻译数据集上与最先进技术竞争。然而，它们在翻译段落和文档方面的能力仍未被探索，因为在这些环境中进行评估是昂贵且困难的。通过经过严格的人工评估，我们表明要求Gpt-3.5（text-davinci-003）LLM一次翻译整个文学段落（例如，从小说中）所产生的翻译比跨18种语言（例如，将日语，波兰语和英语翻译成和翻译出）进行标准逐句翻译会产生更高质量的翻译。我们的评估大约需要350小时的工作量进行注释和分析，通过招募精通源语言和目标语言的翻译人员，并要求他们提供跨度级别的错误注释以及判断哪个系统的翻译更好。我们观察到，基于文本片段实现的LLM翻译人员比基于句子的方法在语篇级别上出现更少的误译现象，语法错误和文体不一致问题。但是，关键错误仍然存在，包括偶尔的内容省略，人类翻译干预仍然是必要的，以确保作者的声音保持不变。我们公开发布我们的数据集和错误注释，以推动未来针对文档级文学翻译评估的研究。

0

相关内容

大型语言模型

大型语言模型

【USC-Aaron Chan博士答辩Slides】可信自然语言处理机器解释的生成与利用, 242页ppt，Generating and Utilizing Machine Explanations for Trustworthy NLP

【USC-Aaron Chan博士答辩Slides】可信自然语言处理机器解释的生成与利用, 242页ppt，Generating and Utilizing Machine Explanations for Trustworthy NLP

专知会员服务

16+阅读 · 2022年3月13日

【Facebook AI】无监督机器翻译，336页ppt，Unsupervised Machine Translation

专知会员服务

19+阅读 · 2020年11月17日

【微软】大型神经语言模型的对抗性训练，Adversarial Training for Large Neural Language Models

【微软】大型神经语言模型的对抗性训练，Adversarial Training for Large Neural Language Models

专知会员服务

51+阅读 · 2020年5月3日

【论文翻译】NLP注意力机制综述论文翻译，Attention, please! A Critical Review of Neural Attention Models in Natural Language Processing

【论文翻译】NLP注意力机制综述论文翻译，Attention, please! A Critical Review of Neural Attention Models in Natural Language Processing

专知会员服务

96+阅读 · 2020年4月18日

【Google ICLR2020论文】嵌入式大规模检索的预训练任务，Pre-training Tasks for Embedding-based Large-scale Retrieval

【Google ICLR2020论文】嵌入式大规模检索的预训练任务，Pre-training Tasks for Embedding-based Large-scale Retrieval

专知会员服务

28+阅读 · 2020年2月12日

2019年自然语言处理NLP亮点总结，29页pdf，NLP Year in Review — 2019 NLP highlights for the year 2019.

2019年自然语言处理NLP亮点总结，29页pdf，NLP Year in Review — 2019 NLP highlights for the year 2019.

专知会员服务

69+阅读 · 2020年1月2日

【论文推荐】将机器语言模型扩展到人类级别的语言理解，Extending Machine Language Models toward Human-Level Language Understanding

【论文推荐】将机器语言模型扩展到人类级别的语言理解，Extending Machine Language Models toward Human-Level Language Understanding

专知会员服务

18+阅读 · 2019年12月14日

【NLP模型的跨语言/跨领域迁移】《Transferring NLP models across languages and domains》

【NLP模型的跨语言/跨领域迁移】《Transferring NLP models across languages and domains》

专知会员服务

43+阅读 · 2019年11月25日

【NLP| 推荐文章】基于知识库的问答系统关键技术综述（Core techniques of question answering systems over knowledge bases：a survey）

专知会员服务

47+阅读 · 2019年11月24日

【AAAI 2019 Tutorial】超越单词的神经向量表示:句子和文档嵌入（Neural Vector Representations beyond Words: Sentence and Document Embeddings），Gerard de Melo

【AAAI 2019 Tutorial】超越单词的神经向量表示:句子和文档嵌入（Neural Vector Representations beyond Words: Sentence and Document Embeddings），Gerard de Melo

专知会员服务

19+阅读 · 2019年11月18日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

【ACL2020放榜!】事件抽取、关系抽取、NER、Few-Shot 相关论文整理

【ACL2020放榜!】事件抽取、关系抽取、NER、Few-Shot 相关论文整理

深度学习自然语言处理

18+阅读 · 2020年5月22日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

最新NLP论文阅读列表，包括对话、问答、摘要、翻译等（附资源）

最新NLP论文阅读列表，包括对话、问答、摘要、翻译等（附资源）

THU数据派

11+阅读 · 2019年3月25日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

【论文推荐】最新七篇图像描述生成相关论文—CNN+CNN、对抗样本、显著性和上下文注意力、条件生成对抗网络、风格化

【论文推荐】最新七篇图像描述生成相关论文—CNN+CNN、对抗样本、显著性和上下文注意力、条件生成对抗网络、风格化

专知

25+阅读 · 2018年5月28日

【论文推荐】最新五篇信息抽取相关论文—端到端深度模型、调研、聊天机器人、自注意力、科学文本

【论文推荐】最新五篇信息抽取相关论文—端到端深度模型、调研、聊天机器人、自注意力、科学文本

专知

13+阅读 · 2018年4月4日

【论文推荐】最新六篇自动问答（QA）相关论文—复杂序列问答、注意力机制、长短时记忆、文本推理、多因素注意力、主动的问答智能体

【论文推荐】最新六篇自动问答（QA）相关论文—复杂序列问答、注意力机制、长短时记忆、文本推理、多因素注意力、主动的问答智能体

专知

18+阅读 · 2018年2月22日

【论文推荐】最新5篇情感分析相关论文—深度学习情感分析综述、情感分析语料库、情感预测性、上下文和位置感知的因子分解模型、LSTM

【论文推荐】最新5篇情感分析相关论文—深度学习情感分析综述、情感分析语料库、情感预测性、上下文和位置感知的因子分解模型、LSTM

专知

55+阅读 · 2018年1月28日

基于篇章语义的文档级统计机器翻译研究

国家自然科学基金

0+阅读 · 2013年12月31日

量子群与Tewilliger代数的相关问题研究

国家自然科学基金

1+阅读 · 2013年12月31日

RIP2调控CD40-NF-кB信号通路在血管内皮细胞损伤中的作用机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

汉语句法分析中的自动歧义识别和分类问题研究

国家自然科学基金

0+阅读 · 2013年12月31日

Wnt/β-catenin通路介导RELMβ调控糖尿病肾病系膜细胞增殖的机制研究

国家自然科学基金

1+阅读 · 2013年12月31日

基于海量语料自然标注信息的汉语自然语块分析

国家自然科学基金

0+阅读 · 2013年12月31日

跨语言信息检索中的机器翻译研究

国家自然科学基金

2+阅读 · 2011年12月31日

纳西-汉语双语语料库构建与翻译方法研究

国家自然科学基金

0+阅读 · 2011年12月31日

基于语言理解的机器翻译方法研究

国家自然科学基金

2+阅读 · 2009年12月31日

ERK1/2，p38MAPK在CD24调节结直肠癌增殖和侵袭过程中的作用及调控的分子机制

国家自然科学基金

0+阅读 · 2009年12月31日

Science in the Era of ChatGPT, Large Language Models and AI: Challenges for Research Ethics Review and How to Respond

Arxiv

0+阅读 · 2023年5月24日

Boosting Cross-lingual Transferability in Multilingual Models via In-Context Learning

Arxiv

0+阅读 · 2023年5月24日

Not All Metrics Are Guilty: Improving NLG Evaluation with LLM Paraphrasing

Arxiv

0+阅读 · 2023年5月24日

Hierarchical Prompting Assists Large Language Model on Web Navigation

Arxiv

0+阅读 · 2023年5月23日

Large Language Models are Better Reasoners with Self-Verification

Arxiv

0+阅读 · 2023年5月23日

Large Language Models as Commonsense Knowledge for Large-Scale Task Planning

Arxiv

0+阅读 · 2023年5月23日

Can ChatGPT Detect Intent? Evaluating Large Language Models for Spoken Language Understanding

Arxiv

0+阅读 · 2023年5月22日

Towards Expert-Level Medical Question Answering with Large Language Models

Arxiv

26+阅读 · 2023年5月16日

A Survey of Large Language Models

A Survey of Large Language Models

Arxiv

476+阅读 · 2023年3月31日

Latent Relation Language Models

Arxiv

21+阅读 · 2019年8月21日

VIP会员

文章信息

相关主题

大型语言模型

注释（编程）

相关VIP内容

【USC-Aaron Chan博士答辩Slides】可信自然语言处理机器解释的生成与利用, 242页ppt，Generating and Utilizing Machine Explanations for Trustworthy NLP

【USC-Aaron Chan博士答辩Slides】可信自然语言处理机器解释的生成与利用, 242页ppt，Generating and Utilizing Machine Explanations for Trustworthy NLP

专知会员服务

16+阅读 · 2022年3月13日

【Facebook AI】无监督机器翻译，336页ppt，Unsupervised Machine Translation

专知会员服务

19+阅读 · 2020年11月17日

【微软】大型神经语言模型的对抗性训练，Adversarial Training for Large Neural Language Models

【微软】大型神经语言模型的对抗性训练，Adversarial Training for Large Neural Language Models

专知会员服务

51+阅读 · 2020年5月3日

【论文翻译】NLP注意力机制综述论文翻译，Attention, please! A Critical Review of Neural Attention Models in Natural Language Processing

【论文翻译】NLP注意力机制综述论文翻译，Attention, please! A Critical Review of Neural Attention Models in Natural Language Processing

专知会员服务

96+阅读 · 2020年4月18日

【Google ICLR2020论文】嵌入式大规模检索的预训练任务，Pre-training Tasks for Embedding-based Large-scale Retrieval

【Google ICLR2020论文】嵌入式大规模检索的预训练任务，Pre-training Tasks for Embedding-based Large-scale Retrieval

专知会员服务

28+阅读 · 2020年2月12日

2019年自然语言处理NLP亮点总结，29页pdf，NLP Year in Review — 2019 NLP highlights for the year 2019.

2019年自然语言处理NLP亮点总结，29页pdf，NLP Year in Review — 2019 NLP highlights for the year 2019.

专知会员服务

69+阅读 · 2020年1月2日

【论文推荐】将机器语言模型扩展到人类级别的语言理解，Extending Machine Language Models toward Human-Level Language Understanding

【论文推荐】将机器语言模型扩展到人类级别的语言理解，Extending Machine Language Models toward Human-Level Language Understanding

专知会员服务

18+阅读 · 2019年12月14日

【NLP模型的跨语言/跨领域迁移】《Transferring NLP models across languages and domains》

【NLP模型的跨语言/跨领域迁移】《Transferring NLP models across languages and domains》

专知会员服务

43+阅读 · 2019年11月25日

【NLP| 推荐文章】基于知识库的问答系统关键技术综述（Core techniques of question answering systems over knowledge bases：a survey）

专知会员服务

47+阅读 · 2019年11月24日

【AAAI 2019 Tutorial】超越单词的神经向量表示:句子和文档嵌入（Neural Vector Representations beyond Words: Sentence and Document Embeddings），Gerard de Melo

【AAAI 2019 Tutorial】超越单词的神经向量表示:句子和文档嵌入（Neural Vector Representations beyond Words: Sentence and Document Embeddings），Gerard de Melo

专知会员服务

19+阅读 · 2019年11月18日

热门VIP内容

开通专知VIP会员享更多权益服务

《小型无人机系统侦测追踪技术：声学、计算机视觉与深度学习融合方案》最新98页

《"牧羊人网格"拦截策略：实现无人机集群可靠拦截的新范式》

光纤无人机：反无人机系统的重大挑战

《作战建模与仿真实证研究》

相关资讯

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

【ACL2020放榜!】事件抽取、关系抽取、NER、Few-Shot 相关论文整理

【ACL2020放榜!】事件抽取、关系抽取、NER、Few-Shot 相关论文整理

深度学习自然语言处理

18+阅读 · 2020年5月22日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

最新NLP论文阅读列表，包括对话、问答、摘要、翻译等（附资源）

最新NLP论文阅读列表，包括对话、问答、摘要、翻译等（附资源）

THU数据派

11+阅读 · 2019年3月25日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

【论文推荐】最新七篇图像描述生成相关论文—CNN+CNN、对抗样本、显著性和上下文注意力、条件生成对抗网络、风格化

【论文推荐】最新七篇图像描述生成相关论文—CNN+CNN、对抗样本、显著性和上下文注意力、条件生成对抗网络、风格化

专知

25+阅读 · 2018年5月28日

【论文推荐】最新五篇信息抽取相关论文—端到端深度模型、调研、聊天机器人、自注意力、科学文本

【论文推荐】最新五篇信息抽取相关论文—端到端深度模型、调研、聊天机器人、自注意力、科学文本

专知

13+阅读 · 2018年4月4日

【论文推荐】最新六篇自动问答（QA）相关论文—复杂序列问答、注意力机制、长短时记忆、文本推理、多因素注意力、主动的问答智能体

【论文推荐】最新六篇自动问答（QA）相关论文—复杂序列问答、注意力机制、长短时记忆、文本推理、多因素注意力、主动的问答智能体

专知

18+阅读 · 2018年2月22日

【论文推荐】最新5篇情感分析相关论文—深度学习情感分析综述、情感分析语料库、情感预测性、上下文和位置感知的因子分解模型、LSTM

【论文推荐】最新5篇情感分析相关论文—深度学习情感分析综述、情感分析语料库、情感预测性、上下文和位置感知的因子分解模型、LSTM

专知

55+阅读 · 2018年1月28日

相关论文

Science in the Era of ChatGPT, Large Language Models and AI: Challenges for Research Ethics Review and How to Respond

Arxiv

0+阅读 · 2023年5月24日

Boosting Cross-lingual Transferability in Multilingual Models via In-Context Learning

Arxiv

0+阅读 · 2023年5月24日

Not All Metrics Are Guilty: Improving NLG Evaluation with LLM Paraphrasing

Arxiv

0+阅读 · 2023年5月24日

Hierarchical Prompting Assists Large Language Model on Web Navigation

Arxiv

0+阅读 · 2023年5月23日

Large Language Models are Better Reasoners with Self-Verification

Arxiv

0+阅读 · 2023年5月23日

Large Language Models as Commonsense Knowledge for Large-Scale Task Planning

Arxiv

0+阅读 · 2023年5月23日

Can ChatGPT Detect Intent? Evaluating Large Language Models for Spoken Language Understanding

Arxiv

0+阅读 · 2023年5月22日

Towards Expert-Level Medical Question Answering with Large Language Models

Arxiv

26+阅读 · 2023年5月16日

A Survey of Large Language Models

A Survey of Large Language Models

Arxiv

476+阅读 · 2023年3月31日

Latent Relation Language Models

Arxiv

21+阅读 · 2019年8月21日

相关基金

基于篇章语义的文档级统计机器翻译研究

国家自然科学基金

0+阅读 · 2013年12月31日

量子群与Tewilliger代数的相关问题研究

国家自然科学基金

1+阅读 · 2013年12月31日

RIP2调控CD40-NF-кB信号通路在血管内皮细胞损伤中的作用机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

汉语句法分析中的自动歧义识别和分类问题研究

国家自然科学基金

0+阅读 · 2013年12月31日

Wnt/β-catenin通路介导RELMβ调控糖尿病肾病系膜细胞增殖的机制研究

国家自然科学基金

1+阅读 · 2013年12月31日

基于海量语料自然标注信息的汉语自然语块分析

国家自然科学基金

0+阅读 · 2013年12月31日

跨语言信息检索中的机器翻译研究

国家自然科学基金

2+阅读 · 2011年12月31日

纳西-汉语双语语料库构建与翻译方法研究

国家自然科学基金

0+阅读 · 2011年12月31日

基于语言理解的机器翻译方法研究

国家自然科学基金

2+阅读 · 2009年12月31日

ERK1/2，p38MAPK在CD24调节结直肠癌增殖和侵袭过程中的作用及调控的分子机制

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员