Large language models (LLMs) like ChatGPT and GPT-4 have exhibited remarkable abilities on a wide range of natural language processing (NLP) tasks, including machine translation performed during chat. However, these models are only accessible through restricted APIs, which creates barriers to new research and advancements in the field. Therefore, we propose the $\mathbf{ParroT}$ framework to enhance and regulate the translation abilities during chat based on open-sourced LLMs (i.e., LLaMA-7b, BLOOMZ-7b-mt) and human-written translation and evaluation data. Specifically, ParroT reformulates translation data into the instruction-following style, and introduces a "$\mathbf{Hint}$" field for incorporating extra requirements to regulate the translation process. Accordingly, we propose three instruction types for finetuning ParroT models, namely, translation instruction, contrastive instruction, and error-guided instruction. We finetune either the full model or partial parameters via low-rank adaptation (LoRA). Experiments on Flores subsets and WMT22 test sets suggest that translation instruction improves the translation performance of vanilla LLMs significantly, while error-guided instruction can lead to a further improvement, which demonstrates the importance of learning from low-quality translations annotated by humans. Meanwhile, the ParroT models also preserve their ability on general tasks when the Alpaca multi-task dataset is included in finetuning. Please refer to our GitHub project for more implementation details: https://github.com/wxjiao/ParroT
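To make the data reformulation concrete, the sketch below shows one way a parallel sentence pair could be wrapped into an instruction-following record with an optional "Hint" field. This is a minimal illustration, not the official ParroT data pipeline: the prompt wording, section markers, and JSON keys are assumptions modeled on Alpaca-style templates, and the exact format used in the paper may differ.

```python
# Illustrative sketch (NOT the official ParroT loader): reformulating a
# translation pair into an instruction-following example with a "Hint" field.
import json

def make_example(src_text, tgt_text, src_lang="German", tgt_lang="English", hint=None):
    """Wrap a parallel sentence pair as an instruction-following record.

    Field names and prompt wording are hypothetical; they stand in for
    whatever template the actual framework uses.
    """
    instruction = f"Translate the following {src_lang} text into {tgt_lang}."
    prompt = f"### Instruction:\n{instruction}\n\n### Input:\n{src_text}\n"
    if hint is not None:
        # The "Hint" field carries extra requirements to regulate translation,
        # e.g. an error annotation for error-guided instructions or a
        # quality preference for contrastive instructions.
        prompt += f"\n### Hint: {hint}\n"
    prompt += "\n### Response:"
    return {"prompt": prompt, "completion": tgt_text}

if __name__ == "__main__":
    ex = make_example("Guten Morgen!", "Good morning!",
                      hint="A translation with no errors.")
    print(json.dumps(ex, indent=2, ensure_ascii=False))
```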
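For the partial-parameter finetuning mentioned above, the following sketch shows a typical LoRA setup with the Hugging Face `peft` library. The base-model checkpoint name, target modules, and hyperparameters are illustrative assumptions, not the settings reported in the paper.

```python
# Minimal LoRA finetuning setup with Hugging Face peft; checkpoint name and
# hyperparameters are illustrative, not the ParroT paper's configuration.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_model = "huggyllama/llama-7b"  # assumption: any causal LM checkpoint works here
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)

# LoRA injects trainable low-rank matrices into selected weight matrices,
# so only a small fraction of parameters is updated during finetuning.
lora_config = LoraConfig(
    r=8,                                  # rank of the low-rank update
    lora_alpha=16,                        # scaling factor for the update
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # reports the small trainable fraction
```

The design trade-off is the one the abstract alludes to: full-model finetuning updates all 7B parameters, while LoRA trains well under 1% of them, trading some capacity for a much smaller memory footprint.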