Sensitivity and Robustness of Large Language Models to Prompt in Japanese - 专知论文

会员服务 ·

0

MoDELS · 语言模型化 · Performer · Prompt · 稳健性 ·

2023 年 5 月 15 日

Sensitivity and Robustness of Large Language Models to Prompt in Japanese

翻译：暂无翻译

Chengguang Gan,Tatsunori Mori

from arxiv, 9 pages, 9 figures

Prompt Engineering has gained significant relevance in recent years, fueled by advancements in pre-trained and large language models. However, a critical issue has been identified within this domain: the lack of sensitivity and robustness of these models towards Prompt Templates, particularly in lesser-studied languages such as Japanese. This paper explores this issue through a comprehensive evaluation of several representative Large Language Models (LLMs) and a widely-utilized pre-trained model(PLM), T5. These models are scrutinized using a benchmark dataset in Japanese, with the aim to assess and analyze the performance of the current multilingual models in this context. Our experimental results reveal startling discrepancies. A simple modification in the sentence structure of the Prompt Template led to a drastic drop in the accuracy of GPT-4 from 49.21 to 25.44. This observation underscores the fact that even the highly performance GPT-4 model encounters significant stability issues when dealing with diverse Japanese prompt templates, rendering the consistency of the model's output results questionable. In light of these findings, we conclude by proposing potential research trajectories to further enhance the development and performance of Large Language Models in their current stage.

翻译：暂无翻译

0

相关内容

MoDELS

ACM/IEEE第23届模型驱动工程语言和系统国际会议，是模型驱动软件和系统工程的首要会议系列，由ACM-SIGSOFT和IEEE-TCSE支持组织。自1998年以来，模型涵盖了建模的各个方面，从语言和方法到工具和应用程序。模特的参加者来自不同的背景，包括研究人员、学者、工程师和工业专业人士。MODELS 2019是一个论坛，参与者可以围绕建模和模型驱动的软件和系统交流前沿研究成果和创新实践经验。今年的版本将为建模社区提供进一步推进建模基础的机会，并在网络物理系统、嵌入式系统、社会技术系统、云计算、大数据、机器学习、安全、开源等新兴领域提出建模的创新应用以及可持续性。官网链接：http://www.modelsconference.org/

NLP必读经典文献100篇

专知会员服务

124+阅读 · 2020年9月8日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

RoBERTa中文预训练模型：RoBERTa for Chinese

RoBERTa中文预训练模型：RoBERTa for Chinese

PaperWeekly

57+阅读 · 2019年9月16日

RoBERTa for Chinese：大规模中文预训练RoBERTa模型

RoBERTa for Chinese：大规模中文预训练RoBERTa模型

AINLP

30+阅读 · 2019年9月8日

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

pytorch-pretrained-BERT：BERT PyTorch实现，可加载Google BERT预训练模型

pytorch-pretrained-BERT：BERT PyTorch实现，可加载Google BERT预训练模型

AINLP

35+阅读 · 2018年11月6日

vae 相关论文表示学习 1

vae 相关论文表示学习 1

CreateAMind

12+阅读 · 2018年9月6日

碳纤维树脂基复合材料电热损伤机理研究

国家自然科学基金

0+阅读 · 2014年12月31日

传染性法氏囊病病毒VP4蛋白与GILZ相互作用抑制I型干扰素表达分子机理的研究

国家自然科学基金

0+阅读 · 2013年12月31日

Doublesex基因在对虾性别决定和分化中的功能研究

国家自然科学基金

0+阅读 · 2013年12月31日

半导体衬底上FeSe薄膜的外延生长及界面超导

国家自然科学基金

0+阅读 · 2013年12月31日

Intraflagellar Transport运输纤毛蛋白的分子机理

国家自然科学基金

0+阅读 · 2012年12月31日

轻稀土RE-Ni-Fe体系相图及相关合金微波吸收机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

蛋白精氨酸甲基转移酶5在儿童急性淋巴细胞白血病细胞表达改变在白血病发病作用的初探以及对个体化治疗的指

国家自然科学基金

0+阅读 · 2011年12月31日

玉米大斑病菌水甘油通道蛋白StFps1基因的克隆与功能研究

国家自然科学基金

0+阅读 · 2011年12月31日

随机延时神经网络的动力学分析

国家自然科学基金

0+阅读 · 2008年12月31日

柑橘砧木耐缺硼胁迫能力差异的分子机理研究

国家自然科学基金

0+阅读 · 2008年12月31日

Biomedical Language Models are Robust to Sub-optimal Tokenization

Arxiv

0+阅读 · 2023年6月30日

Layout and Task Aware Instruction Prompt for Zero-shot Document Image Question Answering

Arxiv

0+阅读 · 2023年6月30日

Eliciting the Translation Ability of Large Language Models via Multilingual Finetuning with Translation Instructions

Arxiv

0+阅读 · 2023年6月30日

Bring Your Own Data! Self-Supervised Evaluation for Large Language Models

Arxiv

0+阅读 · 2023年6月29日

Improving Identity-Robustness for Face Models

Arxiv

0+阅读 · 2023年6月29日

Toward Pioneering Sensors and Features Using Large Language Models in Human Activity Recognition

Arxiv

0+阅读 · 2023年6月28日

Prompting Large Language Models for Zero-Shot Domain Adaptation in Speech Recognition

Arxiv

0+阅读 · 2023年6月28日

A Survey on Large Language Models for Recommendation

Arxiv

12+阅读 · 2023年5月31日

Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond

Arxiv

12+阅读 · 2023年4月26日

On the Opportunities and Risks of Foundation Models

Arxiv

30+阅读 · 2021年8月18日

VIP会员

文章信息

相关主题

语言模型化

相关VIP内容

NLP必读经典文献100篇

专知会员服务

124+阅读 · 2020年9月8日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《俄乌战争中的无人系统：新的战争方式与新兴趋势——来自前线的印象》报告

《海上自主水面船舶远程操作中心：安全可持续运行的多维度分析》

多模态大语言模型下游调优中“保持自我”的重要性

隐身自主无人水下航行器技术如何变革水下作战并重塑海军竞争

相关资讯

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

RoBERTa中文预训练模型：RoBERTa for Chinese

RoBERTa中文预训练模型：RoBERTa for Chinese

PaperWeekly

57+阅读 · 2019年9月16日

RoBERTa for Chinese：大规模中文预训练RoBERTa模型

RoBERTa for Chinese：大规模中文预训练RoBERTa模型

AINLP

30+阅读 · 2019年9月8日

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

pytorch-pretrained-BERT：BERT PyTorch实现，可加载Google BERT预训练模型

pytorch-pretrained-BERT：BERT PyTorch实现，可加载Google BERT预训练模型

AINLP

35+阅读 · 2018年11月6日

vae 相关论文表示学习 1

vae 相关论文表示学习 1

CreateAMind

12+阅读 · 2018年9月6日

相关论文

Biomedical Language Models are Robust to Sub-optimal Tokenization

Arxiv

0+阅读 · 2023年6月30日

Layout and Task Aware Instruction Prompt for Zero-shot Document Image Question Answering

Arxiv

0+阅读 · 2023年6月30日

Eliciting the Translation Ability of Large Language Models via Multilingual Finetuning with Translation Instructions

Arxiv

0+阅读 · 2023年6月30日

Bring Your Own Data! Self-Supervised Evaluation for Large Language Models

Arxiv

0+阅读 · 2023年6月29日

Improving Identity-Robustness for Face Models

Arxiv

0+阅读 · 2023年6月29日

Toward Pioneering Sensors and Features Using Large Language Models in Human Activity Recognition

Arxiv

0+阅读 · 2023年6月28日

Prompting Large Language Models for Zero-Shot Domain Adaptation in Speech Recognition

Arxiv

0+阅读 · 2023年6月28日

A Survey on Large Language Models for Recommendation

Arxiv

12+阅读 · 2023年5月31日

Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond

Arxiv

12+阅读 · 2023年4月26日

On the Opportunities and Risks of Foundation Models

Arxiv

30+阅读 · 2021年8月18日

相关基金

碳纤维树脂基复合材料电热损伤机理研究

国家自然科学基金

0+阅读 · 2014年12月31日

传染性法氏囊病病毒VP4蛋白与GILZ相互作用抑制I型干扰素表达分子机理的研究

国家自然科学基金

0+阅读 · 2013年12月31日

Doublesex基因在对虾性别决定和分化中的功能研究

国家自然科学基金

0+阅读 · 2013年12月31日

半导体衬底上FeSe薄膜的外延生长及界面超导

国家自然科学基金

0+阅读 · 2013年12月31日

Intraflagellar Transport运输纤毛蛋白的分子机理

国家自然科学基金

0+阅读 · 2012年12月31日

轻稀土RE-Ni-Fe体系相图及相关合金微波吸收机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

蛋白精氨酸甲基转移酶5在儿童急性淋巴细胞白血病细胞表达改变在白血病发病作用的初探以及对个体化治疗的指

国家自然科学基金

0+阅读 · 2011年12月31日

玉米大斑病菌水甘油通道蛋白StFps1基因的克隆与功能研究

国家自然科学基金

0+阅读 · 2011年12月31日

随机延时神经网络的动力学分析

国家自然科学基金

0+阅读 · 2008年12月31日

柑橘砧木耐缺硼胁迫能力差异的分子机理研究

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员