How do languages influence each other? Studying cross-lingual data sharing during LLM fine-tuning


May 22, 2023


Rochelle Choenni, Dan Garrette, Ekaterina Shutova

Multilingual large language models (MLLMs) are jointly trained on data from many different languages such that representation of individual languages can benefit from other languages' data. Impressive performance on zero-shot cross-lingual transfer shows that these models are capable of exploiting data from other languages. Yet, it remains unclear to what extent, and under which conditions, languages rely on each other's data. In this study, we use TracIn (Pruthi et al., 2020), a training data attribution (TDA) method, to retrieve the most influential training samples seen during multilingual fine-tuning for a particular test language. This allows us to analyse cross-lingual sharing mechanisms of MLLMs from a new perspective. While previous work studied cross-lingual sharing at the level of model parameters, we present the first approach to study cross-lingual sharing at the data level. We find that MLLMs rely on data from multiple languages from the early stages of fine-tuning and that this reliance gradually increases as fine-tuning progresses. We further study how different fine-tuning languages influence model performance on a given test language and find that they can both reinforce and complement the knowledge acquired from data of the test language itself.
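To make the attribution idea concrete, here is a minimal sketch of a checkpoint-based TracIn score in the style of Pruthi et al. (2020): the influence of a training sample on a test sample is approximated by summing, over saved checkpoints, the learning rate times the dot product of the two loss gradients. Everything specific below is an assumption for illustration and is not taken from the study: the two-weight logistic model, the hand-picked checkpoints and learning rates, the language tags, and the helper names `grad_logistic`, `tracin_score`, and `top_influential`. The abstract does not say which TracIn variant or normalisation the authors use.

```python
import math

def grad_logistic(w, x, y):
    """Gradient of the logistic loss w.r.t. w for one example (x, y), y in {0, 1}."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    p = 1.0 / (1.0 + math.exp(-z))
    return [(p - y) * xi for xi in x]

def tracin_score(checkpoints, lrs, train_ex, test_ex):
    """Sum over checkpoints of lr * <grad(train), grad(test)>."""
    score = 0.0
    for w, lr in zip(checkpoints, lrs):
        g_train = grad_logistic(w, *train_ex)
        g_test = grad_logistic(w, *test_ex)
        score += lr * sum(a * b for a, b in zip(g_train, g_test))
    return score

def top_influential(checkpoints, lrs, train_set, test_ex, k):
    """Rank training samples (lang, x, y) by their influence on test_ex."""
    scored = [
        (tracin_score(checkpoints, lrs, (x, y), test_ex), lang)
        for lang, x, y in train_set
    ]
    return sorted(scored, reverse=True)[:k]

# Hypothetical toy setup: two saved checkpoints of a 2-weight model.
checkpoints = [[0.0, 0.0], [0.5, -0.5]]
lrs = [0.1, 0.1]

# Hypothetical training samples tagged with their language.
train_set = [
    ("en", [1.0, 0.0], 1),  # same feature and label as the test sample
    ("de", [0.0, 1.0], 1),  # orthogonal feature
    ("fr", [1.0, 0.0], 0),  # same feature, opposite label
]
test_ex = ([1.0, 0.0], 1)

ranked = top_influential(checkpoints, lrs, train_set, test_ex, k=3)
for score, lang in ranked:
    print(f"{lang}: {score:+.4f}")
```

A positive score marks a training sample whose gradient steps would have lowered the test loss, and a negative score marks one that would have raised it. Grouping the top-ranked samples by language tag is one way to see how much a test language draws on data from other languages.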
