Low-resource languages, such as the Baltic languages, benefit from large multilingual language models (LMs) with remarkable cross-lingual transfer capabilities. This work is an interpretation and analysis study of the cross-lingual representations of multilingual LMs. Previous work hypothesized that these LMs internally project representations of different languages into a shared cross-lingual space. However, the literature has produced contradictory results. In this paper, we revisit the prior work claiming that "BERT is not an Interlingua" and show that, with a different choice of pooling strategy or similarity index, different languages do converge to a shared space in such language models. We then perform a cross-lingual representational analysis of the two most popular multilingual LMs, employing 378 pairwise language comparisons. We find that while most languages share a joint cross-lingual space, some do not; the Baltic languages, however, do belong to that shared space. The code is available at https://github.com/TartuNLP/xsim.
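To make the comparison procedure concrete, the following is a minimal sketch (not the released implementation; see the repository above) of one pooling strategy (mean pooling over tokens) combined with one similarity index (linear CKA) for comparing representations of parallel sentences encoded in two languages. The array names, shapes, and random inputs are illustrative assumptions only.

```python
import numpy as np

def mean_pool(hidden_states, attention_mask):
    """Average token vectors over non-padding positions (one possible pooling choice)."""
    mask = attention_mask[..., None].astype(hidden_states.dtype)
    return (hidden_states * mask).sum(axis=1) / mask.sum(axis=1)

def linear_cka(X, Y):
    """Linear Centered Kernel Alignment between two representation matrices.

    X, Y: (n_sentences, dim) matrices of pooled representations of the same
    parallel sentences encoded in two different languages.
    """
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(Y.T @ X, "fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, "fro")
    norm_y = np.linalg.norm(Y.T @ Y, "fro")
    return cross / (norm_x * norm_y)

# Toy usage: random arrays stand in for hidden states of a multilingual encoder
# run over the same parallel sentences in two languages (e.g., Latvian and Estonian).
rng = np.random.default_rng(0)
states_lv = rng.normal(size=(32, 12, 768))   # (sentences, tokens, dim)
states_et = rng.normal(size=(32, 12, 768))
mask = np.ones((32, 12), dtype=np.int64)
print(linear_cka(mean_pool(states_lv, mask), mean_pool(states_et, mask)))
```

A similarity score close to 1 for a language pair would indicate that the pooled representations occupy a closely aligned space; repeating this over all language pairs yields the kind of pairwise comparison matrix described above.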