Colexifications for Bootstrapping Cross-lingual Datasets: The Case of Phonology, Concreteness, and Affectiveness

June 5, 2023

Yiyi Chen, Johannes Bjerva

From arXiv; 13 pages, 4 figures; accepted to SIGMORPHON 2023

Colexification refers to the linguistic phenomenon where a single lexical form is used to convey multiple meanings. By studying cross-lingual colexifications, researchers have gained valuable insights into fields such as psycholinguistics and cognitive science [Jackson et al., 2019]. While several multilingual colexification datasets exist, there is untapped potential in using this information to bootstrap cross-lingual datasets of semantic features. In this paper, we demonstrate how colexifications can be leveraged to create such datasets. We showcase curation procedures that result in a dataset covering 142 languages from 21 language families around the world. The dataset includes ratings of concreteness and affectiveness, mapped to phonemes and phonological features. We further analyze the dataset along different dimensions to demonstrate the potential of the proposed procedures for facilitating interdisciplinary research in psychology, cognitive science, and multilingual natural language processing (NLP). Based on initial investigations, we observe that i) concepts that are closer in concreteness/affectiveness are more likely to colexify; ii) certain initial/final phonemes are significantly correlated with concreteness/affectiveness within language families, such as /k/ as the initial phoneme in both Turkic and Tai-Kadai, which is correlated with concreteness, and /p/ in Dravidian and Sino-Tibetan, which is correlated with valence; iii) the type-to-token ratio (TTR) of phonemes is positively correlated with concreteness across several language families, while the length of phoneme segments is negatively correlated with concreteness; iv) certain phonological features are negatively correlated with concreteness across languages. The dataset is publicly available online for further research.
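To make two of the notions in the abstract concrete, here is a minimal Python sketch of how colexified concept pairs might be extracted from a word list and how a phoneme type-to-token ratio might be computed. It is an illustration under stated assumptions, not the authors' curation pipeline: the toy lexicon, the language labels, and the function names `find_colexifications` and `phoneme_ttr` are all hypothetical, and TTR is taken here as distinct phonemes divided by total phonemes in a form.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy lexicon (NOT data from the paper):
# each entry is (language, concept, phoneme segments of the word form).
LEXICON = [
    ("lang_a", "TREE", ("p", "u", "u")),
    ("lang_a", "WOOD", ("p", "u", "u")),
    ("lang_a", "FIRE", ("t", "u", "l", "i")),
    ("lang_b", "TREE", ("a", "r", "b", "o")),
    ("lang_b", "HAND", ("m", "a", "n", "o")),
    ("lang_b", "ARM", ("m", "a", "n", "o")),
]


def find_colexifications(lexicon):
    """Return sorted (language, concept_1, concept_2, segments) tuples for
    every pair of concepts that share one form within a single language."""
    concepts_by_form = defaultdict(set)
    for language, concept, segments in lexicon:
        concepts_by_form[(language, segments)].add(concept)

    pairs = []
    for (language, segments), concepts in concepts_by_form.items():
        # A form with fewer than two concepts yields no pairs.
        for first, second in combinations(sorted(concepts), 2):
            pairs.append((language, first, second, segments))
    return sorted(pairs)


def phoneme_ttr(segments):
    """Type-to-token ratio of a form: distinct phonemes / total phonemes."""
    if not segments:
        raise ValueError("segments must contain at least one phoneme")
    return len(set(segments)) / len(segments)


if __name__ == "__main__":
    for language, first, second, segments in find_colexifications(LEXICON):
        form = " ".join(segments)
        print(f"{language}: {first} ~ {second} via /{form}/ "
              f"(TTR = {phoneme_ttr(segments):.2f})")
```

In a fuller analysis, per-form values such as the TTR or the segment length would be paired with concreteness or affectiveness ratings and correlated within each language family; the ratings themselves are omitted here because none are given in this entry.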
