### Related Content

Multilingual Word Embeddings (MWEs) represent words from multiple languages in a single distributional vector space. Unsupervised MWE (UMWE) methods acquire multilingual embeddings without cross-lingual supervision, a significant advantage over traditional supervised approaches that opens many new possibilities for low-resource languages. Prior art for learning UMWEs, however, merely relies on a number of independently trained Unsupervised Bilingual Word Embeddings (UBWEs) to obtain multilingual embeddings, and thus fails to leverage the interdependencies that exist among many languages. To address this shortcoming, we propose a fully unsupervised framework for learning MWEs that directly exploits the relations between all language pairs. Our model substantially outperforms previous approaches in experiments on multilingual word translation and cross-lingual word similarity. In addition, it even beats supervised approaches trained with cross-lingual resources.
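
The abstract does not spell out the training procedure, but one common fully unsupervised formulation maps each language's monolingual embeddings into a single shared space with per-language linear maps, trained adversarially against a language discriminator. The PyTorch sketch below illustrates that general idea only; the mapper and discriminator sizes, optimizers, and random stand-in data are illustrative assumptions, not the paper's exact model.

```python
# Minimal sketch, NOT the paper's exact method: per-language linear maps into a
# shared space, trained adversarially so a discriminator cannot tell which
# language a mapped vector came from (no cross-lingual supervision needed).
import torch
import torch.nn as nn

dim, n_langs, vocab, batch = 300, 3, 1000, 32
mappers = [nn.Linear(dim, dim, bias=False) for _ in range(n_langs)]
for m in mappers:
    nn.init.orthogonal_(m.weight)  # orthogonal init keeps each map near-isometric

# A single discriminator tries to predict the source language of a mapped vector.
disc = nn.Sequential(nn.Linear(dim, 512), nn.LeakyReLU(0.2), nn.Linear(512, n_langs))
opt_m = torch.optim.SGD([p for m in mappers for p in m.parameters()], lr=0.1)
opt_d = torch.optim.SGD(disc.parameters(), lr=0.1)
ce = nn.CrossEntropyLoss()

# Random stand-ins for pre-trained monolingual embedding matrices.
embs = [torch.randn(vocab, dim) for _ in range(n_langs)]

for step in range(300):
    lang = step % n_langs
    x = embs[lang][torch.randint(0, vocab, (batch,))]
    y = torch.full((batch,), lang, dtype=torch.long)
    mapped = mappers[lang](x)

    # Discriminator step: learn to identify the source language.
    opt_d.zero_grad()
    ce(disc(mapped.detach()), y).backward()
    opt_d.step()

    # Mapper step: maximize the discriminator's loss so the mapped spaces align.
    opt_m.zero_grad()
    (-ce(disc(mapped), y)).backward()
    opt_m.step()
```

Because every language is mapped into the same shared space, the relations between all language pairs are exploited jointly rather than through independent bilingual alignments.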

Ensembling word embeddings to improve distributed word representations has shown good success for natural language processing tasks in recent years. These approaches either carry out straightforward mathematical operations over a set of vectors or use unsupervised learning to find a lower-dimensional representation. This work compares meta-embeddings trained with different losses, namely loss functions that account for the angular distance between the reconstructed embedding and the target, and those that account for normalized distances based on vector length. We argue that meta-embedding methods should treat the ensemble set equally in unsupervised learning, since the respective quality of each embedding is unknown for downstream tasks prior to meta-embedding. We show that normalization-aware objectives such as cosine and KL-divergence losses outperform meta-embeddings trained with standard $\ell_1$ and $\ell_2$ losses on *de facto* word similarity and relatedness datasets, and that they outperform existing meta-learning strategies.
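
As a concrete reference for the objectives being compared, here is a hypothetical PyTorch sketch of the four reconstruction losses named above. The softmax normalization used for the KL objective is one plausible choice, and the tensor shapes are illustrative; neither is taken from the paper.

```python
# Hypothetical sketch of the compared meta-embedding reconstruction losses:
# l1, l2, cosine (angular), and KL divergence over normalized vectors.
import torch
import torch.nn.functional as F

def meta_losses(reconstructed: torch.Tensor, target: torch.Tensor) -> dict:
    """Candidate objectives for matching a reconstructed meta-embedding to a
    source embedding. Cosine ignores vector length; KL compares the vectors
    after turning them into distributions via softmax (one common choice)."""
    return {
        "l1": F.l1_loss(reconstructed, target),
        "l2": F.mse_loss(reconstructed, target),
        # 1 - cosine similarity: penalizes angular distance only.
        "cosine": (1 - F.cosine_similarity(reconstructed, target, dim=-1)).mean(),
        # KL divergence between softmax-normalized vectors.
        "kl": F.kl_div(F.log_softmax(reconstructed, dim=-1),
                       F.softmax(target, dim=-1), reduction="batchmean"),
    }

# Toy usage: a batch of 4 target embeddings and their reconstructions.
target = torch.randn(4, 300)
recon = torch.randn(4, 300)
for name, value in meta_losses(recon, target).items():
    print(f"{name}: {value.item():.4f}")
```

The cosine and KL objectives are insensitive to the differing norms of the source embeddings, which is what lets them treat each member of the ensemble more equally than raw $\ell_1$ or $\ell_2$ distances.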
