利用模糊集束法分析单嵌入式单词分析 (Analysis of Word Embeddings Using Fuzzy Clustering) - 专知论文

会员服务 ·

0

模糊聚类 · 簇 · 向量化 · 词向量表示 · Performer ·

2020 年 12 月 7 日

Analysis of Word Embeddings Using Fuzzy Clustering

翻译：利用模糊集束法分析单嵌入式单词分析

Shahin Atakishiyev,Marek Z. Reformat

from arxiv, 6 pages

In data dominated systems and applications, a concept of representing words in a numerical format has gained a lot of attention. There are a few approaches used to generate such a representation. An interesting issue that should be considered is the ability of such representations - called embeddings - to imitate human-based semantic similarity between words. In this study, we perform a fuzzy-based analysis of vector representations of words, i.e., word embeddings. We use two popular fuzzy clustering algorithms on count-based word embeddings, known as GloVe, of different dimensionality. Words from WordSim-353, called the gold standard, are represented as vectors and clustered. The results indicate that fuzzy clustering algorithms are very sensitive to high-dimensional data, and parameter tuning can dramatically change their performance. We show that by adjusting the value of the fuzzifier parameter, fuzzy clustering can be successfully applied to vectors of high - up to one hundred - dimensions. Additionally, we illustrate that fuzzy clustering allows to provide interesting results regarding membership of words to different clusters.

翻译：在数据占主导地位的系统和应用程序中,以数字格式代表单词的概念引起了人们的极大关注。使用了一些方法来产生这种代表。一个值得考虑的有趣问题是,这种代表(称为嵌入式)是否有能力模仿基于人类的词义相似性。在这项研究中,我们对单词的矢量表示方式进行了模糊分析,即字嵌入。我们用两种流行的、叫做GloVe的基于数字的字嵌入式模糊组合算法,称为GloVe,不同维度。 WordSim-353(称为黄金标准)的单词是矢量和组合。结果表明,模糊的组合算法对高维数据非常敏感,参数调整可以大大改变它们的性能。我们表明,通过调整模糊参数的价值,模糊的组合可以成功地应用到高至100维的矢量。此外,我们说明,模糊的组合可以在不同组群群的单词归属方面提供有趣的结果。

0

相关内容

模糊聚类

采用模糊数学语言对事物按一定的要求进行描述和分类的数学方法称为模糊聚类分析。

【AAAI2021】对比聚类，Contrastive Clustering

【AAAI2021】对比聚类，Contrastive Clustering

专知会员服务

78+阅读 · 2021年1月30日

【干货书】机器学习速查手册，135页pdf

【干货书】机器学习速查手册，135页pdf

专知会员服务

127+阅读 · 2020年11月20日

【2020新书】数据科学与机器学习导论，220页pdf

【2020新书】数据科学与机器学习导论，220页pdf

专知会员服务

81+阅读 · 2020年9月14日

【CVPR2020】在线深度聚类的无监督表示学习, Online Deep Clustering for Unsupervised Representation Learning

【CVPR2020】在线深度聚类的无监督表示学习, Online Deep Clustering for Unsupervised Representation Learning

专知会员服务

69+阅读 · 2020年6月19日

康奈尔大学Jon Kleinberg经典书《算法设计Algorithm Design》课件PPT与电子书，864页pdf

康奈尔大学Jon Kleinberg经典书《算法设计Algorithm Design》课件PPT与电子书，864页pdf

专知会员服务

240+阅读 · 2020年1月21日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

计算机 | 国际会议信息5条

计算机 | 国际会议信息5条

Call4Papers

3+阅读 · 2019年7月3日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

IEEE | DSC 2019诚邀稿件 (EI检索)

IEEE | DSC 2019诚邀稿件 (EI检索)

Call4Papers

10+阅读 · 2019年2月25日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

笔记 | Sentiment Analysis

笔记 | Sentiment Analysis

黑龙江大学自然语言处理实验室

10+阅读 · 2018年5月6日

【推荐】深度学习情感分析综述

【推荐】深度学习情感分析综述

机器学习研究会

58+阅读 · 2018年1月26日

【论文】图上的表示学习综述

【论文】图上的表示学习综述

机器学习研究会

15+阅读 · 2017年9月24日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

FrameAxis: Characterizing Framing Bias and Intensity with Word Embedding

Arxiv

0+阅读 · 2021年2月6日

An analysis of the graph processing landscape

An analysis of the graph processing landscape

Arxiv

0+阅读 · 2021年2月5日

Vine copula mixture models and clustering for non-Gaussian data

Vine copula mixture models and clustering for non-Gaussian data

Arxiv

0+阅读 · 2021年2月5日

Analysis of Multivariate Scoring Functions for Automatic Unbiased Learning to Rank

Arxiv

6+阅读 · 2020年8月20日

Combination of Domain Knowledge and Deep Learning for Sentiment Analysis

Arxiv

3+阅读 · 2018年6月22日

Characterizing Departures from Linearity in Word Translation

Arxiv

3+阅读 · 2018年6月7日

Deep contextualized word representations

Arxiv

10+阅读 · 2018年3月22日

Improving Sentiment Analysis in Arabic Using Word Representation

Arxiv

4+阅读 · 2018年2月28日

SpectralNet: Spectral Clustering using Deep Neural Networks

Arxiv

11+阅读 · 2018年1月10日

One-shot and few-shot learning of word embeddings

Arxiv

5+阅读 · 2017年10月27日

VIP会员

文章信息

相关主题

词向量表示

相关VIP内容

【AAAI2021】对比聚类，Contrastive Clustering

【AAAI2021】对比聚类，Contrastive Clustering

专知会员服务

78+阅读 · 2021年1月30日

【干货书】机器学习速查手册，135页pdf

【干货书】机器学习速查手册，135页pdf

专知会员服务

127+阅读 · 2020年11月20日

【2020新书】数据科学与机器学习导论，220页pdf

【2020新书】数据科学与机器学习导论，220页pdf

专知会员服务

81+阅读 · 2020年9月14日

【CVPR2020】在线深度聚类的无监督表示学习, Online Deep Clustering for Unsupervised Representation Learning

【CVPR2020】在线深度聚类的无监督表示学习, Online Deep Clustering for Unsupervised Representation Learning

专知会员服务

69+阅读 · 2020年6月19日

康奈尔大学Jon Kleinberg经典书《算法设计Algorithm Design》课件PPT与电子书，864页pdf

康奈尔大学Jon Kleinberg经典书《算法设计Algorithm Design》课件PPT与电子书，864页pdf

专知会员服务

240+阅读 · 2020年1月21日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

领域特定文本分类中的预训练语言模型新进展：系统综述

实时无人机指令处理：一种面向无人机系统的大语言模型方法

领域特定文本分类中的预训练语言模型新进展：系统综述

大型语言模型（LLM）赋能的知识图谱构建：综述

相关资讯

计算机 | 国际会议信息5条

计算机 | 国际会议信息5条

Call4Papers

3+阅读 · 2019年7月3日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

IEEE | DSC 2019诚邀稿件 (EI检索)

IEEE | DSC 2019诚邀稿件 (EI检索)

Call4Papers

10+阅读 · 2019年2月25日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

笔记 | Sentiment Analysis

笔记 | Sentiment Analysis

黑龙江大学自然语言处理实验室

10+阅读 · 2018年5月6日

【推荐】深度学习情感分析综述

【推荐】深度学习情感分析综述

机器学习研究会

58+阅读 · 2018年1月26日

【论文】图上的表示学习综述

【论文】图上的表示学习综述

机器学习研究会

15+阅读 · 2017年9月24日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

相关论文

FrameAxis: Characterizing Framing Bias and Intensity with Word Embedding

Arxiv

0+阅读 · 2021年2月6日

An analysis of the graph processing landscape

An analysis of the graph processing landscape

Arxiv

0+阅读 · 2021年2月5日

Vine copula mixture models and clustering for non-Gaussian data

Vine copula mixture models and clustering for non-Gaussian data

Arxiv

0+阅读 · 2021年2月5日

Analysis of Multivariate Scoring Functions for Automatic Unbiased Learning to Rank

Arxiv

6+阅读 · 2020年8月20日

Combination of Domain Knowledge and Deep Learning for Sentiment Analysis

Arxiv

3+阅读 · 2018年6月22日

Characterizing Departures from Linearity in Word Translation

Arxiv

3+阅读 · 2018年6月7日

Deep contextualized word representations

Arxiv

10+阅读 · 2018年3月22日

Improving Sentiment Analysis in Arabic Using Word Representation

Arxiv

4+阅读 · 2018年2月28日

SpectralNet: Spectral Clustering using Deep Neural Networks

Arxiv

11+阅读 · 2018年1月10日

One-shot and few-shot learning of word embeddings

Arxiv

5+阅读 · 2017年10月27日

微信扫码咨询专知VIP会员