Inducing Meaningful Units from Character Sequences with Slot Attention

Characters do not convey meaning, but sequences of characters do. We propose an unsupervised distributional method to learn the abstract meaning-bearing units in a sequence of characters. Rather than segmenting the sequence, this model discovers continuous representations of the "objects" in the sequence, using a recently proposed architecture for object discovery in images called Slot Attention. We train our model on different languages and evaluate the quality of the obtained representations with probing classifiers. Our experiments show promising results in the ability of our units to capture meaning at a higher level of abstraction.
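To make the idea of slots "discovering" units in a character sequence more concrete, the following is a minimal sketch of the competitive attention step at the core of Slot Attention. It is not the authors' model. It assumes random character embeddings in place of a learned encoder, uses the raw vectors as queries, keys, and values, and omits the learned projections, layer normalization, GRU update, and MLP of the full architecture. The names `slot_attention_step` and `induce_units` are hypothetical and chosen only for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def slot_attention_step(slots, inputs, eps=1e-8):
    """One simplified Slot Attention iteration (assumption: no learned layers).

    slots:  K vectors of size D (current slot states)
    inputs: N vectors of size D (e.g. character embeddings)
    Returns (new_slots, attn), where attn[i][k] is the share of
    input position i claimed by slot k.
    """
    dim = len(inputs[0])
    scale = 1.0 / math.sqrt(dim)
    # Softmax over the SLOT axis: slots compete for each input position.
    attn = [softmax([scale * dot(x, s) for s in slots]) for x in inputs]
    new_slots = []
    for k in range(len(slots)):
        weights = [row[k] for row in attn]
        total = sum(weights) + eps
        # Each slot becomes the weighted mean of the inputs it claimed.
        new_slots.append(
            [sum(w * x[j] for w, x in zip(weights, inputs)) / total
             for j in range(dim)]
        )
    return new_slots, attn


def induce_units(text, num_slots=3, dim=8, iterations=3, seed=0):
    """Embed characters with random vectors (a stand-in for a learned
    encoder) and refine randomly initialised slots for a few iterations."""
    rng = random.Random(seed)
    table = {ch: [rng.gauss(0, 1) for _ in range(dim)]
             for ch in sorted(set(text))}
    inputs = [table[ch] for ch in text]
    slots = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_slots)]
    attn = []
    for _ in range(iterations):
        slots, attn = slot_attention_step(slots, inputs)
    return slots, attn


slots, attn = induce_units("the cat sat")
print(len(slots), len(attn))  # number of slots, number of characters
```

Each row of `attn` sums to 1 across slots, so every character is softly distributed among the slots rather than assigned to a hard segment. This matches the abstract's framing of continuous representations instead of a segmentation. Without training and positional information, the slots in this sketch carry no linguistic meaning.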


Related content

Attention mechanism


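Attention mechanisms were first proposed in computer vision, but they arguably took off with the Google DeepMind paper "Recurrent Models of Visual Attention" [14], which applied attention to an RNN model for image classification. Bahdanau et al. then used a similar attention mechanism in "Neural Machine Translation by Jointly Learning to Align and Translate" [1] to perform translation and alignment jointly in a machine translation task; their work is generally regarded as the first to apply attention to NLP. Attention-based extensions of RNN models were subsequently applied to a wide range of NLP tasks. More recently, how to use attention in CNNs has also become an active research topic. The figure below shows the general trend of attention research.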
