Word embeddings are among the most fundamental tools in natural language processing. However, existing word embeddings are high-dimensional and consume considerable computational resources. In this study, we propose WordTour, an unsupervised one-dimensional word embedding. To achieve this challenging goal, we decompose the desiderata of word embeddings into two parts, completeness and soundness, and focus on soundness in this paper. Because it is one-dimensional, WordTour is extremely efficient and provides a minimal means of handling word embeddings. We experimentally confirm the effectiveness of the proposed method through a user study and document classification.
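To make the idea of a one-dimensional embedding concrete, the following is a minimal sketch, not the method proposed in the paper. It assumes that a one-dimensional embedding can be represented as an ordering of the vocabulary, and it reads soundness informally as "words placed next to each other should be similar". The toy vectors, the function names `cosine_distance` and `greedy_word_order`, and the greedy nearest-neighbor heuristic are all illustrative assumptions.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def greedy_word_order(vectors, start):
    """Order words by repeatedly appending the nearest unvisited word.

    This greedy heuristic is a stand-in chosen for brevity; it is an
    assumption of this sketch, not the procedure used by WordTour.
    """
    order = [start]
    remaining = set(vectors) - {start}
    while remaining:
        last = vectors[order[-1]]
        # sorted() makes tie-breaking deterministic
        nxt = min(sorted(remaining), key=lambda w: cosine_distance(last, vectors[w]))
        order.append(nxt)
        remaining.remove(nxt)
    return order


# Hypothetical 2-D vectors, made up for illustration only.
toy_vectors = {
    "cat": (1.0, 0.1),
    "dog": (0.9, 0.2),
    "car": (0.1, 1.0),
    "bus": (0.2, 0.9),
}

order = greedy_word_order(toy_vectors, "cat")
# The one-dimensional embedding: each word maps to a single integer position.
position = {word: i for i, word in enumerate(order)}
print(order)
print(position)
```

Under these assumptions each word is stored as a single integer, so looking up a word's neighbors reduces to reading adjacent positions in the ordering, which illustrates why a one-dimensional representation is cheap to store and query.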