We propose a new self-organizing hierarchical softmax formulation for neural-network-based language models over large vocabularies. Instead of relying on a predefined hierarchical structure, our approach learns word clusters with clear syntactic and semantic meaning during language model training. We report experiments on standard benchmarks for language modeling and sentence compression. We find that this approach is as fast as other efficient softmax approximations, while achieving comparable or even better performance relative to similar full softmax models.
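For context, the generic two-level (class-factored) hierarchical softmax that such formulations build on can be sketched as below. This is an illustration of the general technique only, not the self-organizing formulation proposed here: the cluster assignment in the sketch is fixed by hand, whereas the proposed approach learns it during training. All names (`two_level_prob`, `clusters`, `cluster_weights`, `word_weights`) and the toy sizes are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def two_level_prob(hidden, word, word_to_cluster, clusters,
                   cluster_weights, word_weights):
    """p(word | hidden) = p(cluster | hidden) * p(word | cluster, hidden)."""
    c = word_to_cluster[word]
    # First level: distribution over clusters.
    p_cluster = softmax([dot(w, hidden) for w in cluster_weights])[c]
    # Second level: distribution over the words inside the chosen cluster only.
    members = clusters[c]
    p_within = softmax([dot(word_weights[v], hidden) for v in members])
    return p_cluster * p_within[members.index(word)]


# Toy setup (assumption): 9 words split by hand into 3 fixed clusters.
random.seed(0)
dim = 4
clusters = [[0, 1, 2], [3, 4], [5, 6, 7, 8]]
vocab_size = sum(len(members) for members in clusters)
word_to_cluster = {w: c for c, members in enumerate(clusters) for w in members}

cluster_weights = [[random.gauss(0, 1) for _ in range(dim)] for _ in clusters]
word_weights = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(vocab_size)]
hidden = [random.gauss(0, 1) for _ in range(dim)]

probs = [
    two_level_prob(hidden, w, word_to_cluster, clusters,
                   cluster_weights, word_weights)
    for w in range(vocab_size)
]
print(sum(probs))  # the factored probabilities form a valid distribution
```

Scoring a single word touches only the cluster scores plus the scores of that word's cluster, rather than all vocabulary entries, which is where the speed-up over a full softmax comes from.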