This study presents a novel model for invertible sentence embeddings, using a residual recurrent network trained on an unsupervised encoding task. Rather than the probabilistic outputs common to neural machine translation models, our approach employs a regression-based output layer that reconstructs the word vectors of the input sequence. The model trains quickly and achieves high accuracy with the Adam optimizer, a notable result given that RNNs typically require memory units, such as LSTMs, or second-order optimization methods. We incorporate residual connections and introduce a "match drop" technique, in which gradients are computed only for incorrectly reconstructed words. Our approach shows promise for a range of natural language processing applications, particularly neural network-based systems that require high-quality sentence embeddings.
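As a rough illustration of the "match drop" idea, the sketch below masks a mean-squared-error reconstruction loss so that only positions whose predicted vector decodes to the wrong word contribute gradient. This is a minimal PyTorch sketch under our own assumptions (nearest-neighbor decoding by cosine similarity against the word-vector table; the function name match_drop_loss is illustrative), not the exact training objective of the model.

```python
import torch
import torch.nn.functional as F

def match_drop_loss(pred_vectors, target_ids, embedding_matrix):
    """Regression loss with 'match drop' (illustrative sketch): positions where
    the predicted vector already decodes to the correct word contribute no
    gradient, so training focuses on incorrectly reconstructed words.

    pred_vectors:     (batch, seq_len, dim) decoder regression outputs
    target_ids:       (batch, seq_len)      gold token ids of the input sequence
    embedding_matrix: (vocab, dim)          word-vector table
    """
    # Gold word vectors to regress toward.
    target_vectors = embedding_matrix[target_ids]            # (batch, seq_len, dim)

    # Decode each prediction to its nearest vocabulary word (cosine similarity,
    # one plausible decoding rule for a regression output layer).
    sims = F.normalize(pred_vectors, dim=-1) @ F.normalize(embedding_matrix, dim=-1).T
    pred_ids = sims.argmax(dim=-1)                           # (batch, seq_len)

    # "Match drop": keep the loss only where the decoded word is wrong.
    mismatch = (pred_ids != target_ids).float().unsqueeze(-1)  # (batch, seq_len, 1)

    sq_err = (pred_vectors - target_vectors).pow(2)
    denom = mismatch.sum().clamp(min=1.0)                    # avoid divide-by-zero
    return (sq_err * mismatch).sum() / denom

# Toy usage with random data (shapes only; not the paper's setup).
vocab, dim = 10000, 300
emb = torch.randn(vocab, dim)
preds = torch.randn(2, 7, dim, requires_grad=True)
gold = torch.randint(vocab, (2, 7))
loss = match_drop_loss(preds, gold, emb)
loss.backward()
```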