This study presents a novel model for invertible sentence embeddings, using a residual recurrent network trained on an unsupervised encoding task. Rather than the probabilistic softmax outputs common to neural machine translation models, our approach employs a regression-based output layer that reconstructs the word vectors of the input sequence. The model achieves high accuracy and fast training with the Adam optimizer, a notable result given that RNNs typically require gated memory units, such as LSTMs, or second-order optimization methods to train well. We incorporate residual connections and introduce a "match drop" technique, in which gradients are computed only for incorrectly reconstructed words. Our approach shows promise for a range of natural language processing applications, particularly neural systems that require high-quality sentence embeddings.
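As a rough illustration of the architecture the abstract names, the sketch below shows one plausible reading of a "residual recurrent network": a stack of plain (non-gated) RNN layers with a residual connection around each layer, producing a fixed-size sentence embedding. The layer count, hidden size, and the choice of the last time step as the embedding are assumptions, not details from the paper.

```python
import torch
import torch.nn as nn

class ResidualRNNEncoder(nn.Module):
    """A minimal sketch of a residual recurrent encoder: vanilla RNN
    layers with a skip connection around each one (hypothetical sizes)."""

    def __init__(self, dim: int, num_layers: int = 3):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.RNN(dim, dim, batch_first=True) for _ in range(num_layers)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim) sequence of word vectors
        for rnn in self.layers:
            out, _ = rnn(x)
            x = x + out  # residual connection around each recurrent layer
        # Take the representation at the final time step as the sentence
        # embedding (an assumption; the paper may pool differently).
        return x[:, -1, :]
```

Without gating, residual paths are one common way to keep gradients flowing through deep recurrent stacks, which is consistent with the abstract's claim that LSTM-style memory units were not needed.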
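The regression-based objective and the "match drop" technique can likewise be sketched as a masked reconstruction loss: per-token squared error over word vectors, zeroed wherever the predicted vector already decodes to the correct word, so gradients flow only through mistakes. The nearest-neighbor decoding rule (cosine similarity here) and all identifiers are assumptions for illustration only.

```python
import torch
import torch.nn.functional as F

def match_drop_loss(pred: torch.Tensor,
                    target: torch.Tensor,
                    embedding_matrix: torch.Tensor) -> torch.Tensor:
    """Regression loss over word vectors with 'match drop' masking:
    tokens whose prediction already decodes to the correct word are
    dropped, so gradients come only from incorrectly reconstructed words.

    pred, target:      (batch, seq_len, dim) word-vector tensors
    embedding_matrix:  (vocab, dim) lookup table used for decoding
    """
    # Decode each vector to its nearest vocabulary word by cosine
    # similarity (one plausible matching rule; an assumption here).
    emb_n = F.normalize(embedding_matrix, dim=-1)
    pred_ids = (F.normalize(pred, dim=-1) @ emb_n.T).argmax(dim=-1)
    tgt_ids = (F.normalize(target, dim=-1) @ emb_n.T).argmax(dim=-1)

    # Mask: 1.0 for mismatched words, 0.0 where the word already matches.
    wrong = (pred_ids != tgt_ids).float().unsqueeze(-1)

    # Mean squared error computed only over the mismatched positions.
    per_token = (pred - target).pow(2) * wrong
    return per_token.sum() / wrong.sum().clamp(min=1.0)
```

Masking out already-correct tokens concentrates the training signal on the residual errors, which is one way such a scheme could speed convergence under a plain regression objective.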