Sequence-to-sequence tasks such as machine translation involve heterogeneous sentences. Sentences that differ widely in meaning or grammatical structure can make it harder for the network to converge during training. In this paper, we introduce a model that addresses this heterogeneity in sequence-to-sequence tasks. The Multi-filter Gaussian Mixture Autoencoder (MGMAE) uses an autoencoder to learn representations of the inputs. These representations are the outputs of the encoder and lie in a latent space whose dimension equals the hidden dimension of the encoder. The latent representations of the training data are used to fit a Gaussian mixture, which divides them into several Gaussian components. A filter (decoder) is then tuned specifically to fit the data in one of these Gaussian distributions. Each Gaussian corresponds to one filter, so that each filter is responsible for the heterogeneity within its own Gaussian. In this way, the heterogeneity of the training data can be resolved. Comparative experiments are conducted on the Geo-query dataset and on English-French translation. Our experiments show that, compared to the traditional encoder-decoder model, this network achieves better performance on sequence-to-sequence tasks such as machine translation and question answering.
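The routing idea described above, in which each Gaussian is paired with one filter, can be illustrated with a minimal sketch. It is not the authors' implementation: the mixture parameters, the 2-D latent vectors, the diagonal covariances, and the hard (highest-posterior) assignment rule are all assumptions made for illustration, and the names `log_gaussian_diag` and `assign_filter` are hypothetical.

```python
import math

def log_gaussian_diag(z, mean, var):
    """Log density of a diagonal-covariance Gaussian at latent vector z."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(z, mean, var)
    )

def assign_filter(z, weights, means, variances):
    """Index of the mixture component with the highest posterior for z.

    Assumption: hard assignment; the abstract does not say whether
    assignment to filters is hard or soft.
    """
    scores = [
        math.log(w) + log_gaussian_diag(z, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    return max(range(len(scores)), key=scores.__getitem__)

# Hypothetical 2-component mixture in a 2-D latent space (illustrative values).
weights = [0.5, 0.5]
means = [[0.0, 0.0], [5.0, 5.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]

# Stand-ins for encoder outputs; a real encoder would produce these from sentences.
latents = [[0.2, -0.1], [4.8, 5.3], [0.9, 0.4]]

# Group example indices by component; each group would tune its own decoder (filter).
groups = {k: [] for k in range(len(weights))}
for i, z in enumerate(latents):
    groups[assign_filter(z, weights, means, variances)].append(i)
```

Under these assumed parameters, each training example is routed to exactly one filter, so every decoder sees only the latent vectors that fall under its own Gaussian. In practice the mixture parameters would be estimated from the encoder outputs rather than fixed by hand.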