Discrete Auto-regressive Variational Attention Models for Text Modeling

Variational autoencoders (VAEs) have been widely applied to text modeling. In practice, however, they face two challenges: information underrepresentation and posterior collapse. The former arises because only the last hidden state of the LSTM encoder is transformed into the latent space, which is generally insufficient to summarize the data. The latter is a long-standing problem in VAE training, in which the optimization becomes trapped in a disastrous local optimum. In this paper, we propose the Discrete Auto-regressive Variational Attention Model (DAVAM) to address both challenges. Specifically, we introduce an auto-regressive variational attention approach that enriches the latent space by effectively capturing the semantic dependency in the input. We further design a discrete latent space for the variational attention and show mathematically that our model is free from posterior collapse. Extensive experiments on language modeling tasks demonstrate the superiority of DAVAM over several VAE counterparts.
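
For context, posterior collapse is usually described in terms of the standard VAE training objective. The block below is the textbook evidence lower bound, not a formula taken from this paper; the symbols x (input text), z (latent variable), q_\phi (encoder), p_\theta (decoder), and K (number of discrete codes) are notation assumed here for illustration.

```latex
% Standard VAE evidence lower bound (textbook form, notation assumed):
\mathcal{L}(\theta, \phi; x)
  = \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big]
  - \mathrm{KL}\big(q_\phi(z \mid x) \,\|\, p(z)\big)

% Posterior collapse: the KL term is driven to zero for every input,
% so the latent variable carries no information about x.
q_\phi(z \mid x) = p(z) \quad \text{for all } x

% General fact (an illustration, not necessarily the paper's argument):
% a one-hot posterior over K codes against a uniform prior has constant KL.
\mathrm{KL}\big(\mathrm{onehot}_k \,\|\, \mathrm{Uniform}(K)\big)
  = \sum_{j=1}^{K} \mathbb{1}[j = k] \log \frac{\mathbb{1}[j = k]}{1/K}
  = \log K
```

Under these assumptions, a KL term that stays fixed at \log K cannot be pushed to zero during training, which gives one intuition for why a discrete latent space can rule out collapse. Whether DAVAM's proof takes exactly this form is not stated in the abstract.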
