Bag-of-Vectors Autoencoders for Unsupervised Conditional Text Generation

Text autoencoders are often used for unsupervised conditional text generation by applying mappings in the latent space to change attributes to the desired values. Recently, Mai et al. (2020) proposed Emb2Emb, a method to learn these mappings in the embedding space of an autoencoder. However, their method is restricted to autoencoders with a single-vector embedding, which limits how much information can be retained. We address this issue by extending their method to Bag-of-Vectors Autoencoders (BoV-AEs), which encode the text into a variable-size bag of vectors that grows with the size of the text, as in attention-based models. This makes it possible to encode and reconstruct much longer texts than standard autoencoders can. As with conventional autoencoders, we propose regularization techniques that facilitate learning meaningful operations in the latent space. Finally, we adapt Emb2Emb for a training scheme that learns to map an input bag to an output bag, including a novel loss function and neural architecture. Our empirical evaluations on unsupervised sentiment transfer show that our method performs substantially better than a standard autoencoder.
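To make the contrast between a single-vector embedding and a bag of vectors concrete, here is a minimal sketch of the shapes involved. It is not the paper's encoder: `toy_token_vectors`, `encode_single_vector`, and `encode_bag_of_vectors` are hypothetical stand-ins (a deterministic toy embedding, mean pooling, and one vector per token), chosen only to show that one representation has a fixed size while the other grows with the text.

```python
from typing import List

Vector = List[float]


def toy_token_vectors(text: str, dim: int = 4) -> List[Vector]:
    """Hypothetical deterministic 'embedding', for illustration only."""
    return [[((len(tok) + i) % 5) / 5.0 for i in range(dim)] for tok in text.split()]


def encode_single_vector(token_vectors: List[Vector]) -> Vector:
    """Stand-in fixed-size encoder: average all token vectors into one."""
    dim = len(token_vectors[0])
    n = len(token_vectors)
    return [sum(v[i] for v in token_vectors) / n for i in range(dim)]


def encode_bag_of_vectors(token_vectors: List[Vector]) -> List[Vector]:
    """Stand-in bag encoder: keep one vector per token (assumption)."""
    return [list(v) for v in token_vectors]


short = toy_token_vectors("good food")
long = toy_token_vectors("the food was good but the service was slow")

# The single-vector size is the same for both texts.
print(len(encode_single_vector(short)), len(encode_single_vector(long)))
# The bag size follows the number of tokens.
print(len(encode_bag_of_vectors(short)), len(encode_bag_of_vectors(long)))
```

In this toy setting the single vector has 4 entries regardless of input length, while the bag holds 2 vectors for the short text and 9 for the long one, which is the property the abstract attributes to BoV-AEs.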


Autoencoders

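An autoencoder is an artificial neural network used to learn efficient encodings of data in an unsupervised manner. Its aim is to learn a representation (encoding) of a set of data, typically for dimensionality reduction, by training the network to ignore signal "noise". Alongside the reduction side, a reconstruction side is learned, in which the autoencoder tries to generate, from the reduced encoding, a representation as close as possible to its original input, hence the name. Several variants of the basic model exist, which aim to force the learned representations of the input to take on useful properties. Autoencoders are applied effectively to many problems, from facial recognition to acquiring the semantic meaning of words.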
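The reduction and reconstruction sides described above can be sketched with a linear autoencoder. This is an illustrative assumption rather than a trained neural network: the weights are taken from a singular value decomposition (the PCA solution) instead of being learned by gradient descent, and the synthetic data, `encode`, and `decode` are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 200 samples in 5 dimensions that lie on a 2-D subspace.
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing

# Center the data and take the top-2 right singular vectors as weights.
mean = X.mean(axis=0)
_, _, vt = np.linalg.svd(X - mean, full_matrices=False)
W = vt[:2].T  # shape (5, 2): shared encoder/decoder weights


def encode(x: np.ndarray) -> np.ndarray:
    """Reduction side: project 5-D inputs onto a 2-D code."""
    return (x - mean) @ W


def decode(z: np.ndarray) -> np.ndarray:
    """Reconstruction side: map 2-D codes back to 5-D."""
    return z @ W.T + mean


codes = encode(X)
recon = decode(codes)
error = float(np.mean((X - recon) ** 2))
print(codes.shape, error)
```

Because the synthetic data really is two-dimensional, the 2-D code loses essentially nothing and the reconstruction error is at the level of floating-point rounding. With real data the bottleneck would discard some information, which is the trade-off the paragraph describes.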
