The Variational Auto-Encoder (VAE) has become the de-facto learning paradigm for jointly achieving representation learning and generation of natural language. Nevertheless, existing VAE-based language models either employ elementary RNNs, which are not expressive enough to handle complex text in multi-task settings, or fine-tune two pre-trained language models (PLMs) for every downstream task, which consumes substantial resources. In this paper, we propose the first VAE framework empowered with adaptive GPT-2s (AdaVAE). Different from existing systems, we unify both the encoder \& decoder of the VAE model using GPT-2s with adaptive parameter-efficient components, and further introduce a Latent Attention operation to better construct the latent space from transformer models. Experiments from multiple dimensions validate that AdaVAE effectively organizes language in three related tasks (language modeling, representation modeling and guided text generation) even with less than $15\%$ of parameters activated during training. Our code is available at \url{https://github.com/ImKeTT/AdaVAE}.
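To make the Latent Attention operation mentioned above concrete, the following is a minimal, hedged sketch (not the authors' released implementation): it assumes a learnable query attends over the GPT-2 encoder's token-level hidden states to pool a sentence summary, which is then projected to the mean and log-variance of the Gaussian posterior. Module and parameter names such as \texttt{LatentAttention}, \texttt{hidden\_size}, and \texttt{latent\_size} are illustrative assumptions.

\begin{verbatim}
# Hedged sketch of a latent-attention pooling module for a transformer VAE.
import torch
import torch.nn as nn

class LatentAttention(nn.Module):
    def __init__(self, hidden_size: int = 768, latent_size: int = 32):
        super().__init__()
        # Learnable query summarizing the encoder's token representations.
        self.query = nn.Parameter(torch.randn(1, 1, hidden_size))
        self.key = nn.Linear(hidden_size, hidden_size)
        self.value = nn.Linear(hidden_size, hidden_size)
        # Project the pooled summary to Gaussian posterior parameters.
        self.to_mean = nn.Linear(hidden_size, latent_size)
        self.to_logvar = nn.Linear(hidden_size, latent_size)

    def forward(self, encoder_hidden: torch.Tensor):
        # encoder_hidden: (batch, seq_len, hidden_size) from the GPT-2 encoder.
        b, _, d = encoder_hidden.shape
        q = self.query.expand(b, -1, -1)                 # (b, 1, d)
        k = self.key(encoder_hidden)                     # (b, seq, d)
        v = self.value(encoder_hidden)                   # (b, seq, d)
        attn = torch.softmax(q @ k.transpose(1, 2) / d ** 0.5, dim=-1)
        pooled = (attn @ v).squeeze(1)                   # (b, d)
        mean = self.to_mean(pooled)
        logvar = self.to_logvar(pooled)
        # Reparameterization trick: z = mean + sigma * eps.
        z = mean + torch.exp(0.5 * logvar) * torch.randn_like(mean)
        return z, mean, logvar
\end{verbatim}

Under these assumptions, the pooled latent $z$ would be injected into the adapter-equipped GPT-2 decoder, while only the adapter and latent-space parameters are updated during training.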