It has long been known that sparsity is an effective inductive bias for learning efficient representations of data in fixed-dimensional vectors, and it has been explored in many areas of representation learning. Of particular interest to this work is sparsity within the VAE framework, which has been studied extensively in the image domain but has received little attention in NLP. NLP also lags behind in learning sparse representations of larger units of text, e.g., sentences. We address these shortcomings with VAEs that induce sparse latent representations of such units. First, we measure the success of the unsupervised state-of-the-art (SOTA) and other strong VAE-based sparsification baselines on text, and propose a hierarchical sparse VAE model to address the stability issue of the SOTA. Then, we examine the implications of sparsity for text classification across three datasets, and highlight a link between the performance of sparse latent representations on downstream tasks and their ability to encode task-related information.
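To make the setup concrete, the following is a minimal sketch of a sentence VAE whose latent code is pushed toward sparsity. It is not the paper's exact model or the SOTA baseline it refers to: the soft gating mechanism, the L1-style penalty on the gates, and all layer sizes and weights here are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseSentenceVAE(nn.Module):
    """Minimal sketch (assumed architecture, not the paper's HSVAE):
    a GRU sentence VAE with a per-dimension soft gate that shrinks
    inactive latent dimensions toward zero."""

    def __init__(self, vocab_size, emb_dim=256, hid_dim=512, lat_dim=32):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.to_mu = nn.Linear(hid_dim, lat_dim)
        self.to_logvar = nn.Linear(hid_dim, lat_dim)
        self.to_gate = nn.Linear(hid_dim, lat_dim)  # logits for per-dim gates
        self.z_to_h = nn.Linear(lat_dim, hid_dim)
        self.decoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, tokens):
        # Encode the sentence into a Gaussian posterior over z.
        _, h = self.encoder(self.emb(tokens))       # h: (1, B, hid_dim)
        h = h.squeeze(0)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterisation

        # Soft gate in (0, 1) per latent dimension; multiplying z by it
        # drives unused dimensions toward zero, inducing sparsity.
        gate = torch.sigmoid(self.to_gate(h))
        z = z * gate

        # Decode autoregressively, conditioning the initial state on z.
        h0 = torch.tanh(self.z_to_h(z)).unsqueeze(0)
        dec_out, _ = self.decoder(self.emb(tokens[:, :-1]), h0)
        logits = self.out(dec_out)

        recon = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                                tokens[:, 1:].reshape(-1))
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        sparsity = gate.abs().mean()  # L1-style pressure on the gates (assumed)
        return recon + kl + 0.1 * sparsity
```

Under this sketch, the gated posterior mean `mu * gate` would serve as the sparse sentence representation fed to the downstream classifiers.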