Recently, discrete latent variable models have received a surge of interest in both Natural Language Processing (NLP) and Computer Vision (CV), owing to performance comparable to their continuous counterparts in representation learning while being more interpretable in their predictions. In this paper, we develop a topic-informed discrete latent variable model for semantic textual similarity, which learns a shared latent space for sentence-pair representation via vector quantization. Compared with previous models limited to local semantic contexts, our model can explore richer semantic information via topic modeling. We further boost semantic similarity performance by injecting the quantized representation into a transformer-based language model through a carefully designed semantic-driven attention mechanism. We demonstrate, through extensive experiments across various English-language datasets, that our model surpasses several strong neural baselines on semantic textual similarity tasks.
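To make the vector-quantization step concrete, the following is a minimal sketch of how a continuous sentence-pair encoding can be mapped to a discrete latent code via a learned codebook with a straight-through gradient estimator. This is an illustrative assumption, not the paper's actual implementation; the class name `VectorQuantizer` and parameters such as `codebook_size` are hypothetical.

```python
import torch
import torch.nn as nn

class VectorQuantizer(nn.Module):
    """Minimal sketch: quantize a continuous encoding to its nearest codebook entry."""

    def __init__(self, codebook_size: int, dim: int):
        super().__init__()
        # Codebook of discrete latent embeddings (hypothetical size and dimension).
        self.codebook = nn.Embedding(codebook_size, dim)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # z: (batch, dim) continuous sentence-pair encoding.
        dists = torch.cdist(z, self.codebook.weight)   # (batch, codebook_size)
        codes = dists.argmin(dim=-1)                    # discrete latent indices
        quantized = self.codebook(codes)                # (batch, dim) quantized vectors
        # Straight-through estimator: forward pass uses the quantized vector,
        # backward pass copies gradients to the continuous encoding.
        return z + (quantized - z).detach()
```

The quantized output could then, under this sketch, be fed into a transformer-based language model as an additional input to an attention mechanism, as described above.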