Models based on large pretrained language models, such as S(entence)BERT, provide effective and efficient sentence embeddings that show high correlation with human similarity ratings, but lack interpretability. On the other hand, graph metrics for graph-based meaning representations (e.g., Abstract Meaning Representation, AMR) can make explicit the semantic aspects in which two sentences are similar. However, such metrics tend to be slow, rely on parsers, and do not reach state-of-the-art performance when rating sentence similarity. In this work, we aim at the best of both worlds, by learning to induce $S$emantically $S$tructured $S$entence BERT embeddings (S$^3$BERT). Our S$^3$BERT embeddings are composed of explainable sub-embeddings that emphasize various semantic sentence features (e.g., semantic roles, negation, or quantification). We show how to i) learn a decomposition of the sentence embeddings into semantic features, through approximation of a suite of interpretable AMR graph metrics, and how to ii) preserve the overall power of the neural embeddings by controlling the decomposition learning process with a second objective that enforces consistency with the similarity ratings of an SBERT teacher model. In our experimental studies, we show that our approach offers interpretability, while fully preserving the effectiveness and efficiency of the neural sentence embeddings.
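To make the two objectives concrete, below is a minimal sketch of an S$^3$BERT-style training loss in PyTorch. It assumes equal-sized sub-embeddings, pre-computed target scores from a suite of AMR graph metrics, and similarity ratings from a frozen SBERT teacher; the feature count, dimensions, and loss weighting are illustrative assumptions, not the authors' exact configuration.

```python
# Sketch of the two-objective S^3BERT training loss (assumptions noted above).
import torch
import torch.nn.functional as F

EMBED_DIM = 768          # dimension of the student sentence embedding
N_FEATURES = 8           # e.g., semantic roles, negation, quantification, ...
SUB_DIM = EMBED_DIM // N_FEATURES  # equal-sized sub-embeddings (assumption)

def feature_similarities(emb_a: torch.Tensor, emb_b: torch.Tensor) -> torch.Tensor:
    """Cosine similarity per sub-embedding -> one score per semantic feature."""
    a = emb_a.view(-1, N_FEATURES, SUB_DIM)
    b = emb_b.view(-1, N_FEATURES, SUB_DIM)
    return F.cosine_similarity(a, b, dim=-1)          # shape: (batch, N_FEATURES)

def s3bert_loss(emb_a: torch.Tensor,
                emb_b: torch.Tensor,
                amr_metric_scores: torch.Tensor,      # (batch, N_FEATURES) metric targets
                teacher_sim: torch.Tensor,            # (batch,) frozen SBERT teacher ratings
                alpha: float = 1.0) -> torch.Tensor:
    # i) decomposition objective: each sub-embedding's similarity should
    #    approximate the corresponding interpretable AMR graph metric.
    decomp_loss = F.mse_loss(feature_similarities(emb_a, emb_b),
                             amr_metric_scores)
    # ii) consistency objective: the full-embedding similarity should stay
    #     consistent with the teacher's rating, preserving overall power.
    student_sim = F.cosine_similarity(emb_a, emb_b, dim=-1)
    consistency_loss = F.mse_loss(student_sim, teacher_sim)
    return decomp_loss + alpha * consistency_loss
```

In training, `emb_a` and `emb_b` would come from the student sentence encoder for a pair of input sentences; after convergence, slicing an embedding into its `N_FEATURES` sub-vectors yields the explainable, feature-specific similarity scores.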