Deep neural networks built by stacking layers have historically suffered from poor inherent interpretability. Symbolic probabilistic models, by contrast, offer clear interpretability, but how to combine them with neural networks to improve performance remains underexplored. In this paper, we marry these two paradigms for text classification via a structured language model. We propose a Symbolic-Neural model that learns to explicitly predict class labels for the text spans of a constituency tree without access to any span-level gold labels. Because the structured language model learns to predict constituency trees in a self-supervised manner, only raw text and sentence-level labels are required as training data, making the approach a general, constituent-level, self-interpretable classification model. Our experiments demonstrate that the approach achieves good prediction accuracy on downstream tasks, while the predicted span labels are consistent with human rationales to a considerable degree.
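To make the core idea concrete, here is a minimal, purely illustrative sketch (not the authors' implementation): every span of a given constituency tree is scored by a shared classifier, while supervision would only ever touch the root (sentence-level) prediction. The tree, the toy lexicon-based scorer, and all names below are hypothetical stand-ins for the learned neural components.

```python
# Hypothetical sketch: label every constituency-tree span with one shared
# scorer; in the real model only the root score is supervised.
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Span:
    start: int                      # inclusive token index
    end: int                        # exclusive token index
    children: List["Span"] = field(default_factory=list)


def enumerate_spans(node: Span) -> List[Span]:
    """Collect every span in the tree (post-order: children before parent)."""
    spans: List[Span] = []
    for child in node.children:
        spans.extend(enumerate_spans(child))
    spans.append(node)
    return spans


def span_scores(tokens: List[str], span: Span,
                lexicon: Dict[str, List[float]]) -> List[float]:
    """Toy stand-in for a neural span classifier: average the per-token
    class scores inside the span; unknown tokens contribute zeros."""
    n_classes = len(next(iter(lexicon.values())))
    totals = [0.0] * n_classes
    covered = tokens[span.start:span.end]
    for tok in covered:
        for i, s in enumerate(lexicon.get(tok, [0.0] * n_classes)):
            totals[i] += s
    return [t / max(len(covered), 1) for t in totals]


# Toy sentence "not very good" with tree (not (very good)); class order
# is [negative, positive].
tokens = ["not", "very", "good"]
tree = Span(0, 3, [Span(0, 1), Span(1, 3, [Span(1, 2), Span(2, 3)])])
lexicon = {"good": [0.0, 1.0], "not": [1.0, 0.0]}

for sp in enumerate_spans(tree):
    scores = span_scores(tokens, sp, lexicon)
    label = "pos" if scores[1] > scores[0] else "neg"
    print(tokens[sp.start:sp.end], scores, label)
```

The sketch yields a label for every constituent, which is what makes the model self-interpretable at the span level; the paper's contribution is training such a scorer jointly with a self-supervised structured language model so that both the tree and the span labels emerge from sentence-level supervision alone.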