Short text classification is one of the important tasks in Natural Language Processing (NLP). Unlike paragraphs or documents, short texts are more ambiguous because they lack sufficient contextual information, which poses a great challenge for classification. In this paper, we retrieve knowledge from an external knowledge source to enhance the semantic representation of short texts. We take conceptual information as a kind of knowledge and incorporate it into deep neural networks. To measure the importance of knowledge, we introduce attention mechanisms and propose deep Short Text Classification with Knowledge powered Attention (STCKA). We utilize Concept towards Short Text (C-ST) attention and Concept towards Concept Set (C-CS) attention to acquire the weight of each concept from two aspects, and we classify a short text with the help of conceptual information. Unlike traditional approaches, our model acts like a human being who has the intrinsic ability to make decisions based on observation (i.e., training data for machines) and pays more attention to important knowledge. We also conduct extensive experiments on four public datasets for different tasks. The experimental results and case studies show that our model outperforms the state-of-the-art methods, justifying the effectiveness of knowledge powered attention.
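The idea of weighting concepts by two complementary attention scores can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the dot-product scoring functions, the mean-pooled concept-set representation, and the mixing coefficient `lam` are all simplifying assumptions made here for clarity.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax
    e = np.exp(x - x.max())
    return e / e.sum()

def concept_attention(text_vec, concepts, lam=0.5):
    """Combine two attention scores over a concept set (illustrative sketch).

    text_vec: (d,) vector representing the short text
    concepts: (n, d) matrix of concept embeddings
    lam:      assumed mixing coefficient between the two scores
    """
    d = text_vec.size
    # C-ST-style score: relevance of each concept to the short text
    alpha = softmax(concepts @ text_vec / np.sqrt(d))
    # C-CS-style score: relevance of each concept to the whole concept set,
    # approximated here by similarity to the set's mean vector
    centroid = concepts.mean(axis=0)
    beta = softmax(concepts @ centroid / np.sqrt(d))
    # final attention weights and knowledge-enhanced representation
    weights = softmax(lam * alpha + (1 - lam) * beta)
    return weights @ concepts, weights
```

The combined representation can then be concatenated with the short-text encoding and fed to a classifier; a concept that matches both the text and the rest of the set receives the largest weight.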