Multi-label text classification (MLTC) is an attractive and challenging task in natural language processing (NLP). Compared with single-label text classification, MLTC has a wider range of applications in practice. In this paper, we propose a label-interpretable graph convolutional network model to solve the MLTC problem by modeling tokens and labels as nodes in a heterogeneous graph. In this way, we are able to take into account multiple relationships including token-level relationships. Besides, the model allows better interpretability for predicted labels as the token-label edges are exposed. We evaluate our method on four real-world datasets and it achieves competitive scores against selected baseline methods. Specifically, this model achieves a gain of 0.14 on the F1 score in the small label set MLTC, and 0.07 in the large label set scenario.
翻译:多标签文本分类(MLTC)在自然语言处理(NLP)中是一项有吸引力和具有挑战性的任务。与单一标签文本分类相比,MLTC在实际应用中具有更广泛的应用范围。在本文中,我们建议采用一个标签解释图解演变网络模型,通过将标牌和标签建模作为多元图中的节点来解决MLTC问题。这样,我们就能够考虑到多种关系,包括象征性级别关系。此外,该模型允许预测的标签有更好的解释性,因为象征性标签边缘被暴露了。我们评估了四个真实世界数据集的方法,并根据选定的基准方法取得了竞争性分数。具体地说,这一模型在小标签集MLTC的F1分上取得了0.14分,在大标签集情景中获得了0.07分。