Hierarchical text classification aims to leverage the label hierarchy in multi-label text classification. Existing methods encode the label hierarchy from a global view, treating it as a static hierarchical structure containing all labels. Because the global hierarchy is static and independent of text samples, these methods struggle to fully exploit hierarchical information. In contrast, the local hierarchy is the structured subset of the label hierarchy corresponding to each text sample; it is dynamic and sample-specific, yet it is ignored by previous methods. To exploit both global and local hierarchies, we propose Hierarchy-guided BERT with Global and Local hierarchies (HBGL), which utilizes the large-scale parameters and prior language knowledge of BERT to model both hierarchies. Moreover, HBGL avoids the deliberate fusion of separate semantic and hierarchical modules by modeling semantic and hierarchical information directly with BERT. Compared with the state-of-the-art method HGCLR, our method achieves significant improvements on three benchmark datasets.
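To make the global/local distinction above concrete, the following is a minimal sketch (not the authors' implementation, and the toy taxonomy is hypothetical): the global hierarchy is the fixed taxonomy over all labels, while a sample's local hierarchy is the sub-hierarchy induced by that sample's gold labels and their ancestors.

```python
from typing import Dict, Optional, Set

# Global hierarchy: child label -> parent label (None for root-level labels).
# This toy taxonomy is an assumption for illustration only.
GLOBAL_PARENT: Dict[str, Optional[str]] = {
    "Sports": None,
    "Soccer": "Sports",
    "Tennis": "Sports",
    "Politics": None,
    "Elections": "Politics",
}

def local_hierarchy(gold_labels: Set[str]) -> Dict[str, Optional[str]]:
    """Return the per-sample sub-hierarchy: gold labels plus all their ancestors."""
    keep: Set[str] = set()
    for label in gold_labels:
        current: Optional[str] = label
        while current is not None:
            keep.add(current)
            current = GLOBAL_PARENT[current]
    return {child: parent for child, parent in GLOBAL_PARENT.items() if child in keep}

if __name__ == "__main__":
    # A document tagged with the leaf label "Soccer" has the dynamic,
    # sample-specific local hierarchy {Sports -> Soccer}, while the
    # global hierarchy stays the same for every document.
    print(local_hierarchy({"Soccer"}))
```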