Extreme multi-label text classification uses the label hierarchy to partition an extremely large label set into multiple label groups, turning the task into several simpler multi-group multi-label classification tasks. Current research encodes labels as fixed-length vectors, which requires establishing a separate classifier for each label group. The problem is how to build a single classifier without sacrificing the label relationships in the hierarchy. This paper reformulates extreme multi-label classification as a multi-answer questioning task and also proposes an auxiliary classification evaluation metric. The proposed method and metric are applied to the legal domain. The use of legal BERT models and a study of task distribution are discussed. The experimental results show that the proposed hierarchy and multi-answer questioning task can perform extreme multi-label classification on the EURLEX dataset. When fine-tuning on the multi-label classification task, the domain-adapted BERT models showed no apparent advantage in this experiment. The method is also theoretically applicable to zero-shot learning.
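As a rough illustration of the reformulation summarized above, the following minimal sketch shows how a label hierarchy could partition labels into groups and how each group could be posed as a question to a single shared model. The toy hierarchy, the question template, and all function names are illustrative assumptions, not the paper's actual setup or the EURLEX label set.

```python
from collections import defaultdict

# Hypothetical toy hierarchy: child label -> parent label (not taken from EURLEX).
PARENT = {
    "agriculture": "ROOT",
    "trade": "ROOT",
    "dairy": "agriculture",
    "fisheries": "agriculture",
    "tariffs": "trade",
    "imports": "trade",
}


def group_by_parent(parent_of):
    """Partition labels into groups that share the same parent node."""
    groups = defaultdict(list)
    for child, parent in parent_of.items():
        groups[parent].append(child)
    return {parent: sorted(children) for parent, children in groups.items()}


def ancestors_closure(labels, parent_of):
    """Return the given labels together with all their ancestors (ROOT excluded)."""
    closed = set()
    for label in labels:
        while label in parent_of:
            closed.add(label)
            label = parent_of[label]
    return closed


def build_questions(text, gold_labels, parent_of):
    """Create one (question, context, answers) example per group on the gold paths."""
    groups = group_by_parent(parent_of)
    closed = ancestors_closure(gold_labels, parent_of)
    examples = []
    for parent in ["ROOT"] + sorted(closed):
        if parent not in groups:
            continue  # leaf label: it has no child group to ask about
        candidates = groups[parent]
        answers = [c for c in candidates if c in closed]
        examples.append({
            # Assumed question template; any phrasing listing the group would do.
            "question": f"Which of {', '.join(candidates)} apply under {parent}?",
            "context": text,
            "answers": answers,
        })
    return examples


examples = build_questions(
    "Regulation on milk quotas and customs duties.",
    ["dairy", "tariffs"],
    PARENT,
)
for example in examples:
    print(example["question"], "->", example["answers"])
```

Because every label group is expressed as a question over the same text, one model can in principle serve all groups, and a question can have zero, one, or several answers. A new label would only change the question text, which is one way to read the abstract's remark about zero-shot applicability.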