The massive growth of digital biomedical data is making biomedical text indexing and classification increasingly important. Accordingly, previous research has devised numerous deep learning techniques based on feedforward, convolutional, or recurrent neural architectures. More recently, fine-tuned transformer-based pretrained models (PTMs) have demonstrated superior performance over such models on many natural language processing tasks. However, the direct use of PTMs in the biomedical domain typically encodes only the target documents, ignoring the rich semantic information in the label descriptions. In this paper, we develop an improved label attention-based architecture that injects semantic label descriptions into the fine-tuning process of PTMs. Results on two public medical datasets show that the proposed fine-tuning scheme outperforms conventionally fine-tuned PTMs and prior state-of-the-art models. Furthermore, an interpretability study shows that fine-tuning with the label attention mechanism yields interpretable predictions.
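The abstract does not spell out the architecture, so the following is a minimal sketch of one way label descriptions can be injected through a label attention layer on top of a PTM encoder. The class, parameter names, shapes, and the choice to learn one logit per label are illustrative assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn as nn


class LabelAttentionClassifier(nn.Module):
    """Hypothetical sketch: label attention over PTM token representations.

    `encoder` is assumed to be a BERT-style pretrained transformer whose
    forward pass returns an object with `last_hidden_state`. The label
    embeddings would typically be initialised from encoded label
    descriptions (an assumption, not specified in the abstract).
    """

    def __init__(self, encoder, hidden_size, num_labels):
        super().__init__()
        self.encoder = encoder
        # one embedding per label; in practice these could be seeded with
        # the PTM encoding of each label's textual description
        self.label_emb = nn.Parameter(torch.randn(num_labels, hidden_size) * 0.02)
        self.classifier = nn.Linear(hidden_size, 1)

    def forward(self, input_ids, attention_mask):
        # token-level representations from the PTM: (batch, seq_len, hidden)
        H = self.encoder(input_ids=input_ids,
                         attention_mask=attention_mask).last_hidden_state
        # similarity between each label and each token: (batch, labels, seq_len)
        scores = torch.einsum("lh,bsh->bls", self.label_emb, H)
        scores = scores.masked_fill(attention_mask.unsqueeze(1) == 0, -1e9)
        alpha = torch.softmax(scores, dim=-1)
        # label-specific document representations: (batch, labels, hidden)
        D = torch.einsum("bls,bsh->blh", alpha, H)
        # one logit per label for multi-label classification
        logits = self.classifier(D).squeeze(-1)
        # alpha can be inspected for the kind of interpretability analysis
        # mentioned in the abstract
        return logits, alpha
```

In such a setup the whole model (encoder, label embeddings, and classifier head) would be fine-tuned jointly with a multi-label loss such as binary cross-entropy, and the attention weights `alpha` indicate which tokens each label attended to.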