The inevitable cost and complexity of Transformer-based pre-trained models raise efficiency concerns for long text classification. Meanwhile, in highly sensitive domains such as healthcare and legal long-text mining, potential model distrust, though underrated and underexplored, may raise serious concerns. Existing methods generally segment the long text, encode each segment with the pre-trained model, and use attention or RNNs to aggregate the segments into a long-text representation for classification. In this work, we propose a simple but effective model, the Segment-aWare multIdimensional PErceptron (SWIPE), to replace the attention/RNN component in this framework. Unlike prior efforts, SWIPE effectively learns the label of the entire text through supervised training, while perceiving the labels of individual segments and estimating their contributions to the long-text label in an unsupervised manner. As a general classifier, SWIPE can work with different encoders, and it outperforms state-of-the-art models in both classification accuracy and efficiency. Notably, SWIPE achieves superior interpretability, making long text classification results transparent.
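The segment-then-aggregate framework described above can be illustrated with a minimal sketch. This is a hypothetical toy stand-in, not the actual SWIPE model: the `segment` splitter, the `SegmentAggregator` class, and its softmax-weighted pooling are all illustrative assumptions, chosen only to show how per-segment scores can double as contribution estimates for interpretability.

```python
import numpy as np

def segment(tokens, seg_len):
    """Split a token sequence into fixed-length segments (last may be shorter)."""
    return [tokens[i:i + seg_len] for i in range(0, len(tokens), seg_len)]

class SegmentAggregator:
    """Toy segment-then-aggregate classifier (hypothetical, not SWIPE itself).

    Each segment embedding (e.g., from a frozen pre-trained encoder) is scored
    per class by a shared linear layer; document logits are a softmax-weighted
    sum of segment logits, so the weights serve as per-segment contribution
    estimates without any segment-level supervision.
    """
    def __init__(self, dim, n_classes, rng=None):
        rng = rng or np.random.default_rng(0)
        self.W = rng.normal(scale=0.1, size=(dim, n_classes))

    def forward(self, seg_embs):
        # seg_embs: (n_segments, dim) array of segment embeddings
        logits = seg_embs @ self.W                  # (n_segments, n_classes)
        salience = logits.max(axis=1)               # one score per segment
        w = np.exp(salience - salience.max())
        w /= w.sum()                                # contribution weights, sum to 1
        doc_logits = w @ logits                     # (n_classes,) document logits
        return doc_logits, logits, w
```

In this sketch the contribution weights `w` are read off directly from the aggregation step, which is one simple way a segment-aware classifier can expose which parts of a long document drove its prediction.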