We highlight an interesting trend to contribute to the ongoing debate around advances in legal Natural Language Processing. Recently, the focus for most legal text classification tasks has shifted towards large pre-trained deep learning models such as BERT. In this paper, we show that a more traditional approach based on Support Vector Machine classifiers reaches performance competitive with deep learning models. We also show that the error reduction obtained by using specialised BERT-based models over baselines is noticeably smaller in the legal domain than in general language tasks. We discuss some hypotheses for these results to inform future work.
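For readers less familiar with the "more traditional approach" the abstract refers to, the following is a minimal sketch of a typical SVM text-classification pipeline in scikit-learn. The feature choices (TF-IDF over word n-grams) and the toy legal snippets are illustrative assumptions, not the exact configuration evaluated in the paper.

```python
# Minimal sketch of a traditional SVM baseline for legal text classification.
# The feature setup and toy data below are assumptions for illustration only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy training data: legal-style snippets with hypothetical labels.
train_texts = [
    "The appellant seeks review of the lower court's ruling.",
    "The parties agree to binding arbitration of all disputes.",
]
train_labels = ["appeal", "contract"]

# TF-IDF features feeding a linear-kernel SVM, a common strong baseline
# for text classification before large pre-trained models.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), sublinear_tf=True),
    LinearSVC(),
)
model.fit(train_texts, train_labels)

print(model.predict(["The court dismissed the appeal."]))
```

A pipeline of this kind trains in seconds on CPU, which is part of why such baselines remain attractive when their accuracy is close to that of BERT-based models.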