In many jurisdictions, the excessive workload of courts leads to substantial delays. Suitable predictive AI models can assist legal professionals in their work and thereby improve and speed up the process. So far, Legal Judgment Prediction (LJP) datasets have been released in English, French, and Chinese. We publicly release a multilingual (German, French, and Italian), diachronic (2000-2020) corpus of 85K cases from the Federal Supreme Court of Switzerland (FSCS). We evaluate state-of-the-art BERT-based methods, including two variants that overcome BERT's input length limitation of 512 tokens. Hierarchical BERT performs best (approx. 68-70% Macro-F1 in German and French). Furthermore, we study how several factors (canton of origin, year of publication, text length, legal area) affect performance. We release both the benchmark dataset and our code to accelerate future research and ensure reproducibility.
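Hierarchical variants work around the 512-token limit by splitting a long document into fixed-size segments, encoding each segment with BERT independently, and aggregating the segment representations (e.g. with pooling or a second-level encoder). A minimal sketch of the segmentation step, with illustrative function names and parameter values not taken from the paper:

```python
def chunk_tokens(token_ids, max_len=512, stride=512):
    """Split a long token-id sequence into fixed-size segments.

    In a hierarchical setup, each segment is encoded by BERT on its
    own and the per-segment embeddings are then aggregated. `max_len`
    and `stride` here are illustrative defaults; a stride smaller than
    `max_len` would produce overlapping segments.
    """
    if not token_ids:
        return []
    return [token_ids[i:i + max_len] for i in range(0, len(token_ids), stride)]


# A 1300-token court decision becomes three segments of 512, 512, and 276 tokens.
doc = list(range(1300))
segments = chunk_tokens(doc)
print([len(s) for s in segments])  # [512, 512, 276]
```

The aggregation step on top of these segments (mean-pooling, attention, or a small transformer over segment embeddings) is where hierarchical variants differ from plain truncation, which simply discards everything past the first 512 tokens.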