A Multi-Task Benchmark for Korean Legal Language Understanding and Judgement Prediction

Recent advances in deep learning have dramatically changed how machine learning, especially natural language processing, can be applied to the legal domain. However, this shift to data-driven approaches calls for larger and more diverse datasets, which remain scarce, especially for non-English languages. Here we present LBOX OPEN, the first large-scale benchmark of Korean legal AI datasets, which consists of one legal corpus, two classification tasks, two legal judgement prediction (LJP) tasks, and one summarization task. The legal corpus consists of 147k Korean precedents (259M tokens), of which 63k were sentenced in the last 4 years and 96k are from the first- and second-level courts, in which factual issues are reviewed. The two classification tasks are case name prediction (11.3k examples) and statute prediction (2.8k) from the factual description of individual cases. The LJP tasks consist of (1) 10.5k criminal examples where the model is asked to predict the ranges of the fine amount, imprisonment with labor, and imprisonment without labor for the given facts, and (2) 4.7k civil examples where the inputs are the facts and the claim for relief and the outputs are the degrees of claim acceptance. The summarization task consists of Supreme Court precedents and their corresponding summaries (20k). We also release realistic variants of the datasets by extending the domain (1) to infrequent case categories in the case name (31k examples) and statute (17.7k) classification tasks, and (2) to long input sequences in the summarization task (51k). Finally, we release LCUBE, the first Korean legal language model trained on the legal corpus from this study. Given the uniqueness of South Korean law and the diversity of the legal tasks covered in this work, we believe that LBOX OPEN contributes to the multilinguality of global legal research. LBOX OPEN and LCUBE will be publicly available.
