Indian Legal Text Summarization: A Text Normalisation-based Approach

Pending cases have long been a problem in the Indian court system: more than 4 crore (40 million) cases are outstanding. Manually summarising hundreds of documents is time-consuming and tedious for legal stakeholders. Many state-of-the-art text summarization models have emerged as machine learning has progressed. However, domain-independent models perform poorly on legal texts, and fine-tuning them for the Indian legal system is difficult because publicly available datasets are lacking. To improve the performance of domain-independent models, the authors propose a methodology for normalising legal texts in the Indian context. They experiment with two state-of-the-art domain-independent models, BART and PEGASUS, applying them to legal text summarization in both extractive and abstractive settings to gauge the effectiveness of the text normalisation approach. The summaries are evaluated by domain experts on multiple parameters and with ROUGE metrics. The results show that the proposed text normalisation approach is effective for summarising legal texts with domain-independent models.
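Since the abstract names ROUGE as the automatic evaluation metric, a minimal sketch of ROUGE-N may help make the idea concrete. It assumes simple lowercased whitespace tokenisation and a single reference summary; the function name `rouge_n` and the example sentences are illustrative only and are not taken from the paper, whose exact evaluation setup is not described here.

```python
from collections import Counter


def rouge_n(reference: str, candidate: str, n: int = 1) -> dict:
    """Return ROUGE-N precision, recall and F1 (assumed whitespace tokens)."""

    def ngrams(text: str) -> Counter:
        # Lowercase and split on whitespace; real toolkits may also stem.
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    ref, cand = ngrams(reference), ngrams(candidate)
    # Counter intersection keeps the minimum count per n-gram (clipped matches).
    overlap = sum((ref & cand).values())
    recall = overlap / sum(ref.values()) if ref else 0.0
    precision = overlap / sum(cand.values()) if cand else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical reference and model summaries, for illustration only.
scores = rouge_n("the court dismissed the appeal", "the court allowed the appeal")
print(scores)
```

Recall measures how much of the reference summary the model output covers, while precision penalises extra material. Published evaluations usually rely on an established ROUGE implementation rather than a hand-written one like this.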

