Identifying, classifying, and analyzing arguments in legal discourse has been a prominent area of research since the inception of the argument mining field. However, there has been a major discrepancy between the way natural language processing (NLP) researchers model and annotate arguments in court decisions and the way legal experts understand and analyze legal argumentation. While computational approaches typically simplify arguments into generic premises and claims, arguments in legal research usually exhibit a rich typology that is important for gaining insights into the particular case and applications of law in general. We address this problem and make several substantial contributions to move the field forward. First, we design a new annotation scheme for legal arguments in proceedings of the European Court of Human Rights (ECHR) that is deeply rooted in the theory and practice of legal argumentation research. Second, we compile and annotate a large corpus of 373 court decisions (2.3M tokens and 15k annotated argument spans). Finally, we train an argument mining model that outperforms state-of-the-art models in the legal NLP domain and provide a thorough expert-based evaluation. All datasets and source code are available under open licenses at https://github.com/trusthlt/mining-legal-arguments.