Legal documents are typically long and written in legalese, which makes it particularly difficult for laypeople to understand their rights and duties. While natural language understanding technologies can be valuable in supporting such understanding, the limited availability of datasets annotated for deontic modalities in the legal domain, due to the cost of hiring experts and privacy issues, is a bottleneck. To this end, we introduce LEXDEMOD, a corpus of English contracts annotated with deontic modality expressed with respect to a contracting party or agent, along with the modal triggers. We benchmark this dataset on two tasks: (i) agent-specific multi-label deontic modality classification, and (ii) agent-specific deontic modality and trigger span detection, using Transformer-based (Vaswani et al., 2017) language models. Transfer learning experiments show that the linguistic diversity of modal expressions in LEXDEMOD generalizes reasonably from lease to employment and rental agreements. A small case study indicates that a model trained on LEXDEMOD can detect red flags with high recall. We believe our work offers a new research direction for deontic modality detection in the legal domain.