In this work, we present a web application that highlights the output of NLP models trained to parse and label discourse segments in legal text. The system is built primarily for journalists and legal interpreters, and it focuses on state-level laws that use U.S. Census population numbers to allocate resources and organize government. It exposes a corpus of 6,000 state-level laws pertaining to the U.S. Census, which we collected with 25 scrapers that we built to crawl state law websites, which we release. We also build a novel, flexible annotation framework that handles span tagging and relation tagging on arbitrary input text and can be embedded easily in any webpage. This framework allows journalists and researchers to extend our annotation database by correcting annotations and tagging new data.
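To make the two annotation types concrete, the following is a minimal sketch of one way span tags and relation tags over a single text document could be represented. It is not the data model of the framework itself: the class names, the character-offset convention, and the labels `POPULATION_TEST`, `CONSEQUENCE`, and `TRIGGERS` are all assumptions chosen for illustration, and the example sentence is invented.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Span:
    """A labeled character range [start, end) in the document text."""
    span_id: str
    start: int
    end: int
    label: str


@dataclass(frozen=True)
class Relation:
    """A labeled, directed link from one span to another."""
    head_id: str
    tail_id: str
    label: str


@dataclass
class AnnotatedDocument:
    """Hypothetical container: raw text plus span and relation annotations."""
    text: str
    spans: dict = field(default_factory=dict)
    relations: list = field(default_factory=list)

    def add_span(self, start, end, label):
        # Reject empty ranges and offsets that fall outside the text.
        if not (0 <= start < end <= len(self.text)):
            raise ValueError("span offsets fall outside the document")
        span_id = f"s{len(self.spans)}"
        self.spans[span_id] = Span(span_id, start, end, label)
        return span_id

    def add_relation(self, head_id, tail_id, label):
        # A relation may only connect spans that already exist.
        if head_id not in self.spans or tail_id not in self.spans:
            raise KeyError("relation refers to an unknown span")
        self.relations.append(Relation(head_id, tail_id, label))

    def span_text(self, span_id):
        span = self.spans[span_id]
        return self.text[span.start:span.end]


# Invented example sentence; the labels are illustrative only.
text = "Any city with a population of more than 50,000 shall appoint a census commission."
condition = "a population of more than 50,000"
consequence = "shall appoint a census commission"

doc = AnnotatedDocument(text)
c_start = text.index(condition)
q_start = text.index(consequence)
cond_id = doc.add_span(c_start, c_start + len(condition), "POPULATION_TEST")
cons_id = doc.add_span(q_start, q_start + len(consequence), "CONSEQUENCE")
doc.add_relation(cond_id, cons_id, "TRIGGERS")
```

Storing spans as character offsets rather than copied strings keeps annotations tied to the original text, so a correction made by a journalist or researcher can replace a span's boundaries or label without touching the document itself.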