Purpose: To stabilize the NLPContributionGraph scheme for the surface structuring of contribution information in Natural Language Processing (NLP) scholarly articles via a two-stage annotation methodology: a first stage to define the scheme, and a second stage to stabilize the graphing model. Approach: We re-annotate the contribution-pertinent information across 50 previously annotated NLP scholarly articles in terms of a data pipeline comprising contribution-centered sentences, phrases, and triples. Specifically, care was taken in the second annotation stage to reduce annotation noise while formulating the guidelines for our proposed novel NLP contributions structuring scheme. Findings: Applying NLPContributionGraph to the 50 articles resulted in a dataset of 900 contribution-focused sentences, 4,702 contribution-information-centered phrases, and 2,980 surface-structured triples. The intra-annotation agreement between the first and second stages, in terms of F1, was 67.92% for sentences, 41.82% for phrases, and 22.31% for triples, indicating that annotation decision variance grows as the granularity of the information increases. Practical Implications: We demonstrate NLPContributionGraph data integrated into the Open Research Knowledge Graph (ORKG), a next-generation KG-based digital library with compute enabled over structured scholarly knowledge, as a viable aid to assist researchers in their day-to-day tasks. Value: NLPContributionGraph is a novel scheme for obtaining research contribution-centered graphs from NLP articles, which, to the best of our knowledge, does not yet exist in the community. Furthermore, our quantitative evaluations over the two-stage annotation tasks offer insights into task difficulty.
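The F1-based agreement reported in the Findings can be illustrated with a minimal sketch. It assumes exact-match comparison between the two stages' annotation sets, with the second stage treated as the reference. The abstract does not state the matching criterion, so this is an assumption rather than the authors' evaluation procedure. The function name `f1_agreement` and the toy annotations are hypothetical and are not drawn from the dataset.

```python
def f1_agreement(stage1, stage2):
    """F1 between two annotation sets, treating stage2 as the reference.

    Assumption: items agree only on exact match (set membership).
    """
    first, second = set(stage1), set(stage2)
    if not first or not second:
        return 0.0
    overlap = len(first & second)
    if overlap == 0:
        return 0.0
    precision = overlap / len(first)   # share of stage-1 items kept in stage 2
    recall = overlap / len(second)     # share of stage-2 items already in stage 1
    return 2 * precision * recall / (precision + recall)


# Toy, invented annotations (not taken from the actual dataset).
stage1_sentences = {"s1", "s2", "s3", "s4"}
stage2_sentences = {"s2", "s3", "s4", "s5", "s6"}

stage1_triples = {
    ("model", "has", "encoder"),
    ("model", "uses", "attention"),
}
stage2_triples = {
    ("model", "has", "encoder"),
    ("model", "uses", "self-attention"),
    ("encoder", "has", "layers"),
}

sentence_f1 = f1_agreement(stage1_sentences, stage2_sentences)
triple_f1 = f1_agreement(stage1_triples, stage2_triples)

print(f"sentence F1: {sentence_f1:.4f}")
print(f"triple F1:   {triple_f1:.4f}")
```

In this toy case the triples agree less than the sentences, because a single changed word in one element makes the whole triple a mismatch under exact matching. That is consistent with the trend the abstract reports for finer-grained units.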