We survey Natural Language Processing (NLP) approaches to summarizing, simplifying, and generating patent text. Solving these tasks has important practical applications, given the centrality of patents in the R&D process, but the idiosyncrasies of patents pose distinctive challenges for the current NLP state of the art. This survey aims to a) describe the characteristics of patents and the questions they raise for current NLP systems, b) critically present previous work and its evolution, and c) draw attention to research directions in which further work is needed. To the best of our knowledge, this is the first survey of generative approaches in the patent domain.