Patent retrieval supports several applications in engineering design research, education, and practice, as well as applications concerning innovation, intellectual property, and knowledge management. In this article, we propose a method to retrieve patents relevant to an initial set of patents by synthesizing state-of-the-art techniques from natural language processing and knowledge graph embedding. Our method involves a patent embedding that captures text, citation, and inventor information, each of which represents a different facet of the knowledge communicated through a patent document. We obtain text embeddings by applying Sentence-BERT to titles and abstracts. We obtain citation and inventor embeddings using TransE trained on the corresponding knowledge graphs. Using a classification task, we find that the concatenation of text, citation, and inventor embeddings offers a plausible representation of a patent. While the proposed patent embedding could be used to associate a pair of patents, we observe using a recall task that multiple initial patents can be associated with a target patent through mean cosine similarity, which can then be used to rank all target patents and retrieve the most relevant ones. We apply the proposed patent retrieval method to a set of patents corresponding to a product family and an inventor's portfolio.
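
The retrieval step described above (concatenate the three facet embeddings, then score each target patent by its mean cosine similarity to the initial patents) can be illustrated with a minimal sketch. It assumes the text, citation, and inventor embeddings have already been computed elsewhere (for example with Sentence-BERT and TransE) and are available as NumPy arrays. The function names `patent_embedding` and `rank_targets`, the embedding dimensions, and the random toy data are hypothetical and are not taken from the article.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so a dot product equals cosine similarity."""
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norms, eps)


def patent_embedding(text_emb, citation_emb, inventor_emb):
    """Concatenate the three facet embeddings row-wise into one patent embedding.

    Assumption: plain concatenation with no per-facet weighting or scaling.
    """
    return np.concatenate([text_emb, citation_emb, inventor_emb], axis=-1)


def rank_targets(initial, targets, top_k=5):
    """Rank target patents by mean cosine similarity to the initial patents.

    initial: array of shape (n_initial, d); targets: array of shape (n_targets, d).
    Returns the indices of the top_k targets and their mean similarity scores.
    """
    sims = l2_normalize(targets) @ l2_normalize(initial).T  # (n_targets, n_initial)
    scores = sims.mean(axis=1)                               # mean over initial patents
    order = np.argsort(-scores)[:top_k]                      # highest score first
    return order, scores[order]


# Toy data only: random vectors stand in for precomputed facet embeddings.
rng = np.random.default_rng(0)
initial_set = patent_embedding(
    rng.normal(size=(3, 8)), rng.normal(size=(3, 4)), rng.normal(size=(3, 4))
)
target_set = patent_embedding(
    rng.normal(size=(10, 8)), rng.normal(size=(10, 4)), rng.normal(size=(10, 4))
)
top_idx, top_scores = rank_targets(initial_set, target_set, top_k=3)
print(top_idx, top_scores)
```

Averaging over the initial patents means a target scores highly only when it is similar to the set as a whole, not to a single member. Whether the facets should be rescaled before concatenation is a design choice this sketch leaves open.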