This paper addresses a rapidly growing industrial demand for information extraction from images of documents such as invoices, bills, and receipts. In practice, users can provide only a very small number of example images labeled with the information to be extracted. We adopt a novel two-level neuro-deductive approach in which (a) we use pre-trained deep neural networks to populate a relational database with facts about each document image; and (b) we use a form of deductive reasoning, related to meta-interpretive learning of transition systems, to learn extraction programs. Given task-specific transitions defined using the entities and relations identified by the neural detectors, and a small number of instances (usually one, sometimes two) of images with their desired outputs, a resource-bounded meta-interpreter constructs proofs for the instance(s) via logical deduction; a set of logic programs that extract each desired entity is then readily synthesized from these proofs. In most cases, a single training example together with a noisy clone of itself suffices to learn a program set that generalizes well to test documents; at test time, the value of each entity is determined by a majority vote across its program set. We demonstrate our two-level neuro-deductive approach on publicly available datasets ("Patent" and "Doctor's Bills") and also describe its use in a real-life industrial problem.
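To make the test-time voting step concrete, the following is a minimal Python sketch and not the paper's implementation. It assumes a hypothetical fact format of `(relation, subject, object)` triples standing in for the relational database, and it hand-writes the "programs" as chains of spatial relations instead of synthesizing them from meta-interpreter proofs. All names (`make_program`, `extract_by_vote`, the relation names, the block ids) are illustrative assumptions.

```python
from collections import Counter

# Hypothetical facts, standing in for what neural detectors might record
# about one document image: (relation, subject, object).
facts = [
    ("text", "b1", "Invoice No"),
    ("text", "b2", "INV-042"),
    ("text", "b3", "Date"),
    ("text", "b4", "2020-01-01"),
    ("right_of", "b1", "b2"),  # b2 lies to the right of b1
    ("below", "b1", "b3"),     # b3 lies below b1
    ("right_of", "b3", "b4"),
    ("above", "b4", "b2"),     # b2 lies above b4
]


def lookup(facts, relation, subject):
    """Return the first object o with (relation, subject, o) in facts, else None."""
    for rel, subj, obj in facts:
        if rel == relation and subj == subject:
            return obj
    return None


def find_block(facts, text):
    """Return the id of the first block whose text equals `text`, else None."""
    for rel, subj, obj in facts:
        if rel == "text" and obj == text:
            return subj
    return None


def make_program(keyword, relations):
    """Build an extractor: start at the keyword block, follow each relation
    in turn, then read the text of the block that was reached."""
    def program(facts):
        block = find_block(facts, keyword)
        for relation in relations:
            if block is None:
                return None
            block = lookup(facts, relation, block)
        return lookup(facts, "text", block) if block is not None else None
    return program


def extract_by_vote(programs, facts):
    """Run every program and return the most common non-None answer."""
    votes = Counter(v for v in (p(facts) for p in programs) if v is not None)
    return votes.most_common(1)[0][0] if votes else None


# A hand-written program set for the invoice number (assumed, not learned).
programs = [
    make_program("Invoice No", ["right_of"]),
    make_program("Date", ["right_of", "above"]),
    make_program("Invoice No", ["below"]),  # deliberately wrong path
]

print(extract_by_vote(programs, facts))
```

Two of the three programs reach the same block by different routes, so the vote outweighs the one program that follows a wrong path. This is the kind of robustness a program set is meant to provide over a single extraction program.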