One-shot Key Information Extraction from Document with Deep Partial Graph Matching

Automating Key Information Extraction (KIE) from documents improves efficiency, productivity, and security in many industrial scenarios such as rapid indexing and archiving. Many existing supervised learning methods for the KIE task require a large number of labeled samples and learn separate models for different types of documents. However, collecting and labeling a large dataset is time-consuming and is not a user-friendly requirement for many cloud platforms. To overcome these challenges, we propose a deep end-to-end trainable network for one-shot KIE using partial graph matching. Unlike previous methods, in which similarity learning and solving are optimized separately, our method learns the two processes in a single end-to-end framework. Existing one-shot KIE methods are either template-based or simple attention-based learning approaches, and they struggle to handle text that printers have shifted away from its expected position, as illustrated in Fig. 1. To solve this problem, we add a one-to-(at most)-one constraint so that the globally optimal solution is found even if some text has drifted. Further, we design a multimodal context ensemble block that boosts performance by fusing features of spatial, textual, and aspect representations. To promote research on KIE, we collected and annotated a one-shot document KIE dataset named DKIE with diverse types of images. The DKIE dataset consists of 2.5K document images captured by mobile phones in natural scenes, and it is the largest available one-shot KIE dataset to date. Experimental results on DKIE show that our method achieves state-of-the-art performance compared with recent one-shot and supervised learning approaches. The dataset and the proposed one-shot KIE model will be released soon.
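
The one-to-(at most)-one constraint mentioned in the abstract can be illustrated with a toy example. The sketch below is not the paper's network: it assumes a similarity matrix between template fields and detected text boxes is already given, and it uses a brute-force search that is only practical for a handful of fields. The function name `partial_match`, the `min_score` threshold, and the similarity values are all hypothetical.

```python
from itertools import permutations

def partial_match(similarity, min_score=0.5):
    """Toy partial matching between template fields (rows) and text boxes (columns).

    Each field is matched to at most one box and each box to at most one field.
    A pair only helps the objective when its similarity exceeds min_score
    (an assumed threshold), so weak fields are left unmatched (None).
    """
    n_fields = len(similarity)
    n_boxes = len(similarity[0]) if similarity else 0
    # One "unmatched" slot (None) per field, so any field may stay unassigned.
    slots = list(range(n_boxes)) + [None] * n_fields
    best_total, best = float("-inf"), {}
    # Exhaustive search over assignments; fine for a toy, not for real documents.
    for perm in permutations(slots, n_fields):
        total = sum(similarity[f][b] - min_score
                    for f, b in enumerate(perm) if b is not None)
        if total > best_total:
            best_total, best = total, dict(enumerate(perm))
    return best

# Rows: template fields, columns: detected text boxes (made-up values).
similarity = [
    [0.90, 0.80, 0.10],  # field 0 resembles boxes 0 and 1
    [0.85, 0.30, 0.20],  # field 1 only resembles box 0
    [0.10, 0.20, 0.30],  # field 2 has no good candidate
]
print(partial_match(similarity))
```

Picking the best box for each field independently would send both field 0 and field 1 to box 0. Under the shared constraint, field 0 takes box 1, field 1 takes box 0, and field 2 stays unmatched.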

Information Extraction

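Information Extraction (IE) structures the information contained in text into a table-like form. The input to an IE system is raw text, and the output is a set of information items in a fixed format. These items are extracted from many kinds of documents and then integrated into a uniform representation, which is the main task of information extraction. The benefit of a uniform representation is that the information becomes easy to check and compare. IE does not attempt to fully understand an entire document; it only analyzes the parts of the document that contain relevant information. Which information counts as relevant depends on the domain scope fixed when the system is designed.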
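
As a minimal sketch of the raw-text-in, fixed-format-out idea described above, the following uses hand-written regular expressions. The field names, the patterns, and the sample text are hypothetical, chosen only for illustration; real IE systems are considerably more involved.

```python
import re

# Hypothetical target fields and patterns, fixed at "system design" time.
FIELD_PATTERNS = {
    "invoice_no": re.compile(r"Invoice\s*(?:No\.?|#)\s*:?\s*(\w+)", re.I),
    "date": re.compile(r"(\d{4}-\d{2}-\d{2})"),
    "total": re.compile(r"Total\s*:?\s*\$?(\d+(?:\.\d{2})?)", re.I),
}

def extract_record(text):
    """Map raw text to a fixed-format record; fields not found become None."""
    record = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        record[name] = match.group(1) if match else None
    return record

sample = "Invoice No: A1234\nIssued 2021-06-14 by ACME\nTotal: $58.90"
print(extract_record(sample))
```

Every document yields a record with the same keys, which is what makes the results easy to check and compare, and only the parts of the text that match a pattern are analyzed.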
