Documents are central to many business systems, and include forms, reports, contracts, invoices, and purchase orders. The information in documents is typically in natural language, but can be organized in various layouts and formats. There has been a recent spurt of interest in understanding document content with novel deep learning architectures. However, document understanding tasks need dense information annotations, which are costly to scale and generalize. Several active learning techniques have been proposed to reduce the overall annotation budget while maintaining the performance of the underlying deep learning model. Most of these techniques, however, work only for classification problems; content detection is a more complex task and has been scarcely explored in the active learning literature. In this paper, we propose \textit{OPAD}, a novel framework that uses a reinforcement policy for active learning in content detection tasks for documents. The proposed framework learns the acquisition function that decides which samples to select, while optimizing the performance metrics that these tasks typically use. Furthermore, we extend the framework to weak labelling scenarios to reduce the cost of annotation further. We propose novel rewards that account for class imbalance and for user feedback in the annotation interface, to improve the active learning method. We show superior performance of the proposed \textit{OPAD} framework for active learning on various document understanding tasks, such as layout parsing, object detection, and named entity recognition. Ablation studies for the human feedback and class imbalance rewards are presented, along with a comparison of annotation times for different approaches.