Successful Artificial Intelligence systems often require large amounts of labeled data to extract information from document images. In this paper, we investigate how to improve the performance of such systems in understanding document images, especially when training data is limited. We address the problem by proposing a novel fine-tuning method based on reinforcement learning. Our approach treats the information extraction model as a policy network and uses policy gradient training to update the model so that it maximizes combined reward functions that complement the traditional cross-entropy loss. Our experiments on four datasets, using labels and expert feedback, demonstrate that our fine-tuning mechanism consistently improves the performance of a state-of-the-art information extractor, especially in the small training data regime.
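To make the idea of complementing cross-entropy with a policy gradient term concrete, the following is a minimal sketch for a single token-classification decision. It is not the paper's implementation: the abstract does not specify the reward functions, the weighting, or the model, so the REINFORCE-style update, the `rl_weight` and `baseline` parameters, the 0/1 reward, and all function names here are illustrative assumptions.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy_grad(logits, gold):
    """Gradient of -log p(gold) with respect to the logits."""
    probs = softmax(logits)
    return [p - (1.0 if i == gold else 0.0) for i, p in enumerate(probs)]


def policy_gradient_grad(logits, action, reward, baseline=0.0):
    """Gradient of the REINFORCE loss -(reward - baseline) * log p(action)."""
    probs = softmax(logits)
    advantage = reward - baseline
    return [advantage * (p - (1.0 if i == action else 0.0))
            for i, p in enumerate(probs)]


def combined_grad(logits, gold, action, reward, baseline=0.0, rl_weight=0.5):
    """Cross-entropy gradient plus a weighted policy gradient term.

    rl_weight is a hypothetical mixing coefficient, not a value from the paper.
    """
    ce = cross_entropy_grad(logits, gold)
    pg = policy_gradient_grad(logits, action, reward, baseline)
    return [c + rl_weight * g for c, g in zip(ce, pg)]


def finetune_step(logits, gold, reward_fn, rng, lr=0.1, baseline=0.0, rl_weight=0.5):
    """Sample a label from the policy, score it, and take one gradient step."""
    probs = softmax(logits)
    action = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    reward = reward_fn(action)
    grad = combined_grad(logits, gold, action, reward, baseline, rl_weight)
    return [x - lr * g for x, g in zip(logits, grad)]


# Toy usage: three candidate labels, label 2 is correct, and the assumed
# reward is 1 for a correct sampled label and 0 otherwise.
rng = random.Random(0)
logits = [0.0, 0.0, 0.0]
for _ in range(200):
    logits = finetune_step(logits, 2, lambda a: 1.0 if a == 2 else 0.0, rng)
```

In this sketch the reward function is the only place where non-label signals, such as expert feedback, would enter; replacing the 0/1 reward with a task-level score leaves the update rule unchanged.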