A table, which arranges data in rows and columns, is a highly effective data structure that is widely used in business and scientific research. Given the large volume of tabular data in online and offline documents, automatic table recognition has attracted increasing attention from the document analysis community. Although humans can easily understand the structure of a table, this remains a challenge for machines, especially because table layouts and styles vary widely. Existing methods usually model a table as either a markup sequence or an adjacency matrix between table cells, and fail to address the importance of the logical location of table cells, e.g., that a cell is located in the first row and the second column of the table. In this paper, we reformulate table structure recognition as table graph reconstruction and propose an end-to-end trainable table graph reconstruction network (TGRNet) for this task. Specifically, the proposed method has two main branches, a cell detection branch and a cell logical location branch, which jointly predict the spatial location and the logical location of each cell. Experimental results on three popular table recognition datasets and a new dataset with table graph annotations (TableGraph-350K) demonstrate the effectiveness of the proposed TGRNet for table structure recognition. Code and annotations will be made publicly available.
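To make the distinction between spatial location, logical location, and adjacency concrete, the following is a minimal sketch of a table graph whose nodes are cells. It is not the TGRNet implementation: the `Cell` class, its field names, the 0-based row and column indices, and the helper `adjacency_from_logical` are all assumptions introduced here for illustration. The sketch shows that same-row and same-column adjacency can be derived from logical locations, whereas an adjacency matrix alone does not state which row or column a cell occupies.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Cell:
    """One node of a hypothetical table graph (names are illustrative)."""
    bbox: Tuple[float, float, float, float]  # spatial location: (x_min, y_min, x_max, y_max)
    row_start: int  # logical location, 0-based and inclusive (an assumption)
    row_end: int
    col_start: int
    col_end: int


def spans_overlap(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    """True if two inclusive index ranges share at least one index."""
    return a_start <= b_end and b_start <= a_end


def adjacency_from_logical(cells: List[Cell]):
    """Derive same-row and same-column adjacency matrices from logical locations."""
    n = len(cells)
    same_row = [[False] * n for _ in range(n)]
    same_col = [[False] * n for _ in range(n)]
    for i, a in enumerate(cells):
        for j, b in enumerate(cells):
            if i == j:
                continue  # a cell is not treated as adjacent to itself
            same_row[i][j] = spans_overlap(a.row_start, a.row_end, b.row_start, b.row_end)
            same_col[i][j] = spans_overlap(a.col_start, a.col_end, b.col_start, b.col_end)
    return same_row, same_col


# A toy table: one header cell spanning two columns, with two cells beneath it.
cells = [
    Cell(bbox=(0, 0, 200, 20), row_start=0, row_end=0, col_start=0, col_end=1),
    Cell(bbox=(0, 20, 100, 40), row_start=1, row_end=1, col_start=0, col_end=0),
    Cell(bbox=(100, 20, 200, 40), row_start=1, row_end=1, col_start=1, col_end=1),
]
same_row, same_col = adjacency_from_logical(cells)
```

In this toy table, the two lower cells share a row, and the spanning header shares a column with each of them. The reverse derivation, from adjacency back to row and column indices, would need extra steps such as ordering cells by their bounding boxes.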