Table structure recognition is an essential part of making machines understand tables. Its main task is to recognize the internal structure of a table. However, because tables vary so widely in structure and style, it is very difficult to parse tabular data into a structured format that machines can easily understand, especially for complex tables. In this paper, we introduce Split, Embed and Merge (SEM), an accurate table structure recognizer. Our model takes table images as input and correctly recognizes the structure of both simple and complex tables. SEM is composed of three main parts: a splitter, an embedder, and a merger. In the first stage, we apply the splitter to predict the potential regions of the table row (column) separators and obtain the fine grid structure of the table. In the second stage, to take full account of the textual information in the table, we fuse the output features of each table grid from both the vision and language modalities. Adding these semantic features yields higher precision in our experiments. Finally, we merge these basic table grids in an autoregressive manner, where the corresponding merging results are learned through the attention mechanism. In our experiments, SEM achieves an average F1-Measure of 97.11% on the SciTSR dataset, outperforming other methods by a large margin. We also won first place on complex tables and third place on all tables in the ICDAR 2021 Competition on Scientific Literature Parsing, Task-B. Extensive experiments on other publicly available datasets demonstrate that our model achieves state-of-the-art performance.
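To make the split and merge stages concrete, the following is a minimal sketch of the surrounding geometry only, not the SEM model itself. It assumes the splitter's output has already been reduced to 1D binary masks marking separator rows and columns (with separators at the table borders), and that the merger's output is available as groups of grid indices to combine. The function names `separator_cuts`, `build_grid`, and `merge_cells` are hypothetical and do not come from the paper.

```python
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (top, bottom, left, right)


def separator_cuts(mask: Sequence[int]) -> List[int]:
    """Return the midpoint index of each run of 1s in a 1D separator mask."""
    cuts: List[int] = []
    start = None
    for i, value in enumerate(mask):
        if value and start is None:
            start = i  # a separator run begins
        elif not value and start is not None:
            cuts.append((start + i - 1) // 2)  # run ended at i - 1
            start = None
    if start is not None:  # run reaches the end of the mask
        cuts.append((start + len(mask) - 1) // 2)
    return cuts


def build_grid(row_mask: Sequence[int], col_mask: Sequence[int]) -> List[List[Box]]:
    """Build the fine grid of cells bounded by consecutive separator cuts."""
    ys = separator_cuts(row_mask)
    xs = separator_cuts(col_mask)
    return [
        [(ys[r], ys[r + 1], xs[c], xs[c + 1]) for c in range(len(xs) - 1)]
        for r in range(len(ys) - 1)
    ]


def merge_cells(grid: List[List[Box]], groups: List[List[Tuple[int, int]]]) -> List[Box]:
    """Merge each group of (row, col) grid indices into one spanning box."""
    merged: List[Box] = []
    for group in groups:
        boxes = [grid[r][c] for r, c in group]
        merged.append((
            min(b[0] for b in boxes),
            max(b[1] for b in boxes),
            min(b[2] for b in boxes),
            max(b[3] for b in boxes),
        ))
    return merged


# Toy 2x2 grid whose top row is merged into one spanning header cell.
row_mask = [1, 0, 0, 1, 1, 0, 0, 1]
col_mask = [1, 0, 0, 0, 1, 0, 0, 0, 1]
grid = build_grid(row_mask, col_mask)
header = merge_cells(grid, [[(0, 0), (0, 1)]])
```

In SEM the separator regions and the merge decisions are predicted by learned networks; here they are hand-written inputs so that the grid construction and the cell merging can be followed in isolation.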