This paper presents our solution for the ICDAR 2021 Competition on Scientific Literature Parsing, Task B: Table Recognition to HTML. In our method, we divide the table content recognition task into four sub-tasks: table structure recognition, text line detection, text line recognition, and box assignment. Our table structure recognition algorithm is adapted from MASTER [1], a robust image text recognition algorithm. PSENet [2] is used to detect each text line in the table image. Our text line recognition model is also built on MASTER. Finally, in the box assignment phase, we associate the text boxes detected by PSENet with the structure items reconstructed by table structure prediction, and fill the recognized content of each text line into the corresponding item. Our proposed method achieves a 96.84% TEDS score on 9,115 validation samples in the development phase, and a 96.32% TEDS score on 9,064 samples in the final evaluation phase.
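The box assignment step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the matching rule, so the rule used here is an assumption: a text line goes to the structure cell that contains the line's center point, with a fallback to the cell whose center is nearest. The names `Cell`, `assign_boxes`, and the `(x_min, y_min, x_max, y_max)` box layout are hypothetical and are not taken from the authors' code.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in image pixels.
Box = Tuple[float, float, float, float]


@dataclass
class Cell:
    """A structure item (table cell) predicted by the structure model."""
    box: Box
    texts: List[str] = field(default_factory=list)


def center(box: Box) -> Tuple[float, float]:
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2.0, (y0 + y1) / 2.0)


def contains(box: Box, point: Tuple[float, float]) -> bool:
    x0, y0, x1, y1 = box
    px, py = point
    return x0 <= px <= x1 and y0 <= py <= y1


def sq_dist(a: Tuple[float, float], b: Tuple[float, float]) -> float:
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def assign_boxes(cells: List[Cell], lines: List[Tuple[Box, str]]) -> List[Cell]:
    """Attach each recognized text line to one cell (assumed matching rule)."""
    if not cells:
        return cells
    # Visit lines top-to-bottom, then left-to-right, so that multi-line
    # cells collect their text in reading order.
    for box, text in sorted(lines, key=lambda item: (item[0][1], item[0][0])):
        c = center(box)
        hits = [cell for cell in cells if contains(cell.box, c)]
        if hits:
            target = hits[0]
        else:
            # Fallback: no cell contains the center, so pick the nearest cell.
            target = min(cells, key=lambda cell: sq_dist(center(cell.box), c))
        target.texts.append(text)
    return cells


cells = [Cell((0, 0, 50, 20)), Cell((50, 0, 100, 20))]
lines = [
    ((55, 2, 90, 10), "96.84"),
    ((5, 2, 40, 10), "TEDS"),
    ((55, 11, 90, 19), "%"),
    ((110, 2, 130, 10), "score"),  # lies outside every cell
]
result = assign_boxes(cells, lines)
```

In this toy input, the last line falls outside both cells and is attached to the nearer one by the fallback. A production system would likely combine several criteria (for example overlap ratio as well as distance), which this sketch leaves out.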