TSRFormer: Table Structure Recognition with Transformers


August 9, 2022


Weihong Lin, Zheng Sun, Chixiang Ma, Mingze Li, Jiawei Wang, Lei Sun, Qiang Huo

From arXiv; accepted by ACM Multimedia 2022

We present a new table structure recognition (TSR) approach, called TSRFormer, to robustly recognize the structures of complex tables with geometrical distortions from various table images. Unlike previous methods, we formulate table separation line prediction as a line regression problem instead of an image segmentation problem and propose a new two-stage DETR-based separator prediction approach, dubbed Separator REgression TRansformer (SepRETR), to predict separation lines from table images directly. To make the two-stage DETR framework work efficiently and effectively for the separation line prediction task, we propose two improvements: 1) a prior-enhanced matching strategy to solve the slow convergence issue of DETR; 2) a new cross-attention module that samples features directly from a high-resolution convolutional feature map, so that high localization accuracy is achieved at low computational cost. After separation line prediction, a simple relation-network-based cell merging module is used to recover spanning cells. With these new techniques, our TSRFormer achieves state-of-the-art performance on several benchmark datasets, including SciTSR, PubTabNet and WTW. Furthermore, we have validated the robustness of our approach to tables with complex structures, borderless cells, large blank spaces, empty or spanning cells, as well as distorted or even curved shapes, on a more challenging real-world in-house dataset.
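The last step of the pipeline described above, recovering spanning cells once the separation lines have split the table into a grid, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract only states that a relation-network-based module decides which cells to merge. The sketch assumes that such a module has already produced pairwise merge decisions, and the names `merge_cells` and `merge_pairs` are hypothetical. It shows one simple way to turn those decisions into spanning cells using union-find.

```python
from collections import defaultdict


def merge_cells(n_rows, n_cols, merge_pairs):
    """Group grid cells into spanning cells with union-find.

    merge_pairs is a hypothetical list of ((r1, c1), (r2, c2)) decisions,
    each saying that two grid cells belong to the same spanning cell.
    Returns sorted inclusive boxes (row_start, col_start, row_end, col_end).
    """
    parent = {(r, c): (r, c) for r in range(n_rows) for c in range(n_cols)}

    def find(cell):
        # Follow parent links to the representative, halving the path on the way.
        while parent[cell] != cell:
            parent[cell] = parent[parent[cell]]
            cell = parent[cell]
        return cell

    for a, b in merge_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Collect the members of every connected group of grid cells.
    groups = defaultdict(list)
    for cell in parent:
        groups[find(cell)].append(cell)

    # Describe each group by its bounding box in grid coordinates.
    boxes = []
    for cells in groups.values():
        rows = [r for r, _ in cells]
        cols = [c for _, c in cells]
        boxes.append((min(rows), min(cols), max(rows), max(cols)))
    return sorted(boxes)


# A 2x3 grid whose first two header cells form one spanning cell.
cells = merge_cells(2, 3, [((0, 0), (0, 1))])
print(cells)
```

Union-find keeps the merging transitive: if cell A is merged with B and B with C, all three end up in one spanning cell even when no direct A-C decision exists.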

