Fourier Document Restoration for Robust Document Dewarping and Recognition

March 18, 2022

Chuhui Xue, Zichen Tian, Fangneng Zhan, Shijian Lu, Song Bai

From arXiv; accepted by CVPR 2022

State-of-the-art document dewarping techniques learn to predict 3-dimensional information of documents, which makes them prone to errors when dealing with documents with irregular distortions or large variations in depth. This paper presents FDRNet, a Fourier Document Restoration Network that can restore documents with different distortions and improve document recognition in a reliable and simpler manner. FDRNet focuses on high-frequency components in the Fourier space, which capture most structural information but are largely free of degradation in appearance. It dewarps documents with a flexible Thin-Plate Spline transformation, which can handle various deformations effectively without requiring deformation annotations in training. These features allow FDRNet to learn from a small amount of simply labeled training images, and the learned model can dewarp documents with complex geometric distortion and recognize the restored texts accurately. To facilitate document restoration research, we create a benchmark dataset consisting of over one thousand camera documents with different types of geometric and photometric distortion. Extensive experiments show that FDRNet outperforms the state-of-the-art by large margins on both dewarping and text recognition tasks. In addition, FDRNet requires only a small amount of simply labeled training data and is easy to deploy.
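
The abstract does not say how the high-frequency components are extracted, so the following is only a rough illustration of the general idea, not the authors' implementation. It assumes a 2-D grayscale image held in a NumPy array and applies a hard circular high-pass mask in the Fourier domain. The function name `high_frequency_component` and the `cutoff` parameter are hypothetical and chosen for this sketch.

```python
import numpy as np


def high_frequency_component(image: np.ndarray, cutoff: float = 0.1) -> np.ndarray:
    """Return `image` with its low spatial frequencies removed.

    `cutoff` (an assumption of this sketch) is the radius of the suppressed
    low-frequency disc, in cycles per pixel. Frequencies with a radius at or
    below `cutoff`, including the constant (DC) term, are set to zero.
    """
    if image.ndim != 2:
        raise ValueError("expected a 2-D grayscale image")

    height, width = image.shape
    spectrum = np.fft.fft2(image)

    # Per-axis frequencies in cycles per pixel, in the same layout as fft2.
    fy = np.fft.fftfreq(height)[:, None]
    fx = np.fft.fftfreq(width)[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)

    # Keep only the frequencies outside the low-frequency disc.
    mask = radius > cutoff

    # The mask is symmetric, so the inverse transform is real up to rounding.
    return np.fft.ifft2(spectrum * mask).real
```

On a scanned page, smooth shading and illumination changes sit mostly in the low frequencies that this mask discards, while text strokes and page edges survive, which matches the abstract's remark that high frequencies carry structure but little appearance degradation. A learned or smoother filter could replace the hard mask; that choice is outside what the abstract states.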

