GitHub Project Recommendation | Scene Text Image Transformer, a Scene Text Image Augmentation Tool


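December 11, 2018 · AI研习社

Scene Text Image Transformer is a tool for scene text data augmentation. The tool we provide helps avoid overfitting and makes models more robust.

At present we focus on the shape of cropped scene text images. The next version, covering detection and recognition tasks, will be released later.

Project URL: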

https://github.com/Canjie-Luo/Scene-Text-Image-Transformer

Requirements

GCC 4.8.*
Python 2.7.*
Boost 1.67
OpenCV 2.4.*

We recommend using Anaconda to manage your dependencies. For example:

conda install boost=1.67.0

Installation

Create a build directory and compile the library:

mkdir build
cd build
cmake -D CUDA_USE_STATIC_CUDA_RUNTIME=OFF ..
make

Copy Augment.so to the target folder, then use the tool by following the example in demo.py.

cp Augment.so ..
cd ..
python demo.py

Demo

Distortion

Stretch

Perspective

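Speed

Transforming an image of size (H: 64, W: 200) takes less than 3 ms on a 2.0 GHz CPU. The process can be sped up further by running the batch sampler in multiple worker processes, for example by setting "num_workers" in PyTorch.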
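To put the timing figure in perspective, the sketch below converts a per-image cost into an approximate throughput. It assumes perfect linear scaling across worker processes, which real data loaders only approach. The function name `images_per_second` and the worker counts are illustrative and not part of the tool.

```python
def images_per_second(ms_per_image, workers=1):
    """Idealized throughput in images per second.

    Assumes every worker runs independently and that scaling across
    workers is perfectly linear (an assumption, not a measurement).
    """
    if ms_per_image <= 0:
        raise ValueError("ms_per_image must be positive")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return workers * 1000.0 / ms_per_image


if __name__ == "__main__":
    # 3 ms is the upper bound quoted above, so these estimates are conservative.
    for workers in (1, 4, 8):
        rate = images_per_second(3.0, workers)
        print("workers=%d: about %.0f images/s" % (workers, rate))
```

In PyTorch, `workers` would correspond to the `num_workers` argument of `DataLoader`.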

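Improvement on Recognition

We compared the accuracy of CRNN models trained only on the corresponding small training sets.

Dataset	IIIT5K	IC13	IC15
Without data augmentation	40.8%	6.8%	8.7%
With data augmentation	53.4%	9.6%	24.9%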
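The gains in the table can be read off as absolute percentage-point differences. The short calculation below uses only the numbers reported above; the variable names are illustrative.

```python
# Accuracy (%) reported in the table above, keyed by dataset.
without_aug = {"IIIT5K": 40.8, "IC13": 6.8, "IC15": 8.7}
with_aug = {"IIIT5K": 53.4, "IC13": 9.6, "IC15": 24.9}

# Absolute gain in percentage points, rounded to one decimal place.
gains = {
    name: round(with_aug[name] - without_aug[name], 1)
    for name in without_aug
}

for name in ("IIIT5K", "IC13", "IC15"):
    print("%s: +%.1f points" % (name, gains[name]))
```

The largest absolute gain is on IC15 (16.2 points), followed by IIIT5K (12.6 points) and IC13 (2.8 points).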

Citation

@inproceedings{schaefer2006image,
  title={Image deformation using moving least squares},
  author={Schaefer, Scott and McPhail, Travis and Warren, Joe},
  booktitle={ACM transactions on graphics (TOG)},
  volume={25},
  number={3},
  pages={533--540},
  year={2006},
  organization={ACM}
}
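
The paper cited above describes image deformation with moving least squares (MLS). As a rough illustration of the idea, the sketch below applies the affine MLS formula to a single point, given matching source and destination control points. It is a minimal pure-Python sketch written from the paper's formulation, not the code used by this tool. The function name `mls_affine`, the `alpha` default, and the sample control points are assumptions chosen for illustration.

```python
def mls_affine(v, src, dst, alpha=1.0):
    """Map point v using affine moving least squares.

    src and dst are equal-length lists of (x, y) control points; at least
    three non-collinear source points are needed. Hypothetical helper.
    """
    vx, vy = v
    weights = []
    for (px, py), q in zip(src, dst):
        d2 = (px - vx) ** 2 + (py - vy) ** 2
        if d2 == 0.0:
            # v coincides with a control point: the map interpolates it.
            return (float(q[0]), float(q[1]))
        weights.append(1.0 / d2 ** alpha)

    # Weighted centroids p* and q*.
    wsum = sum(weights)
    psx = sum(w * p[0] for w, p in zip(weights, src)) / wsum
    psy = sum(w * p[1] for w, p in zip(weights, src)) / wsum
    qsx = sum(w * q[0] for w, q in zip(weights, dst)) / wsum
    qsy = sum(w * q[1] for w, q in zip(weights, dst)) / wsum

    # A = sum w * p_hat^T p_hat (symmetric), B = sum w * p_hat^T q_hat.
    a11 = a12 = a22 = 0.0
    b11 = b12 = b21 = b22 = 0.0
    for w, p, q in zip(weights, src, dst):
        phx, phy = p[0] - psx, p[1] - psy
        qhx, qhy = q[0] - qsx, q[1] - qsy
        a11 += w * phx * phx
        a12 += w * phx * phy
        a22 += w * phy * phy
        b11 += w * phx * qhx
        b12 += w * phx * qhy
        b21 += w * phy * qhx
        b22 += w * phy * qhy

    det = a11 * a22 - a12 * a12
    if det == 0.0:
        raise ValueError("control points are degenerate (collinear?)")

    # M = A^{-1} B, using the closed-form inverse of a 2x2 matrix.
    i11, i12, i22 = a22 / det, -a12 / det, a11 / det
    m11 = i11 * b11 + i12 * b21
    m12 = i11 * b12 + i12 * b22
    m21 = i12 * b11 + i22 * b21
    m22 = i12 * b12 + i22 * b22

    # f(v) = (v - p*) M + q*, with points treated as row vectors.
    dx, dy = vx - psx, vy - psy
    return (dx * m11 + dy * m21 + qsx, dx * m12 + dy * m22 + qsy)


if __name__ == "__main__":
    # Corners of a 200 x 64 image, with the top-right corner pulled inward.
    corners = [(0.0, 0.0), (200.0, 0.0), (200.0, 64.0), (0.0, 64.0)]
    moved = [(0.0, 0.0), (180.0, 10.0), (200.0, 64.0), (0.0, 64.0)]
    print(mls_affine((100.0, 32.0), corners, moved))
```

Warping a whole image would apply such a mapping to every pixel (or to a coarse grid followed by interpolation); that step is omitted here.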

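Acknowledgment

The tool is a combination of @cxcxcxcx's imgwarp-opencv and @Yati Sagade's opencv-ndarray-conversion. Thanks for your contributions.

Main code contributor: Canjie-Luo, from SCUT DLVC-Lab (Deep Learning and Vision Computing Lab, South China University of Technology)

Note

The tool is for academic research purposes only.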
