Efficient OCR for Building a Diverse Digital History - 专知 (Zhuanzhi) Papers


OCR · Construction · Visual features · Vision models · Image retrieval

April 5, 2023

Efficient OCR for Building a Diverse Digital History


Jacob Carlson, Tom Bryan, Melissa Dell

Thousands of users consult digital archives daily, but the information they can access is unrepresentative of the diversity of documentary history. The sequence-to-sequence architecture typically used for optical character recognition (OCR), which jointly learns a vision model and a language model, is poorly extensible to low-resource document collections, because learning a language-vision model requires extensive labeled sequences and compute. This study models OCR as a character-level image retrieval problem, using a contrastively trained vision encoder. Because the model only learns characters' visual features, it is more sample-efficient and extensible than existing architectures, enabling accurate OCR in settings where existing solutions fail. Crucially, the model opens new avenues for community engagement in making digital history more representative of documentary history.

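The idea of treating OCR as character-level image retrieval can be illustrated with a small sketch. This is not the authors' implementation: the `embed` function below is a hypothetical stand-in (flatten, center, L2-normalize) for the contrastively trained vision encoder, and the 8x8 random "glyphs", the labels, and the `recognize` helper are all assumptions made for illustration. The sketch only shows the retrieval step: each character crop is embedded and matched to its nearest reference embedding by cosine similarity.

```python
import numpy as np


def embed(crops: np.ndarray) -> np.ndarray:
    """Placeholder encoder (an assumption, not the paper's model).

    Flattens each crop, removes its mean, and L2-normalizes it, so that a
    dot product between two embeddings equals their cosine similarity.
    A contrastively trained vision encoder would take this function's place.
    """
    flat = crops.reshape(len(crops), -1).astype(np.float64)
    flat = flat - flat.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    return flat / np.maximum(norms, 1e-12)


def recognize(crops: np.ndarray, index: np.ndarray, labels: list[str]) -> str:
    """Label each character crop with its nearest reference character."""
    sims = embed(crops) @ index.T      # cosine similarity to every reference
    nearest = sims.argmax(axis=1)      # best-matching reference per crop
    return "".join(labels[i] for i in nearest)


# Toy reference set: one synthetic 8x8 "glyph" per character class.
rng = np.random.default_rng(0)
labels = ["a", "b", "c"]
references = rng.random((3, 8, 8))
index = embed(references)

# Query crops: lightly noised copies of the glyphs for "c", "a", "b".
queries = references[[2, 0, 1]] + 0.05 * rng.standard_normal((3, 8, 8))
text = recognize(queries, index, labels)

# Adding a new character only requires appending one embedding to the index;
# in this sketch nothing is retrained.
new_glyph = rng.random((1, 8, 8))
index = np.vstack([index, embed(new_glyph)])
labels.append("d")
new_text = recognize(new_glyph, index, labels)
```

In this toy setup, supporting a new character means adding a row to the index, which mirrors the extensibility argument made in the abstract. How well that holds in practice depends on the quality of the real encoder.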
