Marginalia and machine learning: Handwritten text recognition for Marginalia Collections


March 10, 2023


Adam Axelsson, Liang Cheng, Jonas Frankemölle, Ekta Vats

From arXiv. Work in progress.

The pressing need for digitization of historical document collections has led to strong interest in computerised image processing methods for automatic handwritten text recognition (HTR). Handwritten text varies widely across writing styles, languages, and scripts. Because sufficient amounts of annotated multi-writer text are unavailable, training an accurate and robust HTR system calls for data-efficient approaches. This case study presents the ongoing project "Marginalia and Machine Learning", which focuses on the automatic detection and recognition of handwritten marginalia, i.e., text written in the margins or handwritten notes. A Faster R-CNN network is used to detect marginalia, and AttentionHTR is used for word recognition. The data comes from early printed book collections with handwritten marginalia, held in the Uppsala University Library. Source code and pretrained models are available at https://github.com/ektavats/Project-Marginalia.
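The two-stage design described in the abstract (detect marginalia regions, then recognize the text in each region) can be sketched in a few lines. This is only an illustrative outline, not the authors' implementation, which lives in the linked repository. The names `Detection`, `crop`, `read_marginalia`, `fake_detect`, and `fake_recognize` are hypothetical, the `min_score` threshold and the reading-order sort are assumptions, and the detector and recognizer are passed in as plain callables so that a Faster R-CNN model and AttentionHTR could stand behind them.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# A page image as rows of pixel values; a stand-in for a real image array.
Image = List[List[int]]
# (left, top, right, bottom), with right and bottom exclusive.
Box = Tuple[int, int, int, int]


@dataclass
class Detection:
    box: Box
    score: float  # detector confidence in [0, 1]


def crop(image: Image, box: Box) -> Image:
    """Return the sub-image covered by box."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def read_marginalia(
    image: Image,
    detect: Callable[[Image], Sequence[Detection]],
    recognize: Callable[[Image], str],
    min_score: float = 0.5,  # assumed threshold, not taken from the paper
) -> List[Tuple[Box, str]]:
    """Detect candidate regions, then recognize the text in each kept region."""
    results = []
    for det in detect(image):
        if det.score < min_score:
            continue  # drop low-confidence detections
        results.append((det.box, recognize(crop(image, det.box))))
    # Assumption: order results top-to-bottom, then left-to-right.
    results.sort(key=lambda item: (item[0][1], item[0][0]))
    return results


# Hypothetical stand-ins so the sketch runs without any trained model.
def fake_detect(image: Image) -> List[Detection]:
    return [
        Detection((0, 2, 2, 4), 0.9),
        Detection((0, 0, 2, 2), 0.8),
        Detection((2, 0, 4, 2), 0.1),
    ]


def fake_recognize(region: Image) -> str:
    return "w" + str(sum(sum(row) for row in region))


page = [[1, 1, 0, 0], [1, 1, 0, 0], [2, 2, 0, 0], [2, 2, 0, 0]]
print(read_marginalia(page, fake_detect, fake_recognize))
```

Keeping the two stages behind separate callables means either model can be swapped or retrained without touching the other, which is one plausible reason to structure such a pipeline this way.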

