ST-KeyS: 历史手写文件中关键词斑点自控变换器</s> (ST-KeyS: Self-Supervised Transformer for Keyword Spotting in Historical Handwritten Documents) - 专知论文

会员服务 ·

0

变换 · 表示 · Learning · MoDELS · state-of-the-art ·

2023 年 3 月 6 日

ST-KeyS: Self-Supervised Transformer for Keyword Spotting in Historical Handwritten Documents

翻译：ST-KeyS: 历史手写文件中关键词斑点自控变换器

Sana Khamekhem Jemni,Sourour Ammar,Mohamed Ali Souibgui,Yousri Kessentini,Abbas Cheddad

Keyword spotting (KWS) in historical documents is an important tool for the initial exploration of digitized collections. Nowadays, the most efficient KWS methods are relying on machine learning techniques that require a large amount of annotated training data. However, in the case of historical manuscripts, there is a lack of annotated corpus for training. To handle the data scarcity issue, we investigate the merits of the self-supervised learning to extract useful representations of the input data without relying on human annotations and then using these representations in the downstream task. We propose ST-KeyS, a masked auto-encoder model based on vision transformers where the pretraining stage is based on the mask-and-predict paradigm, without the need of labeled data. In the fine-tuning stage, the pre-trained encoder is integrated into a siamese neural network model that is fine-tuned to improve feature embedding from the input images. We further improve the image representation using pyramidal histogram of characters (PHOC) embedding to create and exploit an intermediate representation of images based on text attributes. In an exhaustive experimental evaluation on three widely used benchmark datasets (Botany, Alvermann Konzilsprotokolle and George Washington), the proposed approach outperforms state-of-the-art methods trained on the same datasets.

翻译：历史文件中的关键字定位( KWS) 是首次探索数字化收藏的重要工具。如今, 最有效的 KWS 方法依赖于机器学习技术, 需要大量的附加说明的培训数据。但是, 在历史手稿方面, 缺乏附加说明的培训程序。为了处理数据稀缺问题, 我们调查自监督学习的优点, 以便在不依赖人类注释的情况下获取输入数据的有用表达方式, 然后在下游任务中使用这些表达方式。我们提议ST- KeyS, 一种基于视觉变压器的蒙面自动编码模型, 其基础是预培训阶段基于遮罩和预设模式的图像, 而不需要贴标签的数据。在微调阶段, 将预培训的编码器纳入一个精密的神经网络模型, 以便改进输入图像图像嵌入的功能, 并在下游任务中使用这些图像显示方式。我们提议, ST- KeyS( PHOC) 嵌入和利用基于文本属性的图像中间显示器。在详细实验阶段评估中, 使用经过广泛训练的3项标准的Georgard- glas- Stategard 方法, 。</s>

0

相关内容

对比学习简述

专知会员服务

90+阅读 · 2021年6月29日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

【Yoshua Bengio新论文】多任务自监督学习语音识别，MULTI-TASK SELF-SUPERVISED LEARNING FOR ROBUST SPEECH RECOGNITION

【Yoshua Bengio新论文】多任务自监督学习语音识别，MULTI-TASK SELF-SUPERVISED LEARNING FOR ROBUST SPEECH RECOGNITION

专知会员服务

39+阅读 · 2020年1月30日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

【泡泡汇总】CVPR2019 SLAM Paperlist

【泡泡汇总】CVPR2019 SLAM Paperlist

泡泡机器人SLAM

14+阅读 · 2019年6月12日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

TorchSeg：基于pytorch的语义分割算法开源了

TorchSeg：基于pytorch的语义分割算法开源了

极市平台

20+阅读 · 2019年1月28日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【论文推荐】最新七篇图像分割相关论文—Attention U-Net、对抗结构匹配损失、卷积CRFs、对抗样本、弱监督分割

【论文推荐】最新七篇图像分割相关论文—Attention U-Net、对抗结构匹配损失、卷积CRFs、对抗样本、弱监督分割

专知

19+阅读 · 2018年5月31日

【论文推荐】最新六篇序列推荐相关论文—卷积序列嵌入学习、用户记忆网络、上下文GRU、迁移学习

【论文推荐】最新六篇序列推荐相关论文—卷积序列嵌入学习、用户记忆网络、上下文GRU、迁移学习

专知

10+阅读 · 2018年4月8日

重组人磷脂酶D2干预哮喘中Treg特征的研究

国家自然科学基金

0+阅读 · 2016年12月31日

BMP2-BMSCs对胶质瘤干细胞增殖和分化的调控作用及机制

国家自然科学基金

0+阅读 · 2015年12月31日

图像分割中若干图论问题的研究

国家自然科学基金

0+阅读 · 2014年12月31日

抑制Hedgehog信号通路的植物C21甾体化合物的构效关系、结构优化及抗肿瘤作用研究

国家自然科学基金

0+阅读 · 2014年12月31日

趋化因子受体CXCR4和“同胞”CXCR7在缺氧致视网膜新生血管中的作用及机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

异常糖基化修饰在胆管癌中的病理意义及其早期诊断价值研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于聚吡咯载体的Ag/AgCl表面等离子体可见光催化剂的制备及催化性能研究

国家自然科学基金

0+阅读 · 2012年12月31日

LIMK1：罗格列酮抑制人胃癌细胞增殖、迁移及侵袭的作用靶点

国家自然科学基金

0+阅读 · 2012年12月31日

基于氯通道差异性表达的Disulfiram-Cu靶向抗肿瘤作用及其机制

国家自然科学基金

0+阅读 · 2012年12月31日

基于Sparse-Land模型的SAR图像噪声抑制与分割

国家自然科学基金

0+阅读 · 2009年12月31日

Tissue Classification During Needle Insertion Using Self-Supervised Contrastive Learning and Optical Coherence Tomography

Arxiv

0+阅读 · 2023年4月26日

Self-Supervised Multi-Modal Sequential Recommendation

Arxiv

0+阅读 · 2023年4月26日

Lightweight Toxicity Detection in Spoken Language: A Transformer-based Approach for Edge Devices

Arxiv

0+阅读 · 2023年4月22日

EDTER: Edge Detection with Transformer

Arxiv

11+阅读 · 2022年3月16日

Dense Contrastive Learning for Self-Supervised Visual Pre-Training

Arxiv

18+阅读 · 2021年4月4日

Contrastive learning of global and local features for medical image segmentation with limited annotations

Arxiv

19+阅读 · 2020年6月18日

A Simple Framework for Contrastive Learning of Visual Representations

Arxiv

21+阅读 · 2020年2月13日

Reverse Attention for Salient Object Detection

Arxiv

11+阅读 · 2019年4月15日

Incorporating Dictionaries into Deep Neural Networks for the Chinese Clinical Named Entity Recognition

Arxiv

12+阅读 · 2018年4月13日

Weakly Supervised One-Shot Detection with Attention Siamese Networks

Arxiv

14+阅读 · 2018年1月12日

VIP会员

文章信息

相关主题

state-of-the-art

相关VIP内容

对比学习简述

专知会员服务

90+阅读 · 2021年6月29日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

【Yoshua Bengio新论文】多任务自监督学习语音识别，MULTI-TASK SELF-SUPERVISED LEARNING FOR ROBUST SPEECH RECOGNITION

【Yoshua Bengio新论文】多任务自监督学习语音识别，MULTI-TASK SELF-SUPERVISED LEARNING FOR ROBUST SPEECH RECOGNITION

专知会员服务

39+阅读 · 2020年1月30日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【博士论文】在低维与高维空间中对潜在表征的分析、建模与变换

《美军使用大语言模型技术生成领域特定文档》2025最新379页

【NeurIPS 2025】以语言为中心的全模态表征学习的可扩展性研究

智能体化多模态大语言模型综述

相关资讯

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

【泡泡汇总】CVPR2019 SLAM Paperlist

【泡泡汇总】CVPR2019 SLAM Paperlist

泡泡机器人SLAM

14+阅读 · 2019年6月12日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

TorchSeg：基于pytorch的语义分割算法开源了

TorchSeg：基于pytorch的语义分割算法开源了

极市平台

20+阅读 · 2019年1月28日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【论文推荐】最新七篇图像分割相关论文—Attention U-Net、对抗结构匹配损失、卷积CRFs、对抗样本、弱监督分割

【论文推荐】最新七篇图像分割相关论文—Attention U-Net、对抗结构匹配损失、卷积CRFs、对抗样本、弱监督分割

专知

19+阅读 · 2018年5月31日

【论文推荐】最新六篇序列推荐相关论文—卷积序列嵌入学习、用户记忆网络、上下文GRU、迁移学习

【论文推荐】最新六篇序列推荐相关论文—卷积序列嵌入学习、用户记忆网络、上下文GRU、迁移学习

专知

10+阅读 · 2018年4月8日

相关论文

Tissue Classification During Needle Insertion Using Self-Supervised Contrastive Learning and Optical Coherence Tomography

Arxiv

0+阅读 · 2023年4月26日

Self-Supervised Multi-Modal Sequential Recommendation

Arxiv

0+阅读 · 2023年4月26日

Lightweight Toxicity Detection in Spoken Language: A Transformer-based Approach for Edge Devices

Arxiv

0+阅读 · 2023年4月22日

EDTER: Edge Detection with Transformer

Arxiv

11+阅读 · 2022年3月16日

Dense Contrastive Learning for Self-Supervised Visual Pre-Training

Arxiv

18+阅读 · 2021年4月4日

Contrastive learning of global and local features for medical image segmentation with limited annotations

Arxiv

19+阅读 · 2020年6月18日

A Simple Framework for Contrastive Learning of Visual Representations

Arxiv

21+阅读 · 2020年2月13日

Reverse Attention for Salient Object Detection

Arxiv

11+阅读 · 2019年4月15日

Incorporating Dictionaries into Deep Neural Networks for the Chinese Clinical Named Entity Recognition

Arxiv

12+阅读 · 2018年4月13日

Weakly Supervised One-Shot Detection with Attention Siamese Networks

Arxiv

14+阅读 · 2018年1月12日

相关基金

重组人磷脂酶D2干预哮喘中Treg特征的研究

国家自然科学基金

0+阅读 · 2016年12月31日

BMP2-BMSCs对胶质瘤干细胞增殖和分化的调控作用及机制

国家自然科学基金

0+阅读 · 2015年12月31日

图像分割中若干图论问题的研究

国家自然科学基金

0+阅读 · 2014年12月31日

抑制Hedgehog信号通路的植物C21甾体化合物的构效关系、结构优化及抗肿瘤作用研究

国家自然科学基金

0+阅读 · 2014年12月31日

趋化因子受体CXCR4和“同胞”CXCR7在缺氧致视网膜新生血管中的作用及机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

异常糖基化修饰在胆管癌中的病理意义及其早期诊断价值研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于聚吡咯载体的Ag/AgCl表面等离子体可见光催化剂的制备及催化性能研究

国家自然科学基金

0+阅读 · 2012年12月31日

LIMK1：罗格列酮抑制人胃癌细胞增殖、迁移及侵袭的作用靶点

国家自然科学基金

0+阅读 · 2012年12月31日

基于氯通道差异性表达的Disulfiram-Cu靶向抗肿瘤作用及其机制

国家自然科学基金

0+阅读 · 2012年12月31日

基于Sparse-Land模型的SAR图像噪声抑制与分割

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员