Speech recognition · Normalization · Readability · Automatic speech recognition

April 11, 2021

NeMo Inverse Text Normalization: From Development To Production


Yang Zhang, Evelina Bakhturina, Kyle Gorman, Boris Ginsburg

Inverse text normalization (ITN) converts spoken-domain automatic speech recognition (ASR) output into written-domain text to improve its readability. Many state-of-the-art ITN systems use hand-written weighted finite-state transducer (WFST) grammars, since the task has extremely low tolerance for unrecoverable errors. We introduce an open-source Python WFST-based library for ITN that enables a seamless path from development to production. We describe the specification of ITN grammar rules for English, but the library can be adapted to other languages. It can also be used for written-to-spoken text normalization. We evaluate the NeMo ITN library using a modified version of the Google Text Normalization dataset.
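To make the spoken-to-written mapping concrete, here is a minimal sketch in plain Python that rewrites spoken cardinal numbers as digits. It is only an illustration of what ITN does: it does not use WFSTs and is not the NeMo ITN API. The vocabulary tables and the function names `words_to_number` and `inverse_normalize` are hypothetical, and the input is assumed to be lowercase ASR output without punctuation.

```python
# Toy spoken-form vocabulary (illustrative assumption, not taken from NeMo).
UNITS = {
    word: value
    for value, word in enumerate(
        [
            "zero", "one", "two", "three", "four", "five", "six", "seven",
            "eight", "nine", "ten", "eleven", "twelve", "thirteen",
            "fourteen", "fifteen", "sixteen", "seventeen", "eighteen",
            "nineteen",
        ]
    )
}
TENS = {
    "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
    "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
}
SCALES = {"hundred": 100, "thousand": 1000}
NUMBER_WORDS = set(UNITS) | set(TENS) | set(SCALES)


def words_to_number(words):
    """Convert a run of number words (cardinals below one million) to an int."""
    total = 0    # value accumulated from completed "thousand" groups
    current = 0  # value of the group currently being read
    for word in words:
        if word in UNITS:
            current += UNITS[word]
        elif word in TENS:
            current += TENS[word]
        elif word == "hundred":
            current = max(current, 1) * 100
        elif word == "thousand":
            total += max(current, 1) * 1000
            current = 0
    return total + current


def inverse_normalize(text):
    """Replace each maximal run of number words in `text` with digits."""
    output, run = [], []
    for token in text.split():
        if token in NUMBER_WORDS:
            run.append(token)
            continue
        if run:
            output.append(str(words_to_number(run)))
            run = []
        output.append(token)
    if run:
        output.append(str(words_to_number(run)))
    return " ".join(output)


print(inverse_normalize("i paid twenty three dollars in two thousand twenty one"))
```

This toy converts every number word it sees and has no notion of context or of other classes such as dates, times, or money, which is part of why production systems rely on carefully written grammars instead.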

