Subword Dictionary Learning and Segmentation Techniques for Automatic Speech Recognition in Tamil and Kannada - Zhuanzhi Papers

Speech recognition · Automatic speech recognition · Dictionary learning · Learning · Estimation/estimator

July 27, 2022

Subword Dictionary Learning and Segmentation Techniques for Automatic Speech Recognition in Tamil and Kannada

Madhavaraj A, Bharathi Pilar, Ramakrishnan A G

We present automatic speech recognition (ASR) systems for Tamil and Kannada based on subword modeling, which lets them handle the effectively unlimited vocabulary that results from the highly agglutinative nature of these languages. We explore byte pair encoding (BPE), a proposed variant of this algorithm named extended-BPE, and the Morfessor tool to segment each word into subwords. We incorporate maximum likelihood (ML) and Viterbi estimation techniques within a weighted finite state transducer (WFST) framework in these algorithms to learn the subword dictionary from a large text corpus. Using the learnt subword dictionary, the words in the training data transcriptions are segmented into subwords, and we train deep neural network ASR systems that recognize the subword sequence for any given test speech utterance. The output subword sequence is then post-processed using deterministic rules to obtain the final word sequence, so that the actual number of words that can be recognized is much larger. For Tamil ASR, we use 152 hours of data for training and 65 hours for testing, whereas for Kannada ASR, we use 275 hours for training and 72 hours for testing. Upon experimenting with different combinations of segmentation and estimation techniques, we find that the word error rate (WER) reduces drastically when compared to the baseline word-level ASR, achieving a maximum absolute WER reduction of 6.24% and 6.63% for Tamil and Kannada respectively.
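As a rough illustration of the segment-then-rejoin pipeline described in the abstract, the sketch below learns plain BPE merges from word counts, segments words into subwords, and rejoins a recognized subword sequence into words. It is a minimal sketch under stated assumptions, not the authors' method: it implements standard BPE only (not extended-BPE, Morfessor, or the ML/Viterbi estimation in the WFST framework), and the trailing `+` continuation marker, the function names, and the toy corpus are all hypothetical choices made for illustration. The abstract does not specify the actual post-processing rules.

```python
from collections import Counter


def apply_merge(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with its concatenation."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out


def learn_bpe(word_counts, num_merges):
    """Learn up to `num_merges` BPE merge rules from a {word: count} mapping."""
    # Every word starts as a sequence of single characters.
    vocab = {tuple(word): count for word, count in word_counts.items()}
    merges = []
    for _ in range(num_merges):
        pair_counts = Counter()
        for symbols, count in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += count
        if not pair_counts:
            break  # every word is already a single symbol
        # Most frequent pair wins; ties go to the lexicographically smallest pair.
        best = min(pair_counts, key=lambda p: (-pair_counts[p], p))
        merges.append(best)
        vocab = {tuple(apply_merge(list(s), best)): c for s, c in vocab.items()}
    return merges


def segment(word, merges, marker="+"):
    """Split `word` into subwords; every non-final subword carries `marker`."""
    symbols = list(word)
    if not symbols:
        return []
    for pair in merges:
        symbols = apply_merge(symbols, pair)
    return [s + marker for s in symbols[:-1]] + [symbols[-1]]


def join_subwords(subwords, marker="+"):
    """Deterministically rebuild words from a marked subword sequence."""
    words, current = [], ""
    for sw in subwords:
        if sw.endswith(marker):
            current += sw[: -len(marker)]  # word continues with the next subword
        else:
            words.append(current + sw)  # unmarked subword closes the word
            current = ""
    if current:  # dangling continuation at the end of the sequence
        words.append(current)
    return words


# Toy corpus (hypothetical); real systems would use a large text corpus.
counts = {"low": 5, "lower": 2, "newest": 6, "widest": 3}
merges = learn_bpe(counts, num_merges=10)
sentence = ["lower", "newest", "lowest"]  # "lowest" never appears in the corpus
subwords = [sw for w in sentence for sw in segment(w, merges)]
print(subwords)
print(join_subwords(subwords))
```

Because any word can be spelled out from learned subwords (down to single characters), a recognizer that outputs subwords can produce words that never appeared in the training text, which is the open-vocabulary property the abstract relies on. The marker must be a character that never occurs inside a real word.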
