Snow Mountain: Dataset of Audio Recordings of The Bible in Low Resource Languages - 专知 (Zhuanzhi) Papers

Speech recognition · Datasets · Automatic speech recognition · MoDELS · Training data

June 1, 2022

Snow Mountain: Dataset of Audio Recordings of The Bible in Low Resource Languages

Kavitha Raju, Anjaly V, Ryan Lish, Joel Mathew

From arXiv. See the dataset at https://gitlab.bridgeconn.com/software/research/datasets/snow-mountain

Automatic Speech Recognition (ASR) has increasing utility in the modern world. There are many ASR models available for languages with large amounts of training data, such as English. However, low-resource languages are poorly represented. In response, we create and release an open-licensed and formatted dataset of audio recordings of the Bible in low-resource northern Indian languages. We set up multiple experimental splits and train and analyze two competitive ASR models to serve as baselines for future research using this data.
