Prak: An automatic phonetic alignment tool for Czech


Phonetics · Segmentation · Tools · Phonemes · Transcription

April 17, 2023


Václav Hanžl, Adléta Hanžlová

From arXiv; submitted to ICPhS 2023

Labeling speech down to the identity and time boundaries of phones is a labor-intensive part of phonetic research. To simplify this work, we created a free, open-source tool that generates phone sequences from Czech text and time-aligns them with audio. Its low architectural complexity makes the design approachable for students of phonetics. The acoustic model, a ReLU neural network with 56k weights, was trained using PyTorch on a small amount of CommonVoice data. The alignment and variant-selection decoder is implemented in Python with a matrix library. The Czech pronunciation generator is composed of simple rule-based blocks that capture the logic of the language where possible and allow details of the transcription approach to be modified. Compared with the tools used until now, data preparation is more efficient; the tool runs on Mac, Linux and Windows, from the Praat GUI or the command line; it makes mostly correct pronunciation-variant choices, including glottal stop detection; it algorithmically captures most of the Czech assimilation logic; and it is both didactic and practical.
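The abstract mentions an alignment and variant-selection decoder written in Python with a matrix library. The sketch below illustrates the general idea with a textbook Viterbi forced alignment in NumPy; it is not Prak's actual decoder. The function names `force_align` and `pick_variant`, the three-class toy phone inventory, and the posterior values are all assumptions made up for this illustration.

```python
import numpy as np

def force_align(log_probs, phone_ids):
    """Viterbi forced alignment (illustrative sketch, not Prak's code).

    log_probs: (n_frames, n_classes) per-frame log-probabilities from
               some acoustic model.
    phone_ids: class index of each phone in the expected sequence.
    Returns (path_score, spans), where spans[i] = (start, end) frame
    indices of phone i, with end exclusive.
    """
    log_probs = np.asarray(log_probs, dtype=float)
    n_frames, n_phones = log_probs.shape[0], len(phone_ids)
    if n_frames < n_phones:
        raise ValueError("need at least one frame per phone")
    emit = log_probs[:, phone_ids]            # (n_frames, n_phones)
    score = np.full((n_frames, n_phones), -np.inf)
    advanced = np.zeros((n_frames, n_phones), dtype=bool)
    score[0, 0] = emit[0, 0]                  # path must start in phone 0
    for t in range(1, n_frames):
        stay = score[t - 1]                   # remain in the same phone
        move = np.concatenate(([-np.inf], score[t - 1, :-1]))  # next phone
        advanced[t] = move > stay
        score[t] = np.maximum(stay, move) + emit[t]
    # Backtrack from the last phone at the last frame.
    starts = [0] * n_phones
    i = n_phones - 1
    for t in range(n_frames - 1, 0, -1):
        if advanced[t, i]:
            starts[i] = t
            i -= 1
    ends = starts[1:] + [n_frames]
    return float(score[-1, -1]), list(zip(starts, ends))

def pick_variant(log_probs, variants):
    """Align every candidate pronunciation and keep the best-scoring one."""
    results = [force_align(log_probs, v) + (v,) for v in variants]
    return max(results, key=lambda r: r[0])

# Toy posteriors: 6 frames, 3 hypothetical phone classes.
posteriors = np.array([
    [0.90, 0.05, 0.05],
    [0.80, 0.10, 0.10],
    [0.10, 0.80, 0.10],
    [0.10, 0.10, 0.80],
    [0.10, 0.10, 0.80],
    [0.05, 0.05, 0.90],
])
score, spans = force_align(np.log(posteriors), [0, 1, 2])
print(spans)  # [(0, 2), (2, 3), (3, 6)]

# Two candidate pronunciations, with and without the middle phone.
best = pick_variant(np.log(posteriors), [[0, 1, 2], [0, 2]])
print(best[2])  # [0, 1, 2]
```

Multiplying the frame indices by the acoustic model's frame step (a value this page does not state) would turn the spans into time boundaries. Scoring several candidate sequences against the same audio, as `pick_variant` does, is one plausible way to choose between pronunciation variants such as a word with or without a glottal stop.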

