以通用LSTM为基础的 " 最终到最后文本独立演讲人 " 核查 (Generalized LSTM-based End-to-End Text-Independent Speaker Verification) - 专知论文

会员服务 ·

0

state-of-the-art · 端到端 · 学成 · 声纹识别 · 模型评估 ·

2020 年 11 月 28 日

Generalized LSTM-based End-to-End Text-Independent Speaker Verification

翻译：以通用LSTM为基础的 " 最终到最后文本独立演讲人 " 核查

Soroosh Tayebi Arasteh

from arxiv, 7 pages, 7 tables, 6 figures. Research Internship project at the Pattern Recognition Lab at FAU Erlangen-Nuremberg, as a part of Master's curriculum. Re-implementation of the paper arXiv:1710.10467 by Wan et al

The increasing amount of available data and more affordable hardware solutions have opened a gate to the realm of Deep Learning (DL). Due to the rapid advancements and ever-growing popularity of DL, it has begun to invade almost every field, where machine learning is applicable, by altering the traditional state-of-the-art methods. While many researchers in the speaker recognition area have also started to replace the former state-of-the-art methods with DL techniques, some of the traditional i-vector-based methods are still state-of-the-art in the context of text-independent speaker verification (TI-SV). In this paper, we discuss the most recent generalized end-to-end (GE2E) DL technique based on Long Short-term Memory (LSTM) units for TI-SV by Google and compare different scenarios and aspects including utterance duration, training time, and accuracy to prove that our method outperforms the traditional methods.

翻译：越来越多的可用数据和更廉价的硬件解决方案打开了深入学习的大门。由于快速的进步和日益流行的DL,它开始通过改变传统的最先进方法,侵入几乎所有适用机器学习的领域,改变传统的最先进方法。虽然在语音识别区的许多研究人员也开始用DL技术取代以前的最先进方法,但一些传统的i-Ver-基础方法在依赖文字的发言者核查(TI-SV)方面仍然是最先进的方法。在本文件中,我们讨论了基于谷歌为TI-SV建立的长期短期内存(LSTM)单元的最近最普遍的端对端(GE2E)DL技术,并比较了不同的情景和方面,包括语音持续时间、培训时间和准确性,以证明我们的方法超越了传统方法。

0

相关内容

state-of-the-art

state-of-the-art

【EMNLP2020】自然语言生成，Neural Language Generation

【EMNLP2020】自然语言生成，Neural Language Generation

专知会员服务

39+阅读 · 2020年11月20日

最新【深度生成模型】Deep Generative Models，104页ppt

最新【深度生成模型】Deep Generative Models，104页ppt

专知会员服务

71+阅读 · 2020年10月24日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

机器翻译深度学习最新综述

机器翻译深度学习最新综述

专知会员服务

99+阅读 · 2020年2月20日

【上海交大-ICASSP2020】Transformer端到端的多说话人语音识别

【上海交大-ICASSP2020】Transformer端到端的多说话人语音识别

专知会员服务

51+阅读 · 2020年2月16日

【CCL 2019】ATT-第19期：文本生成 |Text Generation: From the Perspective of Interactive Inference （张家俊）

【CCL 2019】ATT-第19期：文本生成 |Text Generation: From the Perspective of Interactive Inference （张家俊）

专知会员服务

43+阅读 · 2019年11月12日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

【泡泡一分钟】基于注意力机制的深度网络HydraPlus-Net(ICCV2017-34)

【泡泡一分钟】基于注意力机制的深度网络HydraPlus-Net(ICCV2017-34)

泡泡机器人SLAM

8+阅读 · 2018年6月9日

笔记 | Sentiment Analysis

笔记 | Sentiment Analysis

黑龙江大学自然语言处理实验室

10+阅读 · 2018年5月6日

基于LSTM-CNN组合模型的Twitter情感分析（附代码）

基于LSTM-CNN组合模型的Twitter情感分析（附代码）

机器学习研究会

50+阅读 · 2018年2月21日

【推荐】RNN最新研究进展综述

【推荐】RNN最新研究进展综述

机器学习研究会

26+阅读 · 2018年1月6日

【论文】图上的表示学习综述

【论文】图上的表示学习综述

机器学习研究会

15+阅读 · 2017年9月24日

自然语言处理 (NLP)资源大全

自然语言处理 (NLP)资源大全

机械鸡

35+阅读 · 2017年9月17日

【推荐】用Tensorflow理解LSTM

【推荐】用Tensorflow理解LSTM

机器学习研究会

36+阅读 · 2017年9月11日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

Fusing Wav2vec2.0 and BERT into End-to-end Model for Low-resource Speech Recognition

Arxiv

2+阅读 · 2021年1月17日

Slimmable Generative Adversarial Networks

Slimmable Generative Adversarial Networks

Arxiv

3+阅读 · 2020年12月10日

Adversarial Mutual Information for Text Generation

Adversarial Mutual Information for Text Generation

Arxiv

13+阅读 · 2020年6月30日

Semi-Supervised Contrastive Learning with Generalized Contrastive Loss and Its Application to Speaker Recognition

Semi-Supervised Contrastive Learning with Generalized Contrastive Loss and Its Application to Speaker Recognition

Arxiv

7+阅读 · 2020年6月8日

Curriculum Pre-training for End-to-End Speech Translation

Arxiv

4+阅读 · 2020年4月21日

End-to-End Multi-speaker Speech Recognition with Transformer

Arxiv

8+阅读 · 2020年2月13日

Neural Speech Synthesis with Transformer Network

Neural Speech Synthesis with Transformer Network

Arxiv

5+阅读 · 2019年1月30日

End-to-end Speech Recognition with Word-based RNN Language Models

End-to-end Speech Recognition with Word-based RNN Language Models

Arxiv

3+阅读 · 2018年8月8日

Multimodal Sentiment Analysis using Hierarchical Fusion with Context Modeling

Arxiv

11+阅读 · 2018年6月16日

Chinese NER Using Lattice LSTM

Arxiv

5+阅读 · 2018年5月5日

VIP会员

文章信息

相关主题

state-of-the-art

相关VIP内容

【EMNLP2020】自然语言生成，Neural Language Generation

【EMNLP2020】自然语言生成，Neural Language Generation

专知会员服务

39+阅读 · 2020年11月20日

最新【深度生成模型】Deep Generative Models，104页ppt

最新【深度生成模型】Deep Generative Models，104页ppt

专知会员服务

71+阅读 · 2020年10月24日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

机器翻译深度学习最新综述

机器翻译深度学习最新综述

专知会员服务

99+阅读 · 2020年2月20日

【上海交大-ICASSP2020】Transformer端到端的多说话人语音识别

【上海交大-ICASSP2020】Transformer端到端的多说话人语音识别

专知会员服务

51+阅读 · 2020年2月16日

【CCL 2019】ATT-第19期：文本生成 |Text Generation: From the Perspective of Interactive Inference （张家俊）

【CCL 2019】ATT-第19期：文本生成 |Text Generation: From the Perspective of Interactive Inference （张家俊）

专知会员服务

43+阅读 · 2019年11月12日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

【普林斯顿博士论文】在线学习：优化、控制与学习理论

不确定环境下无人机三维路径规划研究 | 221页

【NeurIPS2025】《LeapFactual：基于条件流匹配的可靠视觉反事实解释》

大语言模型将如何改变军事指挥结构

相关资讯

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

【泡泡一分钟】基于注意力机制的深度网络HydraPlus-Net(ICCV2017-34)

【泡泡一分钟】基于注意力机制的深度网络HydraPlus-Net(ICCV2017-34)

泡泡机器人SLAM

8+阅读 · 2018年6月9日

笔记 | Sentiment Analysis

笔记 | Sentiment Analysis

黑龙江大学自然语言处理实验室

10+阅读 · 2018年5月6日

基于LSTM-CNN组合模型的Twitter情感分析（附代码）

基于LSTM-CNN组合模型的Twitter情感分析（附代码）

机器学习研究会

50+阅读 · 2018年2月21日

【推荐】RNN最新研究进展综述

【推荐】RNN最新研究进展综述

机器学习研究会

26+阅读 · 2018年1月6日

【论文】图上的表示学习综述

【论文】图上的表示学习综述

机器学习研究会

15+阅读 · 2017年9月24日

自然语言处理 (NLP)资源大全

自然语言处理 (NLP)资源大全

机械鸡

35+阅读 · 2017年9月17日

【推荐】用Tensorflow理解LSTM

【推荐】用Tensorflow理解LSTM

机器学习研究会

36+阅读 · 2017年9月11日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

相关论文

Fusing Wav2vec2.0 and BERT into End-to-end Model for Low-resource Speech Recognition

Arxiv

2+阅读 · 2021年1月17日

Slimmable Generative Adversarial Networks

Slimmable Generative Adversarial Networks

Arxiv

3+阅读 · 2020年12月10日

Adversarial Mutual Information for Text Generation

Adversarial Mutual Information for Text Generation

Arxiv

13+阅读 · 2020年6月30日

Semi-Supervised Contrastive Learning with Generalized Contrastive Loss and Its Application to Speaker Recognition

Semi-Supervised Contrastive Learning with Generalized Contrastive Loss and Its Application to Speaker Recognition

Arxiv

7+阅读 · 2020年6月8日

Curriculum Pre-training for End-to-End Speech Translation

Arxiv

4+阅读 · 2020年4月21日

End-to-End Multi-speaker Speech Recognition with Transformer

Arxiv

8+阅读 · 2020年2月13日

Neural Speech Synthesis with Transformer Network

Neural Speech Synthesis with Transformer Network

Arxiv

5+阅读 · 2019年1月30日

End-to-end Speech Recognition with Word-based RNN Language Models

End-to-end Speech Recognition with Word-based RNN Language Models

Arxiv

3+阅读 · 2018年8月8日

Multimodal Sentiment Analysis using Hierarchical Fusion with Context Modeling

Arxiv

11+阅读 · 2018年6月16日

Chinese NER Using Lattice LSTM

Arxiv

5+阅读 · 2018年5月5日

微信扫码咨询专知VIP会员