Semantic VAD: Low-Latency Voice Activity Detection for Speech Interaction

May 21, 2023

Mohan Shi, Yuchun Shu, Lingyun Zuo, Qian Chen, Shiliang Zhang, Jie Zhang, Li-Rong Dai

From arXiv; accepted by Interspeech 2023

In speech interaction systems, voice activity detection (VAD) is often used as a front-end. However, traditional VAD algorithms usually wait until a continuous tail silence reaches a preset maximum duration before segmenting, which introduces a large latency that degrades the user experience. In this paper, we propose a novel semantic VAD for low-latency segmentation. Unlike existing methods, the semantic VAD adds a frame-level punctuation prediction task and includes an artificial endpoint among the classification categories, in addition to the commonly used speech presence and absence. To enrich the semantic information in the model, we also incorporate a semantic loss related to automatic speech recognition (ASR). Evaluations on an internal dataset show that, compared with the traditional VAD approach, the proposed method reduces the average latency by 53.3% without significant deterioration of the character error rate in the back-end ASR.
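The latency problem described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the frame shift, the label names, the two tail thresholds, and the rule "use a shorter tail once an endpoint frame has been seen" are all assumptions made for illustration, and the frame labels are hand-written rather than produced by a model.

```python
from typing import List, Optional

# Hypothetical per-frame labels; the abstract only names speech presence,
# speech absence, and an artificial endpoint class.
SPEECH, SILENCE, ENDPOINT = "speech", "silence", "endpoint"
FRAME_MS = 10  # assumed frame shift in milliseconds


def tail_silence_endpoint(labels: List[str], max_tail: int) -> Optional[int]:
    """Traditional rule: cut once non-speech has lasted max_tail frames after speech."""
    seen_speech, run = False, 0
    for i, lab in enumerate(labels):
        if lab == SPEECH:
            seen_speech, run = True, 0
        elif seen_speech:
            run += 1  # any non-speech label counts as tail silence here
            if run >= max_tail:
                return i
    return None


def semantic_endpoint(labels: List[str], short_tail: int, max_tail: int) -> Optional[int]:
    """Assumed semantic rule: an endpoint frame in the tail allows a shorter wait."""
    seen_speech, run, saw_endpoint = False, 0, False
    for i, lab in enumerate(labels):
        if lab == SPEECH:
            seen_speech, run, saw_endpoint = True, 0, False
        elif seen_speech:
            run += 1
            saw_endpoint = saw_endpoint or lab == ENDPOINT
            limit = short_tail if saw_endpoint else max_tail
            if run >= limit:
                return i
    return None


def latency_ms(cut_frame: int, last_speech_frame: int) -> int:
    """Delay between the last speech frame and the segmentation decision."""
    return (cut_frame - last_speech_frame) * FRAME_MS


# Toy utterance: 50 speech frames, one endpoint frame, then silence.
labels = [SPEECH] * 50 + [ENDPOINT] + [SILENCE] * 99
traditional_cut = tail_silence_endpoint(labels, max_tail=70)
semantic_cut = semantic_endpoint(labels, short_tail=20, max_tail=70)
print(latency_ms(traditional_cut, 49), latency_ms(semantic_cut, 49))
```

With these made-up thresholds the traditional rule waits for the full tail while the endpoint-aware rule cuts earlier. The numbers are illustrative only and are unrelated to the 53.3% figure reported in the abstract.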
