3D Convolutional Neural Networks for Ultrasound-Based Silent Speech Interfaces

April 23, 2021

László Tóth, Amin Honarmandi Shandiz

From arXiv, 10 pages, 2 tables, 3 figures

Silent speech interfaces (SSI) aim to reconstruct the speech signal from a recording of the articulatory movement, such as an ultrasound video of the tongue. Currently, deep neural networks are the most successful technology for this task. The efficient solution requires methods that do not simply process single images, but are able to extract the tongue movement information from a sequence of video frames. One option for this is to apply recurrent neural structures such as the long short-term memory network (LSTM) in combination with 2D convolutional neural networks (CNNs). Here, we experiment with another approach that extends the CNN to perform 3D convolution, where the extra dimension corresponds to time. In particular, we apply the spatial and temporal convolutions in a decomposed form, which proved very successful recently in video action recognition. We find experimentally that our 3D network outperforms the CNN+LSTM model, indicating that 3D CNNs may be a feasible alternative to CNN+LSTM networks in SSI systems.
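The decomposed spatial and temporal convolution mentioned in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' network: it assumes a single channel, no nonlinearity between the two steps, and a 3D kernel that factors exactly into a spatial part and a temporal part. The function names `conv3d_valid` and `decomposed_conv` and the toy data are hypothetical. Under those assumptions, filtering each frame in 2D and then filtering along time gives the same result as one full 3D convolution.

```python
from itertools import product


def conv3d_valid(video, kernel):
    """Naive 'valid' 3D cross-correlation on (time, height, width) nested lists."""
    T, H, W = len(video), len(video[0]), len(video[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                acc = 0.0
                for dt, dy, dx in product(range(kt), range(kh), range(kw)):
                    acc += video[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


def decomposed_conv(video, spatial, temporal):
    """Spatial 2D filtering of every frame, then 1D filtering along time."""
    # Step 1: a 1 x kh x kw kernel only looks inside a single frame.
    per_frame = conv3d_valid(video, [spatial])
    # Step 2: a kt x 1 x 1 kernel mixes neighbouring frames at each pixel.
    return conv3d_valid(per_frame, [[[w]] for w in temporal])


# Toy "video": 3 frames of 3 x 3 pixels (made-up values, not ultrasound data).
video = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    [[9, 8, 7], [6, 5, 4], [3, 2, 1]],
    [[1, 0, 1], [0, 1, 0], [1, 0, 1]],
]
spatial = [[1, 0], [0, -1]]   # 2 x 2 spatial kernel
temporal = [1, 2]             # length-2 temporal kernel

# Equivalent full 3D kernel: outer product of the temporal and spatial parts.
full_kernel = [[[w * s for s in row] for row in spatial] for w in temporal]

full = conv3d_valid(video, full_kernel)
split = decomposed_conv(video, spatial, temporal)
print(full == split)
```

In a trained network the two steps would typically have several channels and a nonlinearity between them, so the decomposed layer is no longer equivalent to a single 3D kernel; the sketch only shows the order of operations.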
