Real time spectrogram inversion on mobile phone - 专知 (Zhuanzhi) Papers

July 28, 2022

Real time spectrogram inversion on mobile phone

Oleg Rybakov, Marco Tagliasacchi, Yunpeng Li, Liyang Jiang, Xia Zhang, Fadi Biadsy

We present two methods of real-time magnitude spectrogram inversion: streaming Griffin-Lim (GL) and streaming MelGAN. We demonstrate the impact of lookahead on the perceptual quality of MelGAN: as little as one hop size (12.5 ms) of lookahead significantly improves perceptual quality compared with the causal version. We compare streaming GL with streaming MelGAN and show different trade-offs in terms of perceptual quality, on-device latency, algorithmic delay, memory footprint, and noise sensitivity. For a fair quality assessment of the GL approach, we use the input log-magnitude spectrogram without mel transformation. We evaluate the presented real-time spectrogram inversion approaches on clean, noisy, and atypical speech. We identify the conditions under which streaming GL has quality comparable to MelGAN: noisy audio and no mel transformation. Streaming GL is 2.4x faster than real time on the ARM CPU of a Pixel 4 and has a minimal memory footprint, which makes it attractive for wearable devices.
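For readers unfamiliar with Griffin-Lim, the following is a minimal sketch of the classic offline algorithm, not the streaming variant the abstract describes. It alternates between imposing the known magnitudes and re-estimating the phase from a least-squares inverse STFT. Everything in it is an assumption made for illustration: the frame length (16), hop (4), Hann window, sine-wave test signal, iteration count, and all function names. It works on a linear magnitude spectrogram and uses a naive DFT from the standard library so that it runs without dependencies; a practical implementation would use an FFT library.

```python
import cmath
import math
import random


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def dft(frame):
    """Naive O(n^2) DFT; adequate for the tiny frames used in this sketch."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft_real(spec):
    """Inverse DFT that keeps only the real part of the result."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]


def stft(signal, win, hop):
    """Windowed DFT of every full frame of the signal (no padding)."""
    n = len(win)
    return [dft([signal[s + i] * win[i] for i in range(n)])
            for s in range(0, len(signal) - n + 1, hop)]


def istft(frames, win, hop, length):
    """Least-squares inverse STFT: windowed overlap-add divided by summed squared window."""
    n = len(win)
    num = [0.0] * length
    den = [0.0] * length
    for f, spec in enumerate(frames):
        chunk = idft_real(spec)
        for i in range(n):
            num[f * hop + i] += chunk[i] * win[i]
            den[f * hop + i] += win[i] ** 2
    return [a / b if b > 1e-8 else 0.0 for a, b in zip(num, den)]


def griffin_lim(mag, win, hop, length, n_iter=20, seed=0):
    """Estimate a signal whose STFT magnitude approximates `mag`."""
    rng = random.Random(seed)
    # Start from the target magnitudes with random phase.
    est = [[m * cmath.exp(2j * math.pi * rng.random()) for m in row] for row in mag]
    errors = []
    signal = [0.0] * length
    for _ in range(n_iter):
        signal = istft(est, win, hop, length)   # back to the time domain
        spec = stft(signal, win, hop)           # consistent STFT of the estimate
        errors.append(math.sqrt(sum((abs(c) - m) ** 2
                                    for row_c, row_m in zip(spec, mag)
                                    for c, m in zip(row_c, row_m))))
        # Keep the new phase, restore the target magnitude.
        est = [[m * cmath.exp(1j * cmath.phase(c)) for c, m in zip(row_c, row_m)]
               for row_c, row_m in zip(spec, mag)]
    return signal, errors


# Hypothetical demo: recover a short sine wave from its magnitude spectrogram.
win = hann(16)
hop = 4
clean = [math.sin(2.0 * math.pi * 0.1 * t) for t in range(128)]
target_mag = [[abs(c) for c in row] for row in stft(clean, win, hop)]
recon, errors = griffin_lim(target_mag, win, hop, len(clean))
print(f"magnitude error: first iteration {errors[0]:.4f}, last iteration {errors[-1]:.4f}")
```

With the least-squares inverse used here, the magnitude error cannot increase from one iteration to the next, which is why the demo tracks it. The streaming setting in the abstract additionally constrains how many future frames may be used: one hop of lookahead corresponds to 12.5 ms of extra algorithmic delay.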

