Tools that generate high-quality synthetic speech perceptually indistinguishable from speech recorded from human speakers are easily available. Several approaches have been proposed for detecting synthetic speech. Many of these approaches use deep learning methods as a black box, providing no reasoning for the decisions they make, which limits their interpretability. In this paper, we propose the Disentangled Spectrogram Variational Auto Encoder (DSVAE), a two-stage trained variational autoencoder that processes spectrograms of speech using disentangled representation learning to generate interpretable representations of a speech signal for detecting synthetic speech. DSVAE also creates an activation map that highlights the spectrogram regions discriminating synthetic from bona fide human speech. We evaluated the representations obtained from DSVAE using the ASVspoof2019 dataset. Our experimental results show high accuracy (>98%) in detecting synthetic speech from 6 known and 10 out of 11 unknown speech synthesizers. We also visualize the representations obtained from DSVAE for 17 different speech synthesizers and verify that they are indeed interpretable and discriminate between bona fide speech and synthetic speech from each of the synthesizers.
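To make the overall pipeline concrete, the following is a minimal sketch of a spectrogram variational autoencoder with a latent-space classifier, in the spirit of the approach described above. This is not the authors' implementation: the framework (PyTorch), input shape (log-spectrograms of size 128x128), layer sizes, the single combined loss, and all names (SpectrogramVAE, vae_loss, beta) are illustrative assumptions.

```python
# Hypothetical sketch of a spectrogram VAE with a bona fide / synthetic
# classifier on the latent code. Assumes inputs of shape (batch, 1, 128, 128).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpectrogramVAE(nn.Module):
    def __init__(self, latent_dim=64):
        super().__init__()
        # Encoder: convolutions downsample the spectrogram to a feature map.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 4, stride=2, padding=1),    # 128 -> 64
            nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1),   # 64 -> 32
            nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1),  # 32 -> 16
            nn.ReLU(),
            nn.Flatten(),
        )
        feat_dim = 128 * 16 * 16
        self.fc_mu = nn.Linear(feat_dim, latent_dim)
        self.fc_logvar = nn.Linear(feat_dim, latent_dim)
        # Decoder mirrors the encoder to reconstruct the spectrogram.
        self.fc_dec = nn.Linear(latent_dim, feat_dim)
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1),  # 16 -> 32
            nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1),   # 32 -> 64
            nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),    # 64 -> 128
        )
        # Binary head on the latent code: bona fide vs. synthetic.
        self.classifier = nn.Linear(latent_dim, 2)

    def reparameterize(self, mu, logvar):
        # Sample z = mu + sigma * eps so gradients flow through mu and logvar.
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        z = self.reparameterize(mu, logvar)
        recon = self.decoder(self.fc_dec(z).view(-1, 128, 16, 16))
        logits = self.classifier(z)
        return recon, mu, logvar, logits

def vae_loss(recon, x, mu, logvar, logits, labels, beta=1.0):
    # Reconstruction + KL divergence, plus a classification term that
    # shapes the latent space to separate bona fide and synthetic speech.
    rec = F.mse_loss(recon, x, reduction="mean")
    kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    clf = F.cross_entropy(logits, labels)
    return rec + beta * kld + clf
```

In a two-stage arrangement such as the one the abstract describes, one would typically optimize the reconstruction and KL terms first and introduce the discriminative objective in a second stage; the sketch combines all terms into a single loss only for brevity. Activation maps highlighting discriminative spectrogram regions could then be derived from the trained network with standard attribution techniques.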