United States courts make audio recordings of oral arguments available as public records, but these recordings rarely include speaker annotations. This paper addresses the speech audio diarization problem, answering the question "Who spoke when?" in the domain of judicial oral argument proceedings. We present a workflow for diarizing the speech of judges in audio recordings of oral arguments, a process we call Reference-Dependent Speaker Verification. We utilize a speech embedding network trained with the Generalized End-to-End loss to encode speech into d-vectors, together with a pre-defined reference audio library built from annotated data. We find that by encoding reference audio for speakers and full arguments and computing similarity scores, we achieve a 13.8% Diarization Error Rate on a held-out test set for speakers covered by the reference audio library. We evaluate our method on oral arguments of the Supreme Court of the United States, accessed through the Oyez Project, and outline future work for diarizing legal proceedings. A code repository for this research is available at github.com/JeffT13/rd-diarization
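The core scoring step described above can be illustrated with a minimal sketch: each speech segment is encoded as a d-vector and compared against a library of per-speaker reference d-vectors, with cosine similarity as the comparison metric commonly used for d-vectors. The function name, library layout, and threshold below are hypothetical, not the paper's exact implementation.

```python
import numpy as np

def diarize_segment(segment_dvec, reference_library, threshold=0.7):
    """Assign a speaker label to one segment's d-vector by cosine
    similarity against per-speaker reference d-vectors.
    (Illustrative sketch; names and threshold are hypothetical.)"""
    best_speaker, best_score = None, -1.0
    for speaker, ref_dvec in reference_library.items():
        # Cosine similarity between the segment and this reference
        score = np.dot(segment_dvec, ref_dvec) / (
            np.linalg.norm(segment_dvec) * np.linalg.norm(ref_dvec))
        if score > best_score:
            best_speaker, best_score = speaker, score
    # Segments matching no reference well enough fall outside the library
    return best_speaker if best_score >= threshold else "non-reference"

# Toy usage with 2-D "embeddings" standing in for real d-vectors
refs = {"judge_a": np.array([1.0, 0.0]), "judge_b": np.array([0.0, 1.0])}
print(diarize_segment(np.array([0.9, 0.1]), refs))    # close to judge_a
print(diarize_segment(np.array([-1.0, -1.0]), refs))  # matches no reference
```

Segments that fail the threshold would correspond to speakers not covered by the reference audio library, consistent with the error rate being reported only for covered speakers.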