23 September 2021

Joint speaker diarisation and tracking in switching state-space model


Jeremy H. M. Wong, Yifan Gong

Speakers may move around while diarisation is being performed. When a microphone array is used, the instantaneous locations from which sounds originate can be estimated, and previous investigations have shown that such information can be complementary to speaker embeddings in the diarisation task. However, these approaches often assume that speakers remain fairly stationary throughout a meeting. This paper relaxes this assumption by proposing to explicitly track the movements of speakers while jointly performing diarisation within a unified model. A state-space model is proposed, in which the hidden state expresses the identity of the currently active speaker and the predicted locations of all speakers. The model is implemented as a particle filter. Experiments on a Microsoft rich meeting transcription task show that the proposed joint location tracking and diarisation approach performs comparably with other methods that use location information.
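To make the idea of a switching state-space model run as a particle filter more concrete, the following is a minimal sketch and not the authors' implementation. It assumes each observation is a single direction-of-arrival angle in degrees, that speaker locations follow a Gaussian random walk, and that the active speaker persists with a fixed probability; speaker embeddings, which the abstract also mentions, are left out. The function name `track_and_diarise` and every parameter value are hypothetical.

```python
import math
import random


def track_and_diarise(observations, init_angles, num_particles=500,
                      p_stay=0.95, walk_std=2.0, obs_std=5.0,
                      init_std=5.0, seed=0):
    """Bootstrap particle filter over (active speaker, angles of all speakers).

    All modelling choices here are illustrative assumptions, not the paper's.
    Returns one decoded active-speaker index per observation.
    """
    rng = random.Random(seed)
    num_speakers = len(init_angles)
    # Each particle is [active_speaker_index, [angle_of_each_speaker]].
    particles = [
        [rng.randrange(num_speakers),
         [a + rng.gauss(0.0, init_std) for a in init_angles]]
        for _ in range(num_particles)
    ]
    decoded = []
    for obs in observations:
        weights = []
        for p in particles:
            # Discrete switch: keep the active speaker or jump to another one.
            if num_speakers > 1 and rng.random() > p_stay:
                p[0] = rng.choice(
                    [s for s in range(num_speakers) if s != p[0]])
            # Continuous dynamics: every speaker's angle takes a random-walk step.
            p[1] = [a + rng.gauss(0.0, walk_std) for a in p[1]]
            # Likelihood: the observed angle is explained by the active speaker only.
            err = obs - p[1][p[0]]
            weights.append(math.exp(-0.5 * (err / obs_std) ** 2) + 1e-300)
        # Multinomial resampling; copy so that particles never share lists.
        chosen = rng.choices(particles, weights=weights, k=num_particles)
        particles = [[c[0], list(c[1])] for c in chosen]
        # Decode the frame as the most common active speaker among particles.
        counts = [0] * num_speakers
        for p in particles:
            counts[p[0]] += 1
        decoded.append(counts.index(max(counts)))
    return decoded


if __name__ == "__main__":
    # Two hypothetical speakers seated near 30 and 150 degrees.
    angles = [30.0] * 10 + [150.0] * 10
    print(track_and_diarise(angles, init_angles=[30.0, 150.0]))
```

Because every particle carries location hypotheses for all speakers, the position of a silent speaker keeps diffusing while the active speaker's position is corrected by each observation. This is one simple way to realise joint tracking and diarisation in a single hidden state.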

