Multimodal intent recognition is a significant task for understanding human language in real-world multimodal scenes. Most existing intent recognition methods are limited in leveraging multimodal information because the available benchmark datasets contain only text. This paper introduces a novel dataset for multimodal intent recognition (MIntRec) to address this issue. It formulates coarse-grained and fine-grained intent taxonomies based on data collected from the TV series Superstore. The dataset consists of 2,224 high-quality samples with text, video, and audio modalities and provides multimodal annotations across twenty intent categories. Furthermore, we provide annotated bounding boxes of speakers in each video segment and achieve an automatic process for speaker annotation. MIntRec helps researchers mine relationships between different modalities to enhance the capability of intent recognition. We extract features from each modality and model cross-modal interactions by adapting three powerful multimodal fusion methods to build baselines. Extensive experiments show that employing the non-verbal modalities achieves substantial improvements over the text-only modality, demonstrating the effectiveness of using multimodal information for intent recognition. The gap between the best-performing methods and humans indicates the challenge and importance of this task for the community. The full dataset and code are available at https://github.com/thuiar/MIntRec.
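To make the baseline setup concrete, the sketch below shows a generic concatenation-based late-fusion classifier over pre-extracted text, video, and audio features, mapping to the dataset's twenty intent categories. This is a minimal conceptual illustration with hypothetical feature dimensions, not a reproduction of the three fusion methods adapted in the paper, which model cross-modal interactions with dedicated architectures.

```python
import torch
import torch.nn as nn


class LateFusionClassifier(nn.Module):
    """Toy late-fusion baseline: concatenate per-modality features and classify.

    Feature dimensions here are illustrative assumptions; the paper's baselines
    adapt existing cross-modal fusion architectures rather than plain concatenation.
    """

    def __init__(self, text_dim=768, video_dim=256, audio_dim=128,
                 hidden_dim=256, num_intents=20):
        super().__init__()
        fused_dim = text_dim + video_dim + audio_dim
        self.classifier = nn.Sequential(
            nn.Linear(fused_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.1),
            nn.Linear(hidden_dim, num_intents),  # 20 intent classes in MIntRec
        )

    def forward(self, text_feat, video_feat, audio_feat):
        # Late fusion: concatenate modality features, then classify jointly.
        fused = torch.cat([text_feat, video_feat, audio_feat], dim=-1)
        return self.classifier(fused)


# Random tensors stand in for pre-extracted features of a batch of 4 video segments.
model = LateFusionClassifier()
logits = model(torch.randn(4, 768), torch.randn(4, 256), torch.randn(4, 128))
print(logits.shape)  # torch.Size([4, 20])
```

In practice, the text-only counterpart of this classifier (dropping the video and audio inputs) serves as the reference against which the multimodal gains reported in the paper are measured.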