Audio question answering (AQA) is the task of producing natural language answers when a system is provided with audio and natural language questions. In this paper, we propose neural network architectures based on self-attention and cross-attention for the AQA task. The self-attention layers extract powerful audio and textual representations, and the cross-attention layers map the audio features that are relevant to the textual features in order to produce answers. All our models are trained on the recently proposed Clotho-AQA dataset for both binary yes/no questions and single-word answer questions. Our results show a clear improvement over the reference method reported in the original paper. On the yes/no binary classification task, our proposed model achieves an accuracy of 68.3%, compared to 62.7% for the reference model. For the single-word answer multiclass classifier, our model achieves top-1 and top-5 accuracies of 57.9% and 99.8%, compared to 54.2% and 93.7% respectively for the reference model. We further discuss some of the challenges in the Clotho-AQA dataset, such as the presence of the same answer word in multiple tenses and in singular and plural forms, and the presence of both specific and generic answers to the same question. We address these issues and present a revised version of the dataset.
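To make the described architecture concrete, below is a minimal PyTorch sketch of a self-attention/cross-attention AQA classifier. It is an illustration under stated assumptions, not the paper's implementation: the module names, feature dimensions, number of attention heads, pooling strategy, and the answer-vocabulary size `n_answers` are all hypothetical placeholders.

```python
import torch
import torch.nn as nn


class AQAModel(nn.Module):
    """Sketch of an attention-based AQA classifier (assumed configuration,
    not the paper's): self-attention over audio and text features,
    cross-attention from the question (query) to the audio (key/value),
    and a linear classification head over candidate answer words."""

    def __init__(self, audio_dim=64, text_dim=300, d_model=256,
                 n_heads=4, n_answers=1000):  # all sizes are assumptions
        super().__init__()
        self.audio_proj = nn.Linear(audio_dim, d_model)
        self.text_proj = nn.Linear(text_dim, d_model)
        # Self-attention layers extract contextual audio and text representations.
        self.audio_self_attn = nn.TransformerEncoderLayer(
            d_model, n_heads, batch_first=True)
        self.text_self_attn = nn.TransformerEncoderLayer(
            d_model, n_heads, batch_first=True)
        # Cross-attention: question tokens attend to relevant audio frames.
        self.cross_attn = nn.MultiheadAttention(
            d_model, n_heads, batch_first=True)
        # For the yes/no task, a single sigmoid output could replace this
        # multiclass head over the single-word answer vocabulary.
        self.classifier = nn.Linear(d_model, n_answers)

    def forward(self, audio, text):
        # audio: (B, T_a, audio_dim), e.g. log-mel frames
        # text:  (B, T_t, text_dim),  e.g. word embeddings of the question
        a = self.audio_self_attn(self.audio_proj(audio))
        t = self.text_self_attn(self.text_proj(text))
        # Query with the text representation, attend over the audio features.
        fused, _ = self.cross_attn(query=t, key=a, value=a)
        # Mean-pool over question tokens and score each candidate answer word.
        return self.classifier(fused.mean(dim=1))


# Example: a batch of 8 clips with 300 audio frames and 20-word questions.
model = AQAModel()
logits = model(torch.randn(8, 300, 64), torch.randn(8, 20, 300))
print(logits.shape)  # torch.Size([8, 1000])
```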