观看新闻:走向能够读取的视频QA模型 (Watching the News: Towards VideoQA Models that can Read) - 专知论文

会员服务 ·

0

INFORMS · 自动问答 · INTERACT · 讲稿 · MoDELS ·

2022 年 11 月 10 日

Watching the News: Towards VideoQA Models that can Read

翻译：观看新闻:走向能够读取的视频QA模型

Soumya Jahagirdar,Minesh Mathew,Dimosthenis Karatzas,C. V. Jawahar

Video Question Answering methods focus on commonsense reasoning and visual cognition of objects or persons and their interactions over time. Current VideoQA approaches ignore the textual information present in the video. Instead, we argue that textual information is complementary to the action and provides essential contextualisation cues to the reasoning process. To this end, we propose a novel VideoQA task that requires reading and understanding the text in the video. To explore this direction, we focus on news videos and require QA systems to comprehend and answer questions about the topics presented by combining visual and textual cues in the video. We introduce the ``NewsVideoQA'' dataset that comprises more than $8,600$ QA pairs on $3,000+$ news videos obtained from diverse news channels from around the world. We demonstrate the limitations of current Scene Text VQA and VideoQA methods and propose ways to incorporate scene text information into VideoQA methods.

翻译：视频问题解答方法侧重于对象或人员及其长期互动的常识推理和视觉认知。当前的视频QA 方法忽略了视频中的文本信息。相反,我们辩称文本信息是对行动的补充,为推理过程提供了基本背景提示。为此,我们提议了一部新型视频QA任务,要求阅读和理解视频中的文字。为了探索这一方向,我们侧重于新闻视频,并要求QA系统理解和回答有关主题的问题,将视频中的视觉提示和文字提示结合起来。我们介绍了“NewsVideoQA”数据集,其中包括3 000美元以上QA配对,这些配对来自世界各地的不同新闻频道。我们展示了当前Scene Text VQA和视频QA方法的局限性,并提出了将现场文本信息纳入视频QA方法的方法。

0

相关内容

INFORMS

《计算机信息》杂志发表高质量的论文，扩大了运筹学和计算的范围，寻求有关理论、方法、实验、系统和应用方面的原创研究论文、新颖的调查和教程论文，以及描述新的和有用的软件工具的论文。官网链接：https://pubsonline.informs.org/journal/ijoc

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

【ICIG2021】Latest News & Announcements of the Plenary Talk2

【ICIG2021】Latest News & Announcements of the Plenary Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年11月2日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

【ICIG2021】Latest News & Announcements of the Industry Talk1

【ICIG2021】Latest News & Announcements of the Industry Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年7月28日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

PPAR β/δ基因在结直肠癌血管生成调控中的作用及分子机理

国家自然科学基金

2+阅读 · 2014年12月31日

Iotrochota属海绵及其内生菌中含氮化合物及其抗肿瘤活性研究

国家自然科学基金

0+阅读 · 2013年12月31日

稀土MOF纳米荧光探针的设计合成及其生物应用

国家自然科学基金

0+阅读 · 2013年12月31日

金/氧化物杂化纳米结构可控构筑及活性氧自由基/光生载流子的形成机理与光催化性能研究

国家自然科学基金

0+阅读 · 2013年12月31日

稀土羧酸金属有机骨架膜材料的可控制备及应用探索

国家自然科学基金

0+阅读 · 2013年12月31日

Nrf2-ARE信号通路在氢气干预新生儿坏死性小肠结肠炎中的机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于tau蛋白代谢通路基因多态性和多模态fMRI的遗忘型轻度认知障碍神经网络机制探讨

国家自然科学基金

0+阅读 · 2012年12月31日

癌症的靶向基因 - 痘苗溶瘤病毒治疗策略

国家自然科学基金

1+阅读 · 2012年12月31日

铜基菱沸石（Cu-CHA）用于NH3选择性催化还原NOx研究

国家自然科学基金

0+阅读 · 2012年12月31日

DNA甲基化在注意缺陷多动障碍发病中的机制及和哌醋甲酯疗效的相关性

国家自然科学基金

0+阅读 · 2009年12月31日

Towards Fast Adaptation of Pretrained Contrastive Models for Multi-channel Video-Language Retrieval

Arxiv

0+阅读 · 2023年1月6日

Close the Optical Sensing Domain Gap by Physics-Grounded Active Stereo Sensor Simulation

Arxiv

0+阅读 · 2023年1月6日

ANNA: Abstractive Text-to-Image Synthesis with Filtered News Captions

Arxiv

0+阅读 · 2023年1月5日

Test of Time: Instilling Video-Language Models with a Sense of Time

Arxiv

0+阅读 · 2023年1月5日

Learning to Search in Task and Motion Planning with Streams

Arxiv

0+阅读 · 2023年1月4日

Towards the Identifiability in Noisy Label Learning: A Multinomial Mixture Approach

Arxiv

0+阅读 · 2023年1月4日

Towards Reasoning in Large Language Models: A Survey

Arxiv

34+阅读 · 2022年12月20日

Towards Out-Of-Distribution Generalization: A Survey

Arxiv

38+阅读 · 2021年8月31日

From Show to Tell: A Survey on Image Captioning

Arxiv

15+阅读 · 2021年7月14日

Machine Reading Comprehension: The Role of Contextualized Language Models and Beyond

Arxiv

15+阅读 · 2020年5月13日

VIP会员

文章信息

相关主题

相关VIP内容

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【CMU博士论文】数据驱动决策中的激励、信息与不确定性

DGP双粒度提示框架：图增强大模型助力欺诈检测

【ICCV2025】ESSENTIAL：用于视频类增量学习的情景记忆与语义记忆整合

唯快不破：大型语言模型高效架构综述

相关资讯

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

【ICIG2021】Latest News & Announcements of the Plenary Talk2

【ICIG2021】Latest News & Announcements of the Plenary Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年11月2日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

【ICIG2021】Latest News & Announcements of the Industry Talk1

【ICIG2021】Latest News & Announcements of the Industry Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年7月28日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

相关论文

Towards Fast Adaptation of Pretrained Contrastive Models for Multi-channel Video-Language Retrieval

Arxiv

0+阅读 · 2023年1月6日

Close the Optical Sensing Domain Gap by Physics-Grounded Active Stereo Sensor Simulation

Arxiv

0+阅读 · 2023年1月6日

ANNA: Abstractive Text-to-Image Synthesis with Filtered News Captions

Arxiv

0+阅读 · 2023年1月5日

Test of Time: Instilling Video-Language Models with a Sense of Time

Arxiv

0+阅读 · 2023年1月5日

Learning to Search in Task and Motion Planning with Streams

Arxiv

0+阅读 · 2023年1月4日

Towards the Identifiability in Noisy Label Learning: A Multinomial Mixture Approach

Arxiv

0+阅读 · 2023年1月4日

Towards Reasoning in Large Language Models: A Survey

Arxiv

34+阅读 · 2022年12月20日

Towards Out-Of-Distribution Generalization: A Survey

Arxiv

38+阅读 · 2021年8月31日

From Show to Tell: A Survey on Image Captioning

Arxiv

15+阅读 · 2021年7月14日

Machine Reading Comprehension: The Role of Contextualized Language Models and Beyond

Arxiv

15+阅读 · 2020年5月13日

相关基金

PPAR β/δ基因在结直肠癌血管生成调控中的作用及分子机理

国家自然科学基金

2+阅读 · 2014年12月31日

Iotrochota属海绵及其内生菌中含氮化合物及其抗肿瘤活性研究

国家自然科学基金

0+阅读 · 2013年12月31日

稀土MOF纳米荧光探针的设计合成及其生物应用

国家自然科学基金

0+阅读 · 2013年12月31日

金/氧化物杂化纳米结构可控构筑及活性氧自由基/光生载流子的形成机理与光催化性能研究

国家自然科学基金

0+阅读 · 2013年12月31日

稀土羧酸金属有机骨架膜材料的可控制备及应用探索

国家自然科学基金

0+阅读 · 2013年12月31日

Nrf2-ARE信号通路在氢气干预新生儿坏死性小肠结肠炎中的机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于tau蛋白代谢通路基因多态性和多模态fMRI的遗忘型轻度认知障碍神经网络机制探讨

国家自然科学基金

0+阅读 · 2012年12月31日

癌症的靶向基因 - 痘苗溶瘤病毒治疗策略

国家自然科学基金

1+阅读 · 2012年12月31日

铜基菱沸石（Cu-CHA）用于NH3选择性催化还原NOx研究

国家自然科学基金

0+阅读 · 2012年12月31日

DNA甲基化在注意缺陷多动障碍发病中的机制及和哌醋甲酯疗效的相关性

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员