Captioning is a crucial and challenging task for video understanding. In videos that involve active agents such as humans, the agent's actions can bring about myriad changes in the scene. Observable changes, such as movements, manipulations, and transformations of objects in the scene, are reflected in conventional video captioning. Unlike images, actions in videos are also inherently linked to social aspects such as intentions (why the action is taking place), effects (what changes due to the action), and attributes that describe the agent. Thus, for video understanding tasks such as captioning videos or answering questions about them, one must have an understanding of these commonsense aspects. We present the first work on generating commonsense captions directly from videos, describing latent aspects such as intentions, effects, and attributes. We introduce a new dataset, "Video-to-Commonsense (V2C)", that contains $\sim9k$ videos of human agents performing various actions, annotated with three types of commonsense descriptions. Additionally, we explore open-ended video-based commonsense question answering (V2C-QA) as a way to enrich our captions. Both the generation task and the QA task can be used to enrich video captions.
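To make the three annotation types concrete, the following is a minimal sketch of what a single V2C annotation record and an accompanying V2C-QA pair might look like; all field names and values are illustrative assumptions, not the dataset's actual schema.

```python
# Hypothetical V2C annotation record (field names are illustrative only,
# not the released dataset's actual schema).
example_record = {
    "video_id": "video0001",  # identifier of the source video clip
    "caption": "a man lifts a heavy box onto a shelf",          # conventional factual caption
    "intention": "to tidy up the storage room",                 # why the agent performs the action
    "effect": "the shelf becomes stacked with boxes",           # what changes due to the action
    "attribute": "strong and organized",                        # trait describing the agent
}

# A hypothetical open-ended V2C-QA pair grounded in the same clip,
# targeting one latent aspect (here, intention).
example_qa = {
    "video_id": "video0001",
    "question": "What does the man intend to accomplish?",
    "answer": "to tidy up the storage room",
}
```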