In videos that contain actions performed unintentionally, the agents depicted do not achieve their desired goals. In such videos, it is challenging for computer vision systems to understand high-level concepts such as goal-directed behavior, an ability present in humans from a very early age. Inculcating this ability in artificially intelligent agents would make them better social learners by allowing them to evaluate human action through a teleological lens. To validate the ability of deep learning models to perform this task, we curate the W-Oops dataset, built upon the Oops dataset [15]. W-Oops consists of 2,100 unintentional human action videos, with 44 goal-directed and 30 unintentional video-level activity labels collected through human annotation. Since annotating temporal segments is expensive, we propose a weakly supervised algorithm that localizes the goal-directed as well as unintentional temporal regions of a video using only video-level labels. In particular, we employ an attention-based strategy that identifies the temporal regions contributing most to a classification task. Meanwhile, our proposed overlap regularization encourages the model to attend to distinct portions of the video when inferring the goal-directed and unintentional activities, while preserving their temporal ordering. Extensive quantitative experiments verify the validity of our localization method. We further conduct a video captioning experiment that demonstrates the proposed localization module does indeed assist teleological action understanding.
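To make the described approach concrete, below is a minimal sketch of an attention-based weak localizer with an overlap regularizer, assuming a PyTorch setting with precomputed per-segment video features. The module names (TwoBranchLocalizer, overlap_regularizer), the feature dimension, and the exact form of the regularizer are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch, not the paper's actual code: two attention branches pool
# segment features for goal-directed vs. unintentional classification, and a
# regularizer (assumed form) discourages overlapping attention while
# encouraging the goal-directed region to precede the unintentional one.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoBranchLocalizer(nn.Module):
    """Per-segment attention for a goal-directed branch and an unintentional
    branch; classification is done on attention-pooled features."""
    def __init__(self, feat_dim=1024, n_goal=44, n_unint=30):
        super().__init__()
        self.att_goal = nn.Linear(feat_dim, 1)    # attention logits per segment
        self.att_unint = nn.Linear(feat_dim, 1)
        self.cls_goal = nn.Linear(feat_dim, n_goal)
        self.cls_unint = nn.Linear(feat_dim, n_unint)

    def forward(self, x):  # x: (B, T, D) segment features
        a_g = torch.softmax(self.att_goal(x).squeeze(-1), dim=1)   # (B, T)
        a_u = torch.softmax(self.att_unint(x).squeeze(-1), dim=1)  # (B, T)
        pooled_g = torch.einsum('bt,btd->bd', a_g, x)  # attention-weighted pooling
        pooled_u = torch.einsum('bt,btd->bd', a_u, x)
        return self.cls_goal(pooled_g), self.cls_unint(pooled_u), a_g, a_u

def overlap_regularizer(a_g, a_u):
    """Penalize overlap between the two attention maps and enforce that the
    goal-directed mass precedes the unintentional mass (assumed form)."""
    overlap = (a_g * a_u).sum(dim=1).mean()  # shared attention mass
    T = a_g.size(1)
    t = torch.arange(T, dtype=a_g.dtype, device=a_g.device)
    center_g = (a_g * t).sum(dim=1)          # attention-weighted temporal center
    center_u = (a_u * t).sum(dim=1)
    order = F.relu(center_g - center_u).mean()  # zero when goal comes first
    return overlap + order
```

During training, the classification losses from both branches would be combined with the regularizer, e.g. `loss = ce_goal + ce_unint + lam * overlap_regularizer(a_g, a_u)`; the attention weights then serve directly as the weakly supervised localization signal, since only video-level labels are used.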