ViLPAct: A Benchmark for Compositional Generalization on Multimodal Human Activities

February 19, 2023

Terry Yue Zhuo,Yaqing Liao,Yuecheng Lei,Lizhen Qu,Gerard de Melo,Xiaojun Chang,Yazhou Ren,Zenglin Xu

From arXiv; accepted at EACL 2023 (Findings)

We introduce ViLPAct, a novel vision-language benchmark for human activity planning. It is designed for a task where embodied AI agents can reason and forecast future actions of humans based on video clips about their initial activities and intents in text. The dataset consists of 2.9k videos from Charades extended with intents via crowdsourcing, a multi-choice question test set, and four strong baselines. One of the baselines implements a neurosymbolic approach based on a multi-modal knowledge base (MKB), while the other ones are deep generative models adapted from recent state-of-the-art (SOTA) methods. According to our extensive experiments, the key challenges are compositional generalization and effective use of information from both modalities.

