VideoChat: Chat-Centric Video Understanding - 专知论文

会员服务 ·

0

可理解性 · MoDELS · HTTPS · Integration · 语言模型化 ·

2023 年 5 月 10 日

VideoChat: Chat-Centric Video Understanding

翻译：暂无翻译

KunChang Li,Yinan He,Yi Wang,Yizhuo Li,Wenhai Wang,Ping Luo,Yali Wang,Limin Wang,Yu Qiao

from arxiv, Technical report

In this study, we initiate an exploration into video understanding by introducing VideoChat, an end-to-end chat-centric video understanding system. It integrates video foundation models and large language models via a learnable neural interface, excelling in spatiotemporal reasoning, event localization, and causal relationship inference. To instructively tune this system, we propose a video-centric instruction dataset, composed of thousands of videos matched with detailed descriptions and conversations. This dataset emphasizes spatiotemporal reasoning and causal relationships, providing a valuable asset for training chat-centric video understanding systems. Preliminary qualitative experiments reveal our system's potential across a broad spectrum of video applications and set the standard for future research. Access our code and data at https://github.com/OpenGVLab/Ask-Anything

翻译：暂无翻译

0

相关内容

可理解性

CVPR 2023开会了！谷歌等最新《视觉上理解和解释注意力》教程，附152页ppt

CVPR 2023开会了！谷歌等最新《视觉上理解和解释注意力》教程，附152页ppt

专知会员服务

85+阅读 · 2023年6月19日

计算机科学课程与视频课件合集，Computer Science courses with video lectures

计算机科学课程与视频课件合集，Computer Science courses with video lectures

专知会员服务

37+阅读 · 2022年1月24日

最新《自监督表示学习》报告，70页ppt

最新《自监督表示学习》报告，70页ppt

专知会员服务

86+阅读 · 2020年12月22日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

【2020新书】自然语言处理Python与spaCy实践，216页pdf，NLP with Python

【2020新书】自然语言处理Python与spaCy实践，216页pdf，NLP with Python

专知会员服务

108+阅读 · 2020年5月1日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

专知会员服务

15+阅读 · 2019年10月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

ICLR2019最佳论文出炉

ICLR2019最佳论文出炉

专知

12+阅读 · 2019年5月6日

上百种预训练中文词向量：Chinese-Word-Vectors

上百种预训练中文词向量：Chinese-Word-Vectors

AINLP

23+阅读 · 2019年2月26日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文推荐】最新5篇信息抽取（IE）相关论文—开放信息抽取、不完整信息、主动学习、越南语、依存分析

【论文推荐】最新5篇信息抽取（IE）相关论文—开放信息抽取、不完整信息、主动学习、越南语、依存分析

专知

12+阅读 · 2018年2月2日

【推荐】自然语言处理（NLP）指南

【推荐】自然语言处理（NLP）指南

机器学习研究会

35+阅读 · 2017年11月17日

自然语言处理 (NLP)资源大全

自然语言处理 (NLP)资源大全

机械鸡

35+阅读 · 2017年9月17日

自噬逃脱线粒体DNA介导TLRs-MyD88-NFκB信号通路在呼吸机相关性肺损伤中的作用机制

国家自然科学基金

0+阅读 · 2014年12月31日

高计数率PPAC探测器前端读出电路研制

国家自然科学基金

0+阅读 · 2014年12月31日

线粒体铁蛋白在铁死亡中的保护作用

国家自然科学基金

0+阅读 · 2013年12月31日

用慢正电子束研究金属多层膜的微结构

国家自然科学基金

0+阅读 · 2013年12月31日

卵转铁蛋白对树突状细胞免疫调节作用及分子机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

三维石墨烯导电网络制备与宽光谱太阳电池探索

国家自然科学基金

0+阅读 · 2012年12月31日

髓样抑制细胞（MDSCs）在小细胞肺癌发生、进展及治疗疗效中的生物学作用的转化性研究

国家自然科学基金

0+阅读 · 2011年12月31日

survivin拮抗细胞衰老的机制研究

国家自然科学基金

0+阅读 · 2011年12月31日

siRNA基因沉默与诱导双向基因治疗关节炎的软骨、滑膜生物学响应及ex vivo系统转基因在体示踪研究

国家自然科学基金

0+阅读 · 2011年12月31日

抑制Kupffer细胞HDAC11活性诱导大鼠肝移植免疫耐受

国家自然科学基金

0+阅读 · 2009年12月31日

Style-transfer based Speech and Audio-visual Scene Understanding for Robot Action Sequence Acquisition from Videos

Arxiv

0+阅读 · 2023年6月27日

Movie101: A New Movie Understanding Benchmark

Arxiv

0+阅读 · 2023年6月27日

Understanding In-Context Learning via Supportive Pretraining Data

Arxiv

0+阅读 · 2023年6月26日

A Solution to CVPR'2023 AQTC Challenge: Video Alignment for Multi-Step Inference

Arxiv

0+阅读 · 2023年6月26日

Scaling Evidence-based Instructional Design Expertise through Large Language Models

Arxiv

0+阅读 · 2023年6月23日

First Place Solution to the CVPR'2023 AQTC Challenge: A Function-Interaction Centric Approach with Spatiotemporal Visual-Language Alignment

Arxiv

0+阅读 · 2023年6月23日

Open-Ended Multi-Modal Relational Reasoning for Video Question Answering

Arxiv

0+阅读 · 2023年6月23日

Survey: Transformer based Video-Language Pre-training

Arxiv

20+阅读 · 2021年9月21日

iReason: Multimodal Commonsense Reasoning using Videos and Natural Language with Interpretability

Arxiv

17+阅读 · 2021年6月25日

UniViLM: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation

UniViLM: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation

Arxiv

19+阅读 · 2020年2月15日

VIP会员

文章信息

相关主题

语言模型化

相关VIP内容

CVPR 2023开会了！谷歌等最新《视觉上理解和解释注意力》教程，附152页ppt

CVPR 2023开会了！谷歌等最新《视觉上理解和解释注意力》教程，附152页ppt

专知会员服务

85+阅读 · 2023年6月19日

计算机科学课程与视频课件合集，Computer Science courses with video lectures

计算机科学课程与视频课件合集，Computer Science courses with video lectures

专知会员服务

37+阅读 · 2022年1月24日

最新《自监督表示学习》报告，70页ppt

最新《自监督表示学习》报告，70页ppt

专知会员服务

86+阅读 · 2020年12月22日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

【2020新书】自然语言处理Python与spaCy实践，216页pdf，NLP with Python

【2020新书】自然语言处理Python与spaCy实践，216页pdf，NLP with Python

专知会员服务

108+阅读 · 2020年5月1日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

专知会员服务

15+阅读 · 2019年10月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

智能体化人工智能：架构、应用及未来发展方向的综合综述

《自主武器》365页书籍

联邦学习综述：多层次聚合技术的系统分类、实验洞察与未来前沿

人工智能在空战中的局限及其真正适用领域

相关资讯

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

ICLR2019最佳论文出炉

ICLR2019最佳论文出炉

专知

12+阅读 · 2019年5月6日

上百种预训练中文词向量：Chinese-Word-Vectors

上百种预训练中文词向量：Chinese-Word-Vectors

AINLP

23+阅读 · 2019年2月26日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文推荐】最新5篇信息抽取（IE）相关论文—开放信息抽取、不完整信息、主动学习、越南语、依存分析

【论文推荐】最新5篇信息抽取（IE）相关论文—开放信息抽取、不完整信息、主动学习、越南语、依存分析

专知

12+阅读 · 2018年2月2日

【推荐】自然语言处理（NLP）指南

【推荐】自然语言处理（NLP）指南

机器学习研究会

35+阅读 · 2017年11月17日

自然语言处理 (NLP)资源大全

自然语言处理 (NLP)资源大全

机械鸡

35+阅读 · 2017年9月17日

相关论文

Style-transfer based Speech and Audio-visual Scene Understanding for Robot Action Sequence Acquisition from Videos

Arxiv

0+阅读 · 2023年6月27日

Movie101: A New Movie Understanding Benchmark

Arxiv

0+阅读 · 2023年6月27日

Understanding In-Context Learning via Supportive Pretraining Data

Arxiv

0+阅读 · 2023年6月26日

A Solution to CVPR'2023 AQTC Challenge: Video Alignment for Multi-Step Inference

Arxiv

0+阅读 · 2023年6月26日

Scaling Evidence-based Instructional Design Expertise through Large Language Models

Arxiv

0+阅读 · 2023年6月23日

First Place Solution to the CVPR'2023 AQTC Challenge: A Function-Interaction Centric Approach with Spatiotemporal Visual-Language Alignment

Arxiv

0+阅读 · 2023年6月23日

Open-Ended Multi-Modal Relational Reasoning for Video Question Answering

Arxiv

0+阅读 · 2023年6月23日

Survey: Transformer based Video-Language Pre-training

Arxiv

20+阅读 · 2021年9月21日

iReason: Multimodal Commonsense Reasoning using Videos and Natural Language with Interpretability

Arxiv

17+阅读 · 2021年6月25日

UniViLM: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation

UniViLM: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation

Arxiv

19+阅读 · 2020年2月15日

相关基金

自噬逃脱线粒体DNA介导TLRs-MyD88-NFκB信号通路在呼吸机相关性肺损伤中的作用机制

国家自然科学基金

0+阅读 · 2014年12月31日

高计数率PPAC探测器前端读出电路研制

国家自然科学基金

0+阅读 · 2014年12月31日

线粒体铁蛋白在铁死亡中的保护作用

国家自然科学基金

0+阅读 · 2013年12月31日

用慢正电子束研究金属多层膜的微结构

国家自然科学基金

0+阅读 · 2013年12月31日

卵转铁蛋白对树突状细胞免疫调节作用及分子机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

三维石墨烯导电网络制备与宽光谱太阳电池探索

国家自然科学基金

0+阅读 · 2012年12月31日

髓样抑制细胞（MDSCs）在小细胞肺癌发生、进展及治疗疗效中的生物学作用的转化性研究

国家自然科学基金

0+阅读 · 2011年12月31日

survivin拮抗细胞衰老的机制研究

国家自然科学基金

0+阅读 · 2011年12月31日

siRNA基因沉默与诱导双向基因治疗关节炎的软骨、滑膜生物学响应及ex vivo系统转基因在体示踪研究

国家自然科学基金

0+阅读 · 2011年12月31日

抑制Kupffer细胞HDAC11活性诱导大鼠肝移植免疫耐受

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员