Quo Vadis, Anomaly Detection? LLMs and VLMs in the Spotlight - 专知论文

会员服务 ·

0

Spotlight · MoDELS · Integration · 语言模型化 · 评论员 ·

2024 年 12 月 24 日

Quo Vadis, Anomaly Detection? LLMs and VLMs in the Spotlight

翻译：暂无翻译

Xi Ding,Lei Wang

from arxiv, Research report

Video anomaly detection (VAD) has witnessed significant advancements through the integration of large language models (LLMs) and vision-language models (VLMs), addressing critical challenges such as interpretability, temporal reasoning, and generalization in dynamic, open-world scenarios. This paper presents an in-depth review of cutting-edge LLM-/VLM-based methods in 2024, focusing on four key aspects: (i) enhancing interpretability through semantic insights and textual explanations, making visual anomalies more understandable; (ii) capturing intricate temporal relationships to detect and localize dynamic anomalies across video frames; (iii) enabling few-shot and zero-shot detection to minimize reliance on large, annotated datasets; and (iv) addressing open-world and class-agnostic anomalies by using semantic understanding and motion features for spatiotemporal coherence. We highlight their potential to redefine the landscape of VAD. Additionally, we explore the synergy between visual and textual modalities offered by LLMs and VLMs, highlighting their combined strengths and proposing future directions to fully exploit the potential in enhancing video anomaly detection.

翻译：暂无翻译

0

相关内容

Spotlight

语言视觉预训练语言模型揭密，Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models

语言视觉预训练语言模型揭密，Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models

专知会员服务

36+阅读 · 2020年5月20日

【亚马逊-WWW2020】不解析,生成!用于面向任务的语义分析的序列到序列体系结构，Don't Parse, Generate! A Sequence to Sequence Architecture for Task-Oriented Semantic Parsing

【亚马逊-WWW2020】不解析,生成!用于面向任务的语义分析的序列到序列体系结构，Don't Parse, Generate! A Sequence to Sequence Architecture for Task-Oriented Semantic Parsing

专知会员服务

15+阅读 · 2020年2月1日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

33+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

31+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

STRCF for Visual Object Tracking

STRCF for Visual Object Tracking

统计学习与视觉计算组

15+阅读 · 2018年5月29日

Focal Loss for Dense Object Detection

Focal Loss for Dense Object Detection

统计学习与视觉计算组

12+阅读 · 2018年3月15日

IJCAI | Cascade Dynamics Modeling with Attention-based RNN

IJCAI | Cascade Dynamics Modeling with Attention-based RNN

KingsGarden

13+阅读 · 2017年7月16日

From Softmax to Sparsemax-ICML16（1）

From Softmax to Sparsemax-ICML16（1）

KingsGarden

74+阅读 · 2016年11月26日

城市“建成环境——空间行为”的多尺度影响关系与机理研究

国家自然科学基金

13+阅读 · 2017年12月31日

语义Web知识库补全关键技术研究

国家自然科学基金

14+阅读 · 2017年12月31日

Musielak-Orlicz-Sobolev 空间中的迹嵌入及其应用

国家自然科学基金

2+阅读 · 2015年12月31日

基于自主学习的Ad hoc Agent序贯决策研究

国家自然科学基金

44+阅读 · 2015年12月31日

Schr？dinger-Poisson方程守恒DDG方法研究

国家自然科学基金

2+阅读 · 2015年12月31日

基于决策模型和预备电位的运动想象BCI研究

国家自然科学基金

3+阅读 · 2015年12月31日

动态Gr？bner 基与GVW算法

国家自然科学基金

0+阅读 · 2014年12月31日

Poisson流形上的修正Hamilton方法

国家自然科学基金

0+阅读 · 2014年12月31日

海量Web用户生成内容物化关键技术

国家自然科学基金

2+阅读 · 2014年12月31日

概率抽样设计及其统计推断方法

国家自然科学基金

6+阅读 · 2014年12月31日

Risk Taxonomy, Mitigation, and Assessment Benchmarks of Large Language Model Systems

Arxiv

14+阅读 · 2024年1月11日

A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions

Arxiv

10+阅读 · 2023年11月9日

How Do Large Language Models Capture the Ever-changing World Knowledge? A Review of Recent Advances

Arxiv

13+阅读 · 2023年10月11日

Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models

Arxiv

23+阅读 · 2023年9月3日

When Large Language Models Meet Personalization: Perspectives of Challenges and Opportunities

Arxiv

19+阅读 · 2023年7月31日

Beyond One-Model-Fits-All: A Survey of Domain Specialization for Large Language Models

Arxiv

66+阅读 · 2023年5月31日

Introspective Tips: Large Language Model for In-Context Decision Making

Arxiv

12+阅读 · 2023年5月19日

Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes

Arxiv

22+阅读 · 2023年5月3日

Extreme Language Model Compression with Optimal Subwords and Shared Projections

Extreme Language Model Compression with Optimal Subwords and Shared Projections

Arxiv

18+阅读 · 2019年9月25日

Domain Adaptive Faster R-CNN for Object Detection in the Wild

Arxiv

10+阅读 · 2018年3月8日

VIP会员

文章信息

相关主题

语言模型化

相关VIP内容

语言视觉预训练语言模型揭密，Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models

语言视觉预训练语言模型揭密，Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models

专知会员服务

36+阅读 · 2020年5月20日

【亚马逊-WWW2020】不解析,生成!用于面向任务的语义分析的序列到序列体系结构，Don't Parse, Generate! A Sequence to Sequence Architecture for Task-Oriented Semantic Parsing

【亚马逊-WWW2020】不解析,生成!用于面向任务的语义分析的序列到序列体系结构，Don't Parse, Generate! A Sequence to Sequence Architecture for Task-Oriented Semantic Parsing

专知会员服务

15+阅读 · 2020年2月1日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

33+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

31+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《复杂工程系统模型驱动设计决策支持系统：早期设计阶段挑战》最新138页

《日本陆上自卫队2040年作战方式与未来作战研究》最新23页slides

人工智能作为战争武器

《后勤保障》最新23页

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

STRCF for Visual Object Tracking

STRCF for Visual Object Tracking

统计学习与视觉计算组

15+阅读 · 2018年5月29日

Focal Loss for Dense Object Detection

Focal Loss for Dense Object Detection

统计学习与视觉计算组

12+阅读 · 2018年3月15日

IJCAI | Cascade Dynamics Modeling with Attention-based RNN

IJCAI | Cascade Dynamics Modeling with Attention-based RNN

KingsGarden

13+阅读 · 2017年7月16日

From Softmax to Sparsemax-ICML16（1）

From Softmax to Sparsemax-ICML16（1）

KingsGarden

74+阅读 · 2016年11月26日

相关论文

Risk Taxonomy, Mitigation, and Assessment Benchmarks of Large Language Model Systems

Arxiv

14+阅读 · 2024年1月11日

A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions

Arxiv

10+阅读 · 2023年11月9日

How Do Large Language Models Capture the Ever-changing World Knowledge? A Review of Recent Advances

Arxiv

13+阅读 · 2023年10月11日

Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models

Arxiv

23+阅读 · 2023年9月3日

When Large Language Models Meet Personalization: Perspectives of Challenges and Opportunities

Arxiv

19+阅读 · 2023年7月31日

Beyond One-Model-Fits-All: A Survey of Domain Specialization for Large Language Models

Arxiv

66+阅读 · 2023年5月31日

Introspective Tips: Large Language Model for In-Context Decision Making

Arxiv

12+阅读 · 2023年5月19日

Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes

Arxiv

22+阅读 · 2023年5月3日

Extreme Language Model Compression with Optimal Subwords and Shared Projections

Extreme Language Model Compression with Optimal Subwords and Shared Projections

Arxiv

18+阅读 · 2019年9月25日

Domain Adaptive Faster R-CNN for Object Detection in the Wild

Arxiv

10+阅读 · 2018年3月8日

相关基金

城市“建成环境——空间行为”的多尺度影响关系与机理研究

国家自然科学基金

13+阅读 · 2017年12月31日

语义Web知识库补全关键技术研究

国家自然科学基金

14+阅读 · 2017年12月31日

Musielak-Orlicz-Sobolev 空间中的迹嵌入及其应用

国家自然科学基金

2+阅读 · 2015年12月31日

基于自主学习的Ad hoc Agent序贯决策研究

国家自然科学基金

44+阅读 · 2015年12月31日

Schr？dinger-Poisson方程守恒DDG方法研究

国家自然科学基金

2+阅读 · 2015年12月31日

基于决策模型和预备电位的运动想象BCI研究

国家自然科学基金

3+阅读 · 2015年12月31日

动态Gr？bner 基与GVW算法

国家自然科学基金

0+阅读 · 2014年12月31日

Poisson流形上的修正Hamilton方法

国家自然科学基金

0+阅读 · 2014年12月31日

海量Web用户生成内容物化关键技术

国家自然科学基金

2+阅读 · 2014年12月31日

概率抽样设计及其统计推断方法

国家自然科学基金

6+阅读 · 2014年12月31日

微信扫码咨询专知VIP会员