
Video · Bias · Chain-of-Thought · Large Vision-Language Models · Language Models

Chain-of-Anomaly Thoughts with Large Vision-Language Models


Pedro Domingos, João Pereira, Vasco Lopes, João Neves, David Semedo

From arXiv. 2 pages, 3 figures, 1 table. Accepted for RECPAD 2025.

Automated video surveillance with Large Vision-Language Models is limited by their inherent bias towards normality, often failing to detect crimes. While Chain-of-Thought reasoning strategies show significant potential for improving performance in language tasks, the lack of inductive anomaly biases in their reasoning further steers the models towards normal interpretations. To address this, we propose Chain-of-Anomaly-Thoughts (CoAT), a multi-agent reasoning framework that introduces inductive criminal bias in the reasoning process through a final, anomaly-focused classification layer. Our method significantly improves Anomaly Detection, boosting F1-score by 11.8 p.p. on challenging low-resolution footage and Anomaly Classification by 3.78 p.p. in high-resolution videos.
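The gains above are stated in percentage points (p.p.), that is, absolute differences between scores expressed as percentages rather than relative improvements. The following is a minimal sketch of that arithmetic, assuming the standard binary F1 definition (the abstract does not say how F1 is averaged). The helper names `f1_score` and `gain_in_pp` are hypothetical, and the confusion counts are made up for illustration; they are not results from the paper.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Binary F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def gain_in_pp(baseline: float, improved: float) -> float:
    """Absolute difference between two scores in [0, 1], in percentage points."""
    return (improved - baseline) * 100


# Made-up confusion counts for illustration only; NOT taken from the paper.
baseline_f1 = f1_score(tp=40, fp=10, fn=50)  # hypothetical baseline detector
improved_f1 = f1_score(tp=60, fp=20, fn=30)  # hypothetical anomaly-biased detector

print(f"baseline F1 = {baseline_f1:.3f}")
print(f"improved F1 = {improved_f1:.3f}")
print(f"gain        = {gain_in_pp(baseline_f1, improved_f1):.1f} p.p.")
```

Under this reading, an F1 that moves from 0.500 to 0.618 is a gain of 11.8 p.p., whereas the relative improvement would be 23.6%.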

